Dive Into R: Collaborate with the Indy UseR Group as a Newsletter Contributor!

By Blog

R Consortium recently talked to Sam Parmar of the Indy UseR group about using R in public health and pharma in Indianapolis. Sam also spoke about his volunteer work with the R Weekly newsletter. The newsletter aims to provide subscribers with a comprehensive list of the latest R resources from around the web. Sam also discussed his short online book with tips and tricks for using AI assistant tools such as ChatGPT or GitHub Copilot. 

Please share about your background and involvement with the RUGS group.

I am a Statistical Data Scientist at Pfizer and a member of the RCoE SWAT team, which stands for Scientific Workflows and Analytical Tools. We consult with other Pfizer business lines to provide technical expertise on using the R programming language and its various packages. We are also building a community around using R, which has over 1000 members at Pfizer.

From 2017 to 2020, as an epidemiologist for a local health department, I came across R, which we used in conjunction with SAS. This is how I first got involved with the R community. I used to read the R Weekly newsletters and started participating in the Indy UseR group, which Shankar Vaidyaraman and Derrick Kearney organized. It helped me discover many new tools and grow my network. I used R for a few years as an epidemiologist before working in the Pharma sector.

In my experience with the Indy UseR group, the R community is very welcoming. R has excellent documentation, including Quarto books and R Markdown, which makes learning the language easier. I love that many people are willing to create excellent free resources and present their work. We’ve been lucky to have the creators of the gt and targets packages present at our previous user group meetings. I wouldn’t have been able to change careers successfully without this welcoming community.

Can you share what the R community is like in Indianapolis? 

The community is diverse, with folks from many backgrounds, such as pharma, public health, and academia. Seeing new faces in our user group meetings this year has been great. Many recent graduates and students attending our meetings are familiar with the tidyverse. At a meeting earlier this year, we connected with a professor teaching analytics at a local university. We enjoyed learning his perspective on educating students in R.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in?

As I stated previously, I was an avid reader of the R Weekly newsletter. I joined the curation team that generates these issues online just over a year ago. We have an engaged community of readers and listeners and a podcast accompanying the newsletter.

R Weekly logo

R Weekly is an open source project launched in 2016 and is still actively maintained today. This achievement is remarkable, as very few open source projects last and are actively maintained for a long time. A volunteer curation team oversees the publication of R Weekly. We aggregate information from various RSS feeds, as well as contributions via pull requests made directly to the GitHub repository we host. Our content is public, so it is possible to view historical issues and suggestions we have integrated into the platform. 

It is an amazing resource, and we are looking for more members on our team. If anyone reading this interview is interested, please submit resource links or a few pull requests, and fill in this form to join our team.

Another thing I’m working on is a short tips and tricks book on using AI assistant tools for programmers. This book aims to guide anyone interested in integrating tools like ChatGPT or GitHub Copilot into their workflow and setting up guidelines. It’s not perfect, but I wanted to share it with the community to generate interest and get support from anyone interested in contributing to the book. 

What trends do you currently see in R language and your industry? Any trends you see developing in the near future?

There is a lot of interest in Shinylive for R and Python in the pharma space. The developments in WebR technology are truly amazing. I’m involved in a Submissions Working Group Pilot that is looking into the use of WebR and Shinylive for regulatory submissions. Hopefully, in the future, we’ll see it being used for submissions. Additionally, the integration of GitHub Copilot in the RStudio IDE is an exciting new release that I think many people are looking forward to.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups worldwide organize, share information, and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

Pfizer Leaders Discuss the Adoption of Open Source, Predominantly R, in Clinical Trial Reporting

By Blog

In this PHUSE video interview, Pfizer leaders describe how the company is making strides in incorporating open source tools into its clinical reporting processes. Michael Rimler, PHUSE Open Source Technologies Director, brings in Mike Smith, Senior Director at Pfizer, head of its R Center of Excellence, and R Consortium board member, and Patti Compton, Vice President and head of Statistical Data Science and Analytics at Pfizer, to discuss Pfizer’s open source journey.

While Pfizer has used R for many years, primarily in clinical pharmacology and QA metrics, the company is now looking to transition more of its clinical trial reporting to be predominantly R based within the software development lifecycle. Smith notes some challenges of “changing the wheels on the bus while it’s rolling down the highway,” but both leaders emphasize the importance and benefits of collaboration through open source. 

Patti Compton sees opportunities to build Pfizer’s data science community, leverage the rich library ecosystem, and gain platform independence to take time out of drug development timelines. Both envision a future where submissions to regulators use interactive data tools to allow deeper exploration of clinical trial results.

Smith predicts more interactive platforms between industry and regulators for activities like label negotiations within the next two to three years. Compton agrees that open source will be applied in new ways across drug discovery and development and expects the demand for data science skills to increase. The interview provides insight into Pfizer’s motivations and goals for advancing open source adoption within clinical reporting and analytics.

You can watch the full interview here.

R Quixote in Spain: Organizing Annual R Conference and Writing Book for Hispanic R Users

By Blog

The R Consortium recently had the opportunity to talk to Isidro Hidalgo Arellano and Gema Fernández-Avilés Calderón of the R Quixote user group in Spain. They discussed the R for Business, Teaching, and Research Conference (R4EDI) organized by the group annually. The conference acts as a bridge between students, researchers, and industry in Castilla-La Mancha. Each year, the group invites prominent data scientists from all over Spain to speak at the conference.

The group is also writing a book titled Fundamentals of Data Science with R in Spanish. The book is being written by experts from a diverse range of backgrounds to provide a comprehensive learning resource for Spanish R users. 

Isidro Hidalgo Arellano, Head of Section at the Obs. of the Labor Market Castilla-La Mancha Community Board
Gema Fernández-Avilés Calderón, Full Professor, Director of Masters in Data Science and Business Analytics program, University of Castilla-La Mancha

Please tell us a little bit about your R User Group.

Gema: We are a relatively new R User Group, and our name is R Quixote. We started in February 2021 with eight group members, and we now have more than 50 members. Our original scope was our region, Castilla-La Mancha, but currently, we have members from different parts of Spain, like Madrid and Murcia. We also have members from other countries like Argentina and Colombia. The majority of our members are from the University of Castilla-La Mancha. We also have members from other universities, such as the University of Alcala, the Complutense University of Madrid, and the National Distance Education University, as well as from public bodies like the Government of Castilla-La Mancha and CEOs of private companies like Analitycae, Okuant, and MRC Consultants.

We also have strong collaborative ties with R user groups across the region. We collaborate with Emilio López Cano, the president of R-Hispano, the R National group. We have also organized events with Aurora González Vidal, the president of UMUR.

Our user group collaborates with the University of Castilla-La Mancha (UCLM) in the Master of Data Science & Business Analytics (with R software). The program is completely online and taught in R. The agreement between R Quixote and UCLM gives R Quixote members a 10% discount on the tuition fee.

Isidro: I designed the group logo, and it features the helmet of Mambrino, which is really famous here in the Quixote region. It is inspired by the lead character of the novel Don Quixote by Miguel de Cervantes. It also features a silhouette of a windmill, again a reference to the novel. I have improvised a little and used an electric windmill instead of a traditional windmill. We have a webpage, but some sections are under construction. We are also on Twitter, but we are not very active there.

You held the 3rd R4EDI Conference on 19-20 October 2023. Can you share the details?

Gema: We host an annual event called the R4EDI Conference (R para Empresa, Docencia e Investigación; R for Business, Teaching, and Research). Our first conference was in Toledo in June of 2021. Toledo is a very beautiful city in Castilla-La Mancha. It was a hybrid event because of the pandemic. At this event, we had three important guest speakers: Emilio López Cano (President of R-Hispano, the most important R local group of Spain), Aurora González Vidal (President of UMUR, the R local group of the region of Murcia), and Xavi de Blas (Senior Software Developer at Chronojump).

We organized the second conference in Pastrana. On this occasion, we had the participation of several experts from the INE (Statistical Agency of Spain), the UNED (Universidad Nacional de Educación a Distancia; National University of Distance Education), and some enterprises such as Analitycae and Okuant.

We began hosting this conference as a physical event, and it is a great place to meet people from business, teaching, research, and government to share ideas. This year, we organized the event in Almagro on 19-20 October 2023. Almagro is a beautiful city in the province of Ciudad Real. We invited speakers like Mari Luz Congosto. She is a very prominent engineer from Telefonica; she’s currently teaching at the University Carlos III of Madrid. Borja Andrino from El País newspaper also spoke at the event. María Jiménez, a senior researcher from La Paz-IdiPAZ, also gave a talk. La Paz-IdiPAZ is a national hospital located in Madrid. Here is a photo of the group:

3rd R4EDI Conference, Almagro

The 3rd R4EDI was our main event, and we were delighted to provide a networking platform to businesses, students, and researchers. This year, Julián Garde, the rector of the University of Castilla-La Mancha, inaugurated the conference. 

Any techniques you recommend using for planning for or during the event? 

“Sending Emails with SendGrid R Package” Event, May 2023

Isidro: I want to discuss an interesting package that can be really helpful for RUG organizers. I also gave a workshop on the use of the SendGrid R package. It allows you to create prototypes of certificates, for example, certificates of participation or certificates for conference organizers. It is really useful because we now have the code for these events, and you can send all the conference certificates in no time. This package can be very useful for RUG organizers. 

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting?

“Fundamentals of Data Science with R” Book Cover

Gema: We are working together on writing a book for the Hispanic R community. Almost 50 people are working on this book, most of them researchers. The title of the book is Fundamentos de Ciencia de Datos con R (Fundamentals of Data Science with R). The main motivation behind writing this book was that Spanish is among the most spoken languages in the world, yet there are not many good and comprehensive data science books written in Spanish. We wanted to write a really good book in Spanish that covers statistics fundamentals as well as machine learning, deep learning, and text mining. We also have a section dedicated to spatial data science and a very important part on communication with Quarto and R Markdown. We also have 13 case studies from different areas like soccer, health, marketing, unemployment, finance, and retail.

CDR Package Logo

McGraw Hill will publish this book, and we will have an open webpage. We also have our own R package, CDR, with all the datasets used in the book that are not freely available on other websites. Isidro has also designed the logo for the package. We are very proud to announce that Julia Silge will be writing the prologue for the book.

PHUSE Connect EU 2023 –  Clinical Data Science Conference – Coming in Early November

By Blog, Events

Blog post for R Consortium partner PHUSE

Join PHUSE in Birmingham, UK, from November 5-8 for an event that promises to energise your learning through connection with the PHUSE Community and hearing from our experts on the topics that are most important to industry today.

The R Consortium is participating directly in one session on Tuesday, November 7, starting at 11am local time: a panel discussion entitled “Let’s Discuss Open Source Openly: A New Path in Pharma.” The panel includes Mehar Pratap Singh, Chairman of the Board of Directors at the R Consortium, Director Sumesh Kalappurakal, and other open source representatives. Dive deep into the world of open source in pharma, understanding its implications and the potential it holds for the future. This is an excellent opportunity to engage in a meaningful discussion and gain insights from leading voices in the industry.

It’s not too late to register your place or secure a sponsorship opportunity! View full information via the links below and explore the agenda to see what’s in store for attendees.

Agenda: https://phuse.s3.eu-central-1.amazonaws.com/Events/2023/EU+Connect+%E2%80%93+Birmingham/Agenda/EU+Connect+Agenda.pdf 

Event Information: https://www.phuse-events.org/attend/19/home 

Sponsorship: https://phuse.s3.eu-central-1.amazonaws.com/Events/2024/Connect+Prospectus.pdf 

Ted Laderas Discusses CascadiaR and the Diverse R Community in Portland

By Blog

Ted Laderas of the Portland R User Group shared his experience of pioneering the Cascadia R Conference for the Pacific Northwest and the West Coast. The conference is now in its sixth year and hosted its first in-person event post-pandemic this year. He also discussed the vibrant R community in Portland which provides a rich learning environment due to its diverse nature. 

Ted is a bioinformatics trainer at DNAnexus and was formerly an Assistant Professor at the Oregon Health and Science University. He is also a certified instructor for The Carpentries and Posit Academy.

Please share your background and your involvement in the RUGS group or in the R Community.

I have been using R for almost 20 years, primarily for bioinformatics and computational biology work. One of my earliest projects was clustering microarray data. I built a package to compare all the different clustering methods on microarray data, which was my Master’s thesis. I believe I spent a lot of that time being unproductive in R because I found it difficult. I started getting involved with the R community in 2017 when I joined the Portland R User Group. As a faculty member at a university, people realized that I had access to spaces that could host a conference. I co-organized a conference called Cascadia R when I first joined the R community. Since then, I have been active in the Portland R user group and with other open science and teaching groups like The Carpentries.

Can you share what the R community is like in Portland? 

I believe we are fortunate to have a diverse R community in Portland and Oregon. I would estimate that 50% of the community is academic, 25% is government, and 25% is industry. This has created a pleasant mix in the R user group, as people have experience in all of these different areas. For example, one of our organizers, Marley, does a lot of work in economics and spatial analysis. Brittany, another one of our co-organizers, does a lot of work in forestry and ecology. John works for a spiritual/meditation group and does a lot of work with people data.

Portland R is a really interesting mix of people, and we have learned a lot from each other. This is because the community is not just focused on academia, and there are people with all of these different backgrounds. We also have an event format where we help each other out in an intimate setting. People come in with their questions and issues, and we all try to solve them.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in?

I would like to discuss the Cascadia R Conference, which is now in its sixth year. The Portland R user group organized the conference in the beginning, but we have not been involved with the last couple of conferences. The conference was started because we wanted to have a conference for the Pacific Northwest and the West Coast. We had six co-organizers, and we all had the kind of expertise that came together to make it a success. None of us had experience organizing a conference, but the other members of Portland R were enthusiastic about it. So, we decided to give it a go.

The conference was a great success, and it has become a major event for the R community in the Pacific Northwest and the West Coast. It is a great opportunity for people to learn about R, network with other R users, and present their research. I am proud to have been a part of the conference, and I look forward to seeing it continue to grow and succeed in the years to come.

The first year was quite challenging, as we had only two and a half months to prepare. However, we were able to attract 250 people to Portland to discuss R, give talks, and participate in lightning talks. We also held a couple of workshops in the first two years. Since becoming involved with Cascadia R, I have been active in the R community, primarily through teaching. I have helped with workshops and worked with other conferences such as R Medicine and R Pharma. It has been very enjoyable to be involved in the community.

What trends do you currently see in R language?

I believe that there is always interest in anything related to the tidyverse within our group, as everyone wants to learn how to work optimally with their data. It has always been a fascinating topic for people. We have a lot of people who are skilled in spatial analysis, so there has been a lot of interest in that area. With packages like tidycensus in particular, working at the census tract level has been very popular. We are also always eager to hear about the latest news on Shiny.

satRdays infrastructure update – ISC Funded Project

By Announcement, Blog

satRdays are R-focused conferences that are held on Saturdays. They happen all over the globe, and are organized by local R community organizers to help and grow the local community.  

The R Consortium is revamping satRdays, and we are building a new satRdays website template that enables satRday organizers to easily spin up a website. Formerly, the satRdays website template was based on Hugo, which created an extra hurdle for R community organizers who were new to Hugo and wanted to spin up a website for a potential satRday event.

The new template is based on Quarto, and the output is HTML, which can be easily hosted on GitHub Pages, Netlify, etc. The GitHub repository for this project is a template repository, so a satRday organizer can create the GitHub repository for their satRday website from this template. The main goal of the template is to make it very easy and quick to launch a satRday website in a minimal number of hours. This means that adding components like speakers and the schedule, and even customizing the countdown, should be as easy as editing the related YAML files (e.g. https://github.com/satRdays/quarto-satrdays-template/blob/main/program/speakers/speakers.yml).

The project is still developing, and we want to hear what the R Community thinks about it, receive feedback, and make progress on the work. The GitHub repository for the project is here: https://github.com/satRdays/quarto-satrdays-template and a preview of the template is available too: https://satrdays.github.io/quarto-satrdays-template/

useR! 2024 is already using this template for its conference website.

We would like to get your feedback (issues, feature requests, etc.) in the issues section of the repository.

Giving Back to the R Community in Edmonton, Alberta, through Open Source Package Development and Educational Blogs

By Blog

Peter Solymos of the Edmonton R User Group (Yegrug) recently spoke to the R Consortium about the growing acceptance of R in the industry in Edmonton. He also discussed his professional journey with R and the challenges of organizing an R User Group in the post-pandemic era. Peter actively contributes to the R community through his work as an open source package developer. He also maintains a blog about hosting data apps, aiming to provide all the required information in one place. 

Peter proudly holds someone else’s Oscar while thinking: winner, winner, chicken dinner

Peter is an ecologist and Data Scientist with a PhD in Biological Sciences from the University of Debrecen, Hungary. He currently works as a Senior Data Scientist at E Source.   

Please share your background and your involvement in the RUGS group or the R Community.

My background is in ecology, and I used to study animals in the field. Then, I became interested in multivariate statistics and started using R. At that time, I was teaching at the University of Veterinary Medicine in Budapest, Hungary. All the students were learning R as a part of their stats course. I felt a bit behind because I was still working with Excel. So, I had to upskill and learn R, which was beneficial as it helped me secure a job in Canada.

I’m also a package developer, and I have published many statistically focused packages. I have also developed some non-statistical packages, like how to unify progress bars in R, which is my most downloaded package now. I love developing packages as an open source developer and focus on interesting questions. In my day job, I am a senior data scientist working with utility companies to assess and mitigate risk. On the side, I am also involved in the R community and doing a bit of consulting here and there. I also like developing Shiny apps.

I got involved with R because I needed it for my personal development. In academia, I haven’t met anyone who is not using R or is not at least familiar with it. We started the R user group at the University of Alberta in 2012. It took a hiatus for a couple of years, and we restarted it in 2021 after the pandemic. Last year, we hosted seven meetups, and this year we just started. You can find more about the Edmonton R User group events and links to past recordings on our GitHub.

I think after the pandemic this synchronous way of meeting – especially if it is online first – is becoming a challenge. The reason I am saying this is that sometimes very few people show up at the physical events, and it’s not because they are not interested, but because they think of it like a YouTube video that they can watch later on in their own time. This is an interesting change and I am not saying it is bad. I think this is expected and I would do the same in their position. But as an organizer, it puts you under pressure to make the events more interesting and to be able to attract more attendees. This might just be the new way of life post-pandemic that we have to accept. Maybe we should talk about interesting topics with those who show up for the events and then the rest can watch it later.

Montage from the Edmonton R User Group’s past meetups.

Can you share what the R community is like in Edmonton? 

In Edmonton, the academic community is utilizing R in various fields. Ecologists are using it for their conservation-related or resource management questions. Alberta is a big province with a comparatively small population. It’s mostly remote and covered with forests. And in that forest, there is a lot of resource development going on, for example, oil and gas, forestry, mining, etc. It is really important to understand the effects, and there is a lot of spatial data analysis going on and R is well-suited for that. So this is one area that I know particularly well. 

There has been a growing acceptance of R in the industry. In the past, companies only wanted their employees to work with Python. Right now, companies don’t mind which language you use to get your job done, and R is one of those languages. This change in how the industry approaches its data science stack is fortunate for R developers and data scientists. In government and health sciences, R is heavily used and is a lot more prevalent than Python. SAS and other tools were more common in these areas in the past, but right now R is the dominant tool.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in?

A few years after Shiny was introduced, I started playing around with it and got into app development, mostly for teaching. In the beginning, we had some workshops, and it was a cool way to demo things without having to write code. Later, I started using it for consulting and I started looking for ways to host apps so that they are not visible to everyone as I built them for a client.

There are various ways to host Shiny apps, such as shinyapps.io. It is nice to begin with, but people tend to outgrow it at some point if they have particular requirements. In 2017, I started using ShinyProxy, which was almost brand new. ShinyProxy is used to deploy Shiny and other apps in a dockerized environment. It lets you authenticate and authorize users so that they see only the information you want them to see. Over time, ShinyProxy has matured and I have gotten better at working with it.

I believed this was common knowledge because I could find the information and learn it myself. But at some point, I realized that people might struggle with the setup because all the information is pretty scattered on the Internet. Usually, you find tutorials that have a narrow focus. So I started writing blog posts about Shiny hosting and I think now it is past 60 posts. I titled the blog Hosting Data Apps. I started following a table of contents that I had in mind at that point. What is Shiny? What are the various ways of hosting it, and how can you learn how to use Docker or ShinyProxy? How to set up a custom domain or HTTPS, and authentication, or how to scale your apps. People have been interested, and we have received very positive feedback, which has led to a book deal with CRC Press for me and my co-author Kalvin Eng.

This coming year, we are going to focus on writing this book titled “Hosting Shiny Applications for R and Python.” We are going to cover questions like how do you pick the best hosting option? Once you pick the best option for your use case, how can you implement it, and what are the considerations you need to think about?

I find it interesting that it is so easy to learn Shiny these days. I know people who have never touched it but are good R programmers and could learn it in a few days. But once you want to share your app, eventually you start thinking about what else you can do. It’s hard to find everything in one place. This is what motivated me to start writing the blog with a book in mind. And right now we are getting closer to the end goal.

What trends do you currently see in R language?

I see a huge interest in cross-language reproducibility tools like Quarto and R Markdown. There is an interest in how to build beautiful things for the web, such as products, books, and websites, and how to make everything reproducible. Even though Quarto’s ecosystem is not as well developed as that of knitr and R Markdown, people are using it for everything. They don’t care about the lack of tools, and that’s how the ecosystem is developing. If something is missing, they just figure it out, and now it exists and everyone can use it. I think this drive, that if I can’t find something I will build it, is what’s taking the R community forward. This approach is what makes the community so valuable and welcoming.

R Conference 2023: Malaysia’s Largest Face-to-Face Annual R Conference 

By Blog

The R Consortium recently reconnected with Poo “KH” Kuan Hoong, founder of the Malaysia R User Group (also on Facebook). We talked with KH last year when he spoke about Hosting Malaysia’s Largest Annual R Conference. KH is deep into planning and preparation for this year’s R Conference, the first time it will be held face-to-face in two years! R confeRence 2023 is Malaysia’s largest physical R User Conference. It features Malaysian industry leaders and academicians. The conference will be held at Sunway University on October 28, 2023. Please come and join the Malaysia user group if you are near Subang Jaya, Malaysia. It is a chance to learn, explore, and network.

What’s new with the Malaysia R User Group (MyRUG)?

Since our last chat, we’ve expanded our outreach by collaborating with Malaysia R-Ladies. Initially, there was only one R user group in Malaysia. However, we felt the need to support and inspire our female participants. So, we established the R-Ladies chapter for Malaysia.

Our annual R ConfeRence is ongoing. For the past two years, the pandemic forced us to host it online. But this year, we’re thrilled to announce it’ll be a physical event at Sunway University. At the moment, we’re planning for 15 talk sessions, each spanning around 40 to 45 minutes, including a Q&A. Furthermore, we’ve scheduled five hands-on workshops, each about one and a half hours.

The event is set for October 28, and registration is open!

One challenge we’ve contemplated is the shift from an online to a physical format. During online events, it was easier to involve international figures or speakers. But with our event now taking place in Kuala Lumpur, the capital of Malaysia, travel becomes a concern. Thankfully, we have some generous sponsors and consultants helping us, including the R Consortium. We’re offering travel and accommodation allowances for our speakers, ensuring they can join and present in person.

At the R confeRence 2023 this year, who is speaking, and what is the focus?

 Malaysia R User Group in confeRence 2019

Regarding this year’s topics and speakers, we’re focused on four main themes:

  • Machine Learning: An exploration of R’s capabilities in the realm of predictive analytics.
  • Epidemiology and Medical Statistics: We’re particularly excited about this, as we’ve partnered with a prominent medical school in northern Malaysia. This theme will feature speakers primarily from public health backgrounds who utilize R for their analyses.
  • Data Visualization: R is renowned for its data visualization strengths, and this theme will delve into the latest techniques and best practices.
  • Data Management and Deployment: The final theme will address the challenges and solutions in handling and deploying data using R.

These are our central themes, and we’re thrilled to have a diverse group of speakers representing each of them this year.

Regardless of the industry someone hails from, attending offers a unique chance to gain insights, network with R professionals, and learn from the best in the field.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in.

I’m currently a senior manager at British American Tobacco (BAT). In the data analytics department, our primary focus is on Revenue Growth Management (RGM). One of my personal projects is developing a Marketing Mix Modeling (MMM) tool. Essentially, this tool analyzes promotions across various channels and calculates the return on investment (ROI) from marketing expenditures for each channel.

I’ve chosen to build this tool using R because it offers powerful libraries, particularly for time series analysis. I’m also integrating various R packages into the tool to enhance its capabilities. Once I’ve completed this tool, my goal is to present it to my superiors as a valuable resource for assessing the effectiveness of our marketing investments across different products and channels in the company.

What resources/techniques do/did you use? (Posit (RStudio), Github, Tidyverse, etc.)

I’m increasingly relying on cloud platforms for my work. For instance, I frequently use Google Colab because it’s a convenient environment to execute R code. It offers the advantage of running code without any local installations, which is highly beneficial, especially when giving presentations.

Before this, I was using RStudio Cloud, now renamed Posit Cloud.

Two months ago, I gave a talk on the mlr3 R package. This package addresses a challenge in R: the presence of multiple packages with overlapping functionality. In the realm of Python machine learning, there’s a standard library called scikit-learn, which provides comprehensive end-to-end machine learning tools. I believe R needs something similar, and that’s where the mlr3 library shines. It’s a unified package for training machine learning models in R, and I introduced it in my talk.

When I deliver talks, I choose Google Colab as my go-to platform, primarily because it’s stable and user-friendly. I simply share a Colab link with participants, allowing them to run and interact with the code directly in their browsers without any installations.

For reference, I can share the recording of my “mlr3” presentation along with the associated Google Colab notebook:

Is this an ongoing project? Please share any details or ways for someone to get involved!

Currently, it’s still in progress and not ready for public sharing. Once it’s complete, I’ll provide the GitHub link. Our goal is to make it open source, allowing others to view and contribute.

What trends do you currently see in R language and your industry? Any trends you see developing in the near future?

In our data analytics industry, many professionals are gravitating towards Python. It’s essential to continuously demonstrate R’s capabilities, emphasizing that it can perform many tasks at which Python excels. We should leverage the latest technologies that spark excitement and interest to showcase the benefits of using R. It’s a user-friendly language that provides machine learning and production-grade application development tools.

In the past, we’ve demonstrated R’s compatibility with TensorFlow, PyTorch, and other tools. While Python has strengths, one can still achieve impressive results with R without delving deep into complex programming. 

Looking ahead, our objective should be to ensure there is no noticeable difference between applications developed in Python and those developed in R. The workflow should be consistent regardless of the chosen language. For instance, the Shiny platform now supports both R and Python, and regardless of which language an app is written in, the output remains consistent in appearance and functionality. The key is achieving uniformity across platforms and results.

Shiny App Successfully Reviewed by FDA CDER Staff (Pilot 2 Announcement 2)

The R Consortium is pleased to announce that on Sept 27, 2023, the R Submissions Working Group successfully completed the follow-up to the Pilot 2 R Shiny-based submission and received a response letter from FDA CDER!

To our knowledge, this is the first publicly available submission package that includes a Shiny component. 

The full response letter can be found at: https://github.com/RConsortium/submissions-wg/blob/0f1dc5c30985d413f75d196c2b6caa96231b26ee/_Documents/Summary_R_Pilot2_Submission%2027SEP2023.pdf

All submission materials can be found at: https://github.com/RConsortium/submissions-pilot2-to-fda

The objective of the R Consortium R Submission Pilot 2 project is to test the concept that a Shiny application created with the R language can be bundled into a submission package and transferred successfully to FDA reviewers.

The initial submission was made through the eCTD gateway on Nov 18, 2022. Verbal responses from the FDA were received between January and June 2023 during R Submissions Working Group meetings. These responses noted minor findings concerning warnings and recommended best practices on filters, documentation, and package source. The updated submission package addressed the reported issues and was submitted on July 19, 2023. The final response letter from the FDA was received on Sept 27, 2023.

As a next step, the R Consortium R Submissions Working Group initiated submission Pilot 4 to explore the use of novel technologies such as Linux containers and WebAssembly to bundle a Shiny application into a self-contained package, facilitating a smoother process for both transferring and executing the application.

The initial announcement of the R Consortium R Submission pilot 2 can be found at: 

Announcement of the R Consortium R submission pilot 1:

Empowering Healthcare with R: Javier Orraca-Deatcu’s Journey from Finance to Predictive Health Models

Javier Orraca-Deatcu of the Southern California R User Group (SoCal RUG) discussed his work at a health insurance company, where he builds data science models that support quality-of-life improvements. He uses R to predict health issues in Medicare and Medicaid populations and to alert care management teams about preventive measures and lifestyle changes that could avert these issues. He shares in-depth insight into how R has the potential to impact the lives of the elderly population. He also advises R users to start their own R user groups within their companies to expand their network.

Javier has a Bachelor’s degree in Management from the Georgia Institute of Technology. He also holds a Master’s in Business Analytics from the University of California, Irvine (UCI). He works as the Lead Machine Learning Engineer at Centene Corporation. Javier co-organizes the SoCal RUG and is also a co-founder of the Centene R Users Group. 

Please share about your background and your involvement in the R Community. What is your level of experience with the R language?

My traditional background before I pivoted into data science was financial modeling. I was doing valuations and assisting with the lifecycle of mergers and acquisitions. At the time, I was maximizing the capabilities of Microsoft Excel and Microsoft Access. Around 2015, I started hearing chatter about “data science” and became interested in how to augment my financial forecasts and simulations with code. I quit my job, returned to graduate school, and was able to dedicate all of my time to learning about business analytics, data science, and R.

The Southern California R Users Group conducts an annual hackathon, which it hosts in collaboration with UCI. I was enrolled in the Master of Business Analytics program at UCI, and I joined the hackathon with a team of classmates. That’s when I met the organizers of SoCal RUG, and it was great to network with like-minded members of the data science community.

What industry are you currently in? How do you use R in your work?

I work for a healthcare insurance company, Centene, and it was pretty exciting to see us become a Fortune 25 corporation earlier this year. Our core business is managed care insurance products, mostly covering government-sponsored healthcare insurance like Medicaid and Medicare. We are fortunate to have robust access to health data since we manage our members’ care. As a data scientist and MLE at Centene, it’s exciting that I can apply modeling techniques to support quality-of-life improvements, such as early disease identification.

Why do industry professionals come to your user group? What is the benefit for attending?

The big appeal is to network with a very diverse group of professionals in many different industries. The opportunity to come together and brainstorm problems is invaluable. We provide technical workshops and monthly meetups with new information to learn from our presenters and passionate members of the R community. 

What trends do you currently see in R language and your industry? Any trends you see developing in the near future?

I’m doing less data storytelling these days with interactive visualizations and web apps, but the evolution we are seeing with Shiny and new opinionated Shiny frameworks has been a delight. Shiny is quickly maturing as a web app framework, and it’s becoming, in my opinion, a go-to enterprise web app framework with more and more recognition. In addition, the tidyverse and tidy syntax have made it much easier for non-programmers (people without a computer science or software engineering background) to adopt R for their business needs.

Throughout grad school and professionally, I have had exposure to several other languages, including Python and Julia. But data analysis and manipulation are just so easy to do with R. The rate at which the tidymodels framework is maturing has also been impressive. This modeling framework acts as a sort of meta-engine that helps data scientists develop reproducible, end-to-end machine learning workflows and pipelines. The ability to develop a modeling workflow purpose-built for fast iteration, consumption at scale, and production is a big win for corporate data scientists and MLEs.

Anything else you would like to share with R Users around the globe?

For R-enthusiastic professionals who regularly educate their coworkers about modern use cases for R, I recommend starting an internal R user group at your company. It’s a great way to network internally and share knowledge. I helped start the Centene R Users Group in 2019, and in six months, I saw our monthly meetups grow from fewer than 10 people to over 100 participants. I’ve met a lot of new teams and groups that I would have otherwise never had the opportunity to come in contact with.