All Posts By

R Consortium

satRdays infrastructure update – ISC Funded Project

By Announcement, Blog

satRdays are R-focused conferences that are held on Saturdays. They happen all over the globe, and are organized by local R community organizers to help and grow the local community.  

The R Consortium is revamping satRdays, and we are building a new satRdays website template that enables satRday organizers to easily spin up a website. Formerly, the satRdays website template was based on Hugo, which created an extra hurdle for R community organizers who are new to Hugo and want to spin up a website for a potential satRday event.

The new template is based on Quarto, and the output is HTML, which can be easily hosted on GitHub Pages, Netlify, etc. The GitHub repository for this project is a template repository, so a satRday organizer can create the GitHub repository for their satRday website from this template. The main goal of the template is to make it quick and easy to launch a satRday website in a minimal number of hours. This means that adding components like speakers and the schedule, and even customizing the countdown, is as easy as editing the related YAML files (e.g. https://github.com/satRdays/quarto-satrdays-template/blob/main/program/speakers/speakers.yml).

The project is still under development, and we want to hear what the R community thinks about it, receive feedback, and make progress on the work. The GitHub repository for the project is here: https://github.com/satRdays/quarto-satrdays-template and a preview of the template is available too: https://satrdays.github.io/quarto-satrdays-template/

useR! 2024 is already using this template for its conference website.

We would like to get your feedback (issues, feature requests, etc.) in the issues section of the repository.

Giving Back to the R Community in Edmonton, Alberta, through Open Source Package Development and Educational Blogs

By Blog

Peter Solymos of the Edmonton R User Group (Yegrug) recently spoke to the R Consortium about the growing acceptance of R in the industry in Edmonton. He also discussed his professional journey with R and the challenges of organizing an R User Group in the post-pandemic era. Peter actively contributes to the R community through his work as an open source package developer. He also maintains a blog about hosting data apps, aiming to provide all the required information in one place. 

Peter proudly holds someone else’s Oscar while thinking: winner, winner, chicken dinner

Peter is an ecologist and Data Scientist with a PhD in Biological Sciences from the University of Debrecen, Hungary. He currently works as a Senior Data Scientist at E Source.   

Please share your background and your involvement in the RUGS group or the R Community.

My background is in ecology, and I used to study animals in the field. Then, I became interested in multivariate statistics and started using R. At that time, I was teaching at the University of Veterinary Medicine in Budapest, Hungary. All the students were learning R as a part of their stats course. I felt a bit behind because I was still working with Excel. So, I had to upskill and learn R, which was beneficial as it helped me secure a job in Canada.

I’m also a package developer, and I have published many statistically focused packages. I have also developed some non-statistical packages, like how to unify progress bars in R, which is my most downloaded package now. I love developing packages as an open source developer and focus on interesting questions. In my day job, I am a senior data scientist working with utility companies to assess and mitigate risk. On the side, I am also involved in the R community and doing a bit of consulting here and there. I also like developing Shiny apps.

I got involved with R because I needed it for my personal development. In academia, I haven’t met anyone who is not using R or is not at least familiar with it. We started the R user group at the University of Alberta in 2012. It took a hiatus for a couple of years, and we restarted it in 2021 after the pandemic. Last year, we hosted seven meetups, and this year we just started. You can find more about the Edmonton R User group events and links to past recordings on our GitHub.

I think after the pandemic this synchronous way of meeting – especially if it is online first – is becoming a challenge. The reason I am saying this is that sometimes very few people show up at the physical events, and it’s not because they are not interested, but because they think of it like a YouTube video that they can watch later in their own time. This is an interesting change and I am not saying it is bad. I think this is expected and I would do the same in their position. But as an organizer, it puts you under pressure to make the events more interesting and to be able to attract more attendees. This might just be the new way of life post-pandemic that we have to accept. Maybe we should talk about interesting topics with those who show up for the events and then the rest can watch it later.

Montage from the Edmonton R User Group’s past meetups.

Can you share what the R community is like in Edmonton? 

In Edmonton, the academic community is utilizing R in various fields. Ecologists are using it for their conservation-related or resource management questions. Alberta is a big province with a comparatively small population. It’s mostly remote and covered with forests. And in that forest, there is a lot of resource development going on, for example, oil and gas, forestry, mining, etc. It is really important to understand the effects, and there is a lot of spatial data analysis going on and R is well-suited for that. So this is one area that I know particularly well. 

There has been a growing acceptance of R in the industry. In the past, companies only wanted their employees to work with Python. Right now, companies don’t mind which language you use to get your job done, and R is one of these languages. This change in how the industry approaches its data science stack is fortunate for R developers and data scientists. In government and health sciences, R is heavily used and is a lot more prevalent than Python. SAS and other tools were more common in these areas in the past, and right now R is the dominant tool.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in?

A few years after Shiny was introduced, I started playing around with it and got into app development, mostly for teaching. In the beginning, we had some workshops, and it was a cool way to demo things without having to write code. Later, I started using it for consulting and I started looking for ways to host apps so that they are not visible to everyone as I built them for a client.

There are various ways to host Shiny apps, such as shinyapps.io. It is nice to begin with, but people tend to outgrow it at some point if they have particular requirements. In 2017, I started using ShinyProxy, which was almost brand new. ShinyProxy is used to deploy Shiny and other apps in a dockerized environment. It lets you authenticate and authorize users so that they see only the information you want them to see. As time went by, ShinyProxy has developed and I have gotten better at working with it.

I believed this was common knowledge because I could find the information and learn it myself. But at some point, I realized that people might struggle with the setup because all the information is pretty scattered on the Internet. Usually, you find tutorials that have a narrow focus. So I started writing blog posts about Shiny hosting and I think now it is past 60 posts. I titled the blog Hosting Data Apps. I started following a table of contents that I had in mind at that point. What is Shiny? What are the various ways of hosting it, and how can you learn how to use Docker or ShinyProxy? How to set up a custom domain or HTTPS, and authentication, or how to scale your apps. People have been interested, and we have received very positive feedback, which has led to a book deal with CRC Press for me and my co-author Kalvin Eng.

This coming year, we are going to focus on writing this book titled “Hosting Shiny Applications for R and Python.” We are going to cover questions like how do you pick the best hosting option? Once you pick the best option for your use case, how can you implement it, and what are the considerations you need to think about?

I find it interesting that it is so easy to learn Shiny these days. I know people who have never touched it but are good R programmers and could learn it in a few days. But once you want to share your app, eventually you start thinking about what else you can do. It’s hard to find everything in one place. This is what motivated me to start writing the blog with a book in mind. And right now we are getting closer to the end goal.

What trends do you currently see in R language?

I see a huge interest in cross-language reproducibility like Quarto and R Markdown. There is an interest in how to build beautiful things for the web as products, books, and websites and how to make everything reproducible. Even though Quarto’s ecosystem is not as well developed as knitr and R Markdown, people are using it for everything. They don’t care about the lack of tools, and that’s how the ecosystem is developing. If something is missing, they just figure it out and now it exists, and everyone can use it. I think this drive that if I can’t find something, I will build it, is what’s taking the R community forward. This approach is what makes the community so valuable and welcoming.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups worldwide organize, share information and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute.

R Conference 2023: Malaysia’s Largest Face-to-Face Annual R Conference 

By Blog

The R Consortium recently reconnected with Poo “KH” Kuan Hoong, founder of the Malaysia R User Group (Also on Facebook). We talked with KH last year when he spoke about Hosting Malaysia’s Largest Annual R Conference. KH is deep into planning and preparation for this year’s R Conference, the first time it will be held face-to-face in two years! R confeRence 2023 is Malaysia’s largest physical R User Conference. It features Malaysian industry leaders and academicians. The conference will be held at Sunway University on October 28, 2023. Please come and join the Malaysia user group if you are near Subang Jaya, Malaysia. You will learn, explore, and more.

What’s new with the Malaysia R User Group (MyRUG)?

Since our last chat, we’ve expanded our outreach by collaborating with Malaysia R-Ladies. Initially, there was only one R user group in Malaysia. However, we felt the need to support and inspire our female participants. So, we established the R-Ladies chapter for Malaysia.

Our annual R ConfeRence is ongoing. For the past two years, the pandemic forced us to host it online. But this year, we’re thrilled to announce it’ll be a physical event at Sunway University. At the moment, we’re planning for 15 talk sessions, each spanning around 40 to 45 minutes, including a Q&A. Furthermore, we’ve scheduled five hands-on workshops, each about one and a half hours.

The event is set for the 28th of October, and registration is open! You can register here!

One challenge we’ve contemplated is the shift from online to a physical format. During online events, it was easier to involve international figures or speakers. But with our event now taking place in Kuala Lumpur, the capital of Malaysia, travel becomes a concern. Thankfully, we have some generous sponsors and consultants helping us, including help from the R Consortium. We’re offering travel and accommodation allowances for our speakers, ensuring they can join and present in person.

At the R confeRence 2023 this year, who is speaking, and what is the focus?

 Malaysia R User Group in confeRence 2019

Regarding this year’s topics and speakers, we’re focused on four main themes:

  • Machine Learning: An exploration of R’s capabilities in the realm of predictive analytics.
  • Epidemiology and Medical Statistics: We’re particularly excited about this, as we’ve partnered with a prominent medical school in northern Malaysia. This theme will feature speakers primarily from public health backgrounds who utilize R for their analyses.
  • Data Visualization: R is renowned for its data visualization strengths, and this theme will delve into the latest techniques and best practices.
  • Data Management and Deployment: The final theme will address the challenges and solutions in handling and deploying data using R.

These are our central themes, and we’re thrilled to have a diverse group of speakers representing each of them this year.

Regardless of the industry someone hails from, attending offers a unique chance to gain insights, network with R professionals, and learn from the best in the field.

Please share about a project you are currently working on or have worked on in the past using the R language. Goal/reason, result, anything interesting, especially related to the industry you work in.

I’m currently a senior manager at British American Tobacco (BAT). In the data analytics department, our primary focus is on Revenue Growth Management (RGM). One of my personal projects is developing a Marketing Mix Modeling (MMM) tool. Essentially, this tool analyzes promotions across various channels and calculates the return on investment (ROI) from marketing expenditures for each channel.

I’ve chosen to build this tool using R because it offers powerful libraries, particularly for time series analysis. I’m also integrating various R packages into the tool to enhance its capabilities. Once I’ve completed this tool, my goal is to present it to my superiors as a valuable resource for assessing the effectiveness of our marketing investments across different products and channels in the company.
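The per-channel ROI calculation described above can be sketched in a few lines. This is only an illustration, written in Python rather than R, and it is not the tool itself, which is not public. The channel names, figures, and the names `ChannelResult` and `roi` are hypothetical. It also assumes the revenue attributed to each channel is already known, whereas a real marketing mix model would have to estimate it, for example with a regression or time series model.

```python
from dataclasses import dataclass


@dataclass
class ChannelResult:
    channel: str
    spend: float               # marketing spend on the channel
    attributed_revenue: float  # revenue a model attributes to the channel (assumed given)


def roi(result: ChannelResult) -> float:
    """Return ROI as (attributed revenue - spend) / spend."""
    if result.spend <= 0:
        raise ValueError("spend must be positive")
    return (result.attributed_revenue - result.spend) / result.spend


# Hypothetical figures, for illustration only.
results = [
    ChannelResult("tv", spend=100_000.0, attributed_revenue=150_000.0),
    ChannelResult("social", spend=40_000.0, attributed_revenue=70_000.0),
    ChannelResult("in_store", spend=25_000.0, attributed_revenue=20_000.0),
]

# ROI per channel, then channels ranked from best to worst ROI.
roi_by_channel = {r.channel: roi(r) for r in results}
ranked = sorted(roi_by_channel, key=roi_by_channel.get, reverse=True)

for name in ranked:
    print(f"{name}: ROI = {roi_by_channel[name]:.2f}")
```

A negative ROI, as for the hypothetical `in_store` channel here, means the channel returned less revenue than was spent on it.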

What resources/techniques do/did you use? (Posit (RStudio), Github, Tidyverse, etc.)

I’m increasingly relying on cloud platforms for my work. For instance, I frequently use Google Colab because it’s a convenient environment to execute R code. It offers the advantage of running code without any local installations, which is highly beneficial, especially when giving presentations.

Before this, I was using RStudio Cloud, now renamed Posit Cloud.

Two months ago, I gave a talk on the mlr3 R package. This package addresses a challenge in R: the presence of multiple packages with overlapping functionality. In the realm of Python machine learning, there’s a standard library called scikit-learn, which provides comprehensive end-to-end machine learning tools. I believe R needs something similar, and that’s where the mlr3 library shines. It’s a unified package for training machine learning models in R, and I introduced it in my talk.
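The idea of a unified interface can be sketched without either library. The snippet below is a toy illustration in plain Python and does not use the mlr3 or scikit-learn API. The class and function names are hypothetical, and the only convention borrowed from scikit-learn is that every model exposes `fit` and `predict`. Because both toy learners share that interface, the same evaluation function works for either one.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class Learner(Protocol):
    """Anything with fit/predict can be trained and evaluated the same way."""

    def fit(self, X: Sequence[float], y: Sequence[float]) -> "Learner": ...
    def predict(self, X: Sequence[float]) -> list[float]: ...


class MeanLearner:
    """Baseline: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class NearestNeighborLearner:
    """Predicts the target of the closest training point (1-NN, one feature)."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def nearest(x):
            # Index of the training point closest to x.
            return min(range(len(self.X_)), key=lambda i: abs(self.X_[i] - x))
        return [self.y_[nearest(x)] for x in X]


def mean_absolute_error(learner: Learner, X_train, y_train, X_test, y_test) -> float:
    """Train any learner and score it on held-out data."""
    preds = learner.fit(X_train, y_train).predict(X_test)
    return sum(abs(p - t) for p, t in zip(preds, y_test)) / len(y_test)


# Tiny made-up dataset where y = 2 * x.
X_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
X_test, y_test = [1.2, 3.6], [2.4, 7.2]

scores = {
    type(learner).__name__: mean_absolute_error(learner, X_train, y_train, X_test, y_test)
    for learner in (MeanLearner(), NearestNeighborLearner())
}
for name, score in scores.items():
    print(f"{name}: MAE = {score:.2f}")
```

Swapping in a different model only requires a new class with the same two methods; the evaluation code stays unchanged.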

When I deliver talks, I choose Google Colab as my go-to platform, primarily because it’s stable and user-friendly. I simply share a Colab link with participants, allowing them to run and interact with the code directly in their browsers without any installations.

For reference, I can share the recording of my “mlr3” presentation along with the associated Google Colab notebook:

Is this an ongoing project? Please share any details or ways for someone to get involved!

Currently, it’s still in progress and not ready for public sharing. Once it’s complete, I’ll provide the GitHub link. Our goal is to make it open source, allowing others to view and contribute.

What trends do you currently see in R language and your industry? Any trends you see developing in the near future?

In our data analytics industry, many professionals are gravitating towards Python. It’s essential to continuously demonstrate R’s capabilities, emphasizing that it can perform many tasks at which Python excels. We should leverage the latest technologies that spark excitement and interest to showcase the benefits of using R. It’s a user-friendly language that provides machine learning and production-grade application development tools.

In the past, we’ve demonstrated R’s compatibility with TensorFlow, PyTorch, and other tools. While Python has strengths, one can still achieve impressive results with R without delving deep into complex programming. 

Looking ahead, our objective should be to ensure no noticeable difference in applications developed using either Python or R. The workflow should be consistent regardless of the chosen language. For instance, the Shiny platform now supports both R and Python and regardless of which language it’s written in, the output remains consistent in appearance and functionality. The key is achieving uniformity across platforms and results.

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups worldwide organize, share information and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications! 

Shiny App Successfully Reviewed by FDA CDER Staff (Pilot 2 Announcement 2)

By Announcement, Blog

The R Consortium is pleased to announce that on Sept 27, 2023, the R Submissions Working Group successfully completed the follow-up to the Pilot 2 R Shiny-based submission and received a response letter from FDA CDER!

To our knowledge, this is the first publicly available submission package that includes a Shiny component. 

The full response letter can be found at: https://github.com/RConsortium/submissions-wg/blob/0f1dc5c30985d413f75d196c2b6caa96231b26ee/_Documents/Summary_R_Pilot2_Submission%2027SEP2023.pdf  All submission materials can be found at:  https://github.com/RConsortium/submissions-pilot2-to-fda

The objective of the R Consortium R submission Pilot 2 Project is to test the concept that a Shiny application created with the R-language can be bundled into a submission package and transferred successfully to FDA reviewers. 

The initial submission was submitted through the eCTD gateway on Nov 18, 2022. FDA verbal responses were received from Jan-June 2023 during R submission working group meetings. The FDA response commented on minor findings on warnings and recommended best practices on filters, documentation and package source. The updated submission package addressed reported issues and was submitted on July 19, 2023. The final response letter from FDA was received on Sept 27, 2023.

As a next step, the R Consortium R Submission Working Group has initiated submission pilot 4 to explore the use of novel technologies such as Linux containers and WebAssembly to bundle a Shiny application into a self-contained package, facilitating a smoother process for both transferring and executing the application.

The initial announcement of the R Consortium R Submission pilot 2 can be found at: 

Announcement of the R Consortium R submission pilot 1:

Empowering Healthcare with R: Javier Orraca-Deatcu’s Journey from Finance to Predictive Health Models

By Blog

Javier Orraca-Deatcu of the Southern California R User Group (SoCal RUG) highlighted his work at a health insurance company for quality of life improvements through data science models. He uses R for predicting health issues in Medicare and Medicaid populations and alerting their care management teams about preventive measures and lifestyle changes to prevent these issues. He shares in-depth insight into how R has the potential to impact the lives of the elderly population. He also advises R users to start their own R user groups within their companies to expand their network.

Javier has a Bachelor’s degree in Management from the Georgia Institute of Technology. He also holds a Master’s in Business Analytics from the University of California, Irvine (UCI). He works as the Lead Machine Learning Engineer at Centene Corporation. Javier co-organizes the SoCal RUG and is also a co-founder of the Centene R Users Group. 

Please share about your background and your involvement in the R Community. What is your level of experience with the R language?

My traditional background before I pivoted into data science was financial modeling. I was doing valuations and assisting with the lifecycle of mergers and acquisitions. At the time, I was maximizing the capabilities of Microsoft Excel and Microsoft Access. Around 2015, I started hearing chatter about “data science” and became interested in how to augment my financial forecasts and simulations with code. I quit my job, returned to graduate school, and was able to dedicate all of my time to learning about business analytics, data science, and R.

The Southern California R Users Group conducts an annual hackathon, which they host in collaboration with the UCI. I was enrolled in the Master of Business Analytics program at the UCI, and I joined the hackathon with a team of classmates. That’s when I met the organizers of SoCal RUG, and it was great to network with like-minded members of the data science community.

What industry are you currently in? How do you use R in your work?

I work for a healthcare insurance company, Centene, and it was pretty exciting to see us become a Fortune 25 corporation earlier this year. Our core business is to manage care insurance products covering mostly government-sponsored healthcare insurance like Medicaid and Medicare. We are fortunate to have robust access to health data since we manage our members’ care. As a data scientist and MLE at Centene, it’s exciting that I can apply modeling techniques to support quality-of-life improvements, such as early disease identification.

Why do industry professionals come to your user group? What is the benefit for attending?

The big appeal is to network with a very diverse group of professionals in many different industries. The opportunity to come together and brainstorm problems is invaluable. We provide technical workshops and monthly meetups with new information to learn from our presenters and passionate members of the R community. 

What trends do you currently see in R language and your industry? Any trends you see developing in the near future?

I’m doing less data storytelling these days with interactive visualizations and web apps, but the evolution we are seeing with Shiny and new opinionated Shiny frameworks has been a delight. Shiny is quickly maturing as a web app framework, and it’s becoming, in my opinion, a go-to enterprise web app framework with more and more recognition. In addition, the tidyverse and tidy syntax have made it much easier for non-programmers (people without a computer science or software engineering background) to adopt R for their business needs.

Throughout grad school and professionally, I have had exposure to several other languages, including Python and Julia. But for data analysis and manipulation, it’s just so easy to do with R. The rate at which the tidymodels framework matures has also been impressive. This modeling framework acts as a sort of meta-engine that helps data scientists develop reproducible, end-to-end machine learning workflows and pipelines. The ability to develop a modeling workflow purpose-built for fast iteration, consumption at scale, and production are all big wins for corporate data scientists and MLEs.

Anything else you would like to share with R Users around the globe?

For R-enthusiastic professionals who regularly educate their coworkers about modern use cases for R, I recommend starting an internal R user group at your company; it’s a great way to network internally and share knowledge. I helped start the Centene R Users Group in 2019, and in six months, I saw our monthly meetups grow from fewer than 10 people to over 100 participants. I’ve met a lot of new teams and groups that I would otherwise never have had the opportunity to come in contact with.

A Continental Movement: LatinR Event is Face-to-Face in Montevideo, Uruguay This Year

By Announcement, Blog

LatinR 2023 is being held October 18-20, 2023, in Montevideo, Uruguay. Presentations are held in Spanish, Portuguese, or English. Register now! 

LatinR was founded in 2017. Natalia da Silva, Riva Quiroga, and Yanina Bellini Saibene are the current chairs for the 2023 edition.

The inaugural LatinR event kicked off in Buenos Aires, Argentina, followed by a second chapter in Santiago de Chile, Chile. These early events were more than just conferences; they were festive gatherings that celebrated the robust bonds the community had built over the years.

It wasn’t always easy. Da Silva, Quiroga, and Saibene have talked about the trials and tribulations of building such a successful and energetic R-focused conference.

The R Consortium has regularly been an enthusiastic sponsor of LatinR, and this year is no exception.

During COVID-19, Zoom rooms replaced physical auditoriums, but the essence of shared learning and vibrant exchange persevered. But let’s face it, we’ve all missed the face-to-face interactions, the hallway conversations, and, of course, the local flavors of each host city. That’s why there’s a buzz of anticipation for this year’s event: LatinR is returning to its roots with an in-person conference on October 18-20, 2023. Pack your bags because we’re heading to Montevideo, Uruguay!

LatinR has evolved into the cornerstone event for the R community in Latin America, a testament to what can happen when a community comes together with purpose and enthusiasm. The Montevideo edition of LatinR isn’t just another conference; it’s a homecoming, a celebration of how far we’ve come, and a toast to the exciting road ahead. So, whether you’re an R newbie or a seasoned data scientist, LatinR promises to be a confluence of minds, methods, and, most importantly, people united by a common language, both in code and culture.

For more details and to catch all the updates, don’t forget to visit https://latin-r.com/en/

Register here!

Hope to see you there!

Utilizing R for Reproducible Open Science Research in Tucson, Arizona

By Blog

The R Consortium recently talked to Adriana Picoral of R-Ladies Tucson about the diverse R community in Tucson, Arizona. Adriana founded the R-Ladies chapter in 2018 and has been actively involved with the local R community.

The group is hosting a virtual “Reproducing Open Science Research-2” event on September 15, 2023. The event focuses on reproducing an open science research paper in linguistics with experimental data. 

Please share about your background and involvement with the RUGS group.

I am an assistant professor of practice at the Department of Computer Science at the University of Arizona. My educational background includes a bachelor’s degree in computer science and a Ph.D. in applied linguistics.

My journey with R and involvement with the R community began during my graduate studies. When I was a doctoral candidate, my research focused on quantitative analysis. As a result, I had experience with other programming languages, but I used R for the first time in 2014 for my research. Unfortunately, Tucson had no R-Ladies chapter, so I wanted to establish a local presence. Therefore, in 2018, I founded the R-Ladies Tucson.

It has been five years since I started this chapter. Many of our events initially focused on linguistics and applied linguistics, which was my study area as a graduate student. In 2020, after successfully defending my thesis, the onset of the pandemic forced our group to shift our events online. This change helped us connect with people from all over the US. We had “Tidy Tuesday” challenges at our weekly virtual meetings with selected datasets.

Can you share what the R community is like in Tucson? 

The R community here in Arizona and Tucson is well established. I’m part of the University of Arizona, which has many Data Science programs across different departments and colleges. Although Python plays a role, the predominant focus in these programs is R. I also co-direct an initiative called the Data Science Ambassadors program, which engages graduate students. 

The R community is diverse regarding academic backgrounds, including individuals from the biology, statistics, and computer science fields. I have been the ambassador for Women in Data Science, which is also focused on R. It is diverse in terms of backgrounds but maybe not as diverse in gender identification, but we are working towards that.

You have a Meetup on Reproducing Open Science Research 2. Can you share more on the topic covered? Why this topic? 

In this meetup, we will replicate open science research. This meetup is the second event of the Reproducing Open Research Series. We chose the paper “Learning, Inside and Out: Prior Linguistic Knowledge and Learning Environment Impact Word Learning in Bilingual Individuals,” which falls within the linguistics domain and features experimental data.

We will review the paper’s analysis, facilitating its replication while educating the participants about the process. Open science is really important, and having the data available is nice. Before working on your data, engaging with external data often provides a valuable learning opportunity.

Who was the target audience for attending this event? 

R-Ladies’ events attract women, but we also welcome participants identifying as other genders. The event is aimed at graduate students lacking quantitative analysis training, focusing on language data and open science. So, I would say the target audience is women and graduate students.

Any techniques you recommend using for planning for or during the event? (Github, zoom, other) Can these techniques be used to make your group more inclusive to people unable to attend physical events in the future? 

After the pandemic, having experienced Zoom, we prefer to host most of our events online. Virtual events are much more inclusive, as participants and the speaker don’t need to commute. Another useful technique, which helps participants who do not have the software installed on their systems, is using Posit Cloud. We use Posit Cloud with our RStudio ID, so they don’t have to install anything. I demonstrate all the steps from the beginning, starting with how to create a new project on Posit Cloud, and go from there.

I also made a tutorial beforehand for the participants. We don’t record our sessions, as it encourages attendees to participate more openly and makes the events more interactive.

R User Group Philippines Turns 10

By Blog

The R User Group-Philippines (RUG-PH) celebrated its 10th anniversary on the 16th of August. The group marked the occasion with its first physical event since the pandemic, which highlighted the group’s progress over the past decade.

The RUG-PH hosted 115 events in the past decade, making it one of the most persistent RUGs. During the pandemic, many RUGs struggled to remain active; however, RUG-PH continued with online events.  

Joe Brillantes and Michelle Alarcon are the two faces behind the group’s success and brilliant track record. The R Consortium recently talked to Joe and Michelle regarding the group’s evolution. They shared their journey with R in their work and their experience keeping the group up and running for a decade. They have also witnessed a growing acceptance of R in the Philippines and the industry. 

Please share about your background and involvement with the RUGS group

Michelle: My name is Michelle Alarcon, and I have been an analytics practitioner since 1999. I used commercially available software at university and brought it with me to the jobs I had. Open source tools were a minor part of my toolkit in practice. In 2013, I founded my analytics consulting firm, Z-Lift Solutions. As a consultant, I aimed to avoid being bound to software vendors that clients might have purchased, such as SAS or SPSS. So, I began searching for a versatile tool that would allow us to offer consultation without being locked into any particular vendor.

That’s when I discovered R, which was unpopular in the Philippines back then. However, R was gaining popularity among practitioners striving to learn analytics without heavy investments. I asked a former classmate from my old school, the University of the Philippines School of Statistics, for advice when I started my consultancy. I wanted to know the tools used by the new generation of statisticians. To my surprise, the curriculum remained largely unchanged over two decades. This realization led me to explore alternatives.

My efforts to ensure a consistent talent pool for consultancy drove me to get connected with Edward Santos, a key figure in the history of the R Users Group. I also connected with Joselito Magadia, a university professor who played a crucial role in the Philippines’ CRAN network. Through Edward and Joselito, I got introduced to the R Users Group. Our annual R Users Group anniversary celebrations often include Edward.

Joe: I’m Joe Brillantes, and I first encountered R in 2007 during my studies in the US. My mathematical statistics instructor introduced me to it. While I initially leaned towards software like MATLAB or Maple, my perspective shifted when I returned to the Philippines. Because of a tight budget, I had to create a portfolio optimization model for a shipping company without using expensive software. R emerged as a more feasible solution.

When I started using R, there wasn’t a community of R Users in the Philippines. Since I was new to it, I asked many questions, mainly to my classmates or other R users in the US. And then, someone started a Google group on R users specific to the Philippines, and that’s when I joined it. It seemed very appealing to me, as I no longer needed to ask people in the US and then wait for them to respond because of the different time zones. 

We did not start the Philippines R Users Group (RUGS); credit goes to Edward Santos. However, I co-organized the group alongside Michelle for the past decade. My commitment to R persisted throughout my career, replacing MATLAB and other software in my toolkit.

Can you share what the R community is like in the Philippines?

Michelle: In the past decade, I’ve witnessed an increasing acceptance of open source programming tools in the Philippines. In 2013, awareness of open source options was scarce. AWS was pivotal in promoting open source use, joined by Java’s long-standing presence. However, the acquisition of Revolution Analytics by Microsoft was a turning point. Microsoft, at our request, provided us space for hosting our meetups, which were happening at coffee shops before that. Microsoft’s support showcased a shift toward open source.

A decade later, R has gained acceptance as a staple tool for data scientists and analysts, often mentioned alongside Python. Our user group collaborates with other tech communities, like AWS and Python. Interest in R has increased over the years. However, our meetup attendance has plateaued, maintaining a consistent level of participants even during the pandemic when we held virtual events. 

Joe: Today, data science and analytics practitioners in the Philippines typically gravitate toward Python or R. Both languages are considered essential tools. The open source nature of R fosters acceptance within organizations. If an employee is proficient in R, organizations typically approve its use. However, an area for further growth is in deploying models. The deployment of predictive and prescriptive analytics models in production remains limited. R is commonly used in data science but not widely in production environments.

You held a RUG-PH 10th anniversary meetup. Can you share some details of this event?

R Users Group-Philippines 10th Anniversary Event

Joe:  We recently celebrated the R Users Group – Philippines’ 10th anniversary. We wanted it to be special, so it was also the first time we organized an in-person meetup since the pandemic ended. There were around 20 people who attended, half of whom had attended numerous meetups in the past, while the remaining were first-timers. We were pleasantly surprised that a substantial portion of attendees were first-timers because that indicated that R usage and user groups still have significant growth potential in the Philippines.

Because it’s an anniversary, the primary topic was to review how we’ve grown and changed over the years. Our event venues changed from cafes to company offices to online. Our participants became more diverse in terms of backgrounds and moved from predominantly analysts to a mix of data engineers, data scientists, software engineers, and managers. Participants come from Metro Manila to other areas in the Philippines and even abroad. We had dinner, an icebreaker, a raffle, and networking at the event.

Some participants also volunteered to discuss data visualization for scientific publication and causal inference in future meetups. We will promote these upcoming meetups to the community through our Meetup page, Slack workspace, and Facebook page. We’re always happy to see familiar faces and to meet new R users.

Any techniques you recommend using for planning for or during the event? (GitHub, Zoom, other) Can these techniques be used to make your group more inclusive to people that are unable to attend physical events in the future?   

Joe: We encourage presenters to share their materials soon after the meetup. They usually share them through GitHub or shared drives like Google Drive, OneDrive, or Dropbox. We started recording the meetups and plan to share them on our Facebook page. We do these to help attendees continue learning, reach those who couldn’t make it, and encourage future attendance.

We’re still exploring the best way to do hybrid meetups. People attending online usually feel left out in hybrid meetings because of low-quality equipment and poor internet connections. Speakers usually select the in-person format, as it requires less time and effort than preparing for a hybrid setup. We’re still figuring out the best way to have hybrid meetups that do not isolate online attendees. In the meantime, we ask presenters for their preferred format: in-person, online, or hybrid. We go with the presenters’ choice because they are the best people to decide how their content can be best communicated.

I would also like to take this opportunity to reach out to RUGs around the globe. We at the RUG-PH are excited to be part of the global R community through the R Consortium. We look forward to collaborating with other RUGs and welcoming participants from around the globe. 

How do I Join?

R Consortium’s R User Group and Small Conference Support Program (RUGS) provides grants to help R groups around the world organize, share information and support each other. We have given grants over the past four years, encompassing over 65,000 members in 35 countries. We would like to include you! Cash grants and meetup.com accounts are awarded based on the intended use of the funds and the amount of money available to distribute. We are now accepting applications!

First Publicly Available R-Based Submission Package Submitted to FDA (Pilot 3)

By Announcement, Blog

The R Consortium is pleased to announce that on August 28, 2023, the R Submissions Working Group successfully submitted an R-based test submission pilot 3 package through the FDA eCTD gateway! The FDA CDER staff are now able to begin their evaluation process. All submission materials can be found at: https://github.com/RConsortium/submissions-pilot3-adam-to-fda 

The pilot 3 test submission is an example of an all-R submission package following eCTD specifications. The package includes the installation and loading of the proprietary {pilot3} R package and other open-source R packages, R scripts for the analysis data model (ADaM) datasets from pilot 3 and tables, listings, figures (TLFs) from pilot 1, the analysis data reviewer’s guide (ADRG), and other required eCTD components. To our knowledge, this is the first publicly available R-based FDA submission package that includes R scripts to generate ADaM datasets and TLFs. We hope this submission package and our learnings can serve as a good reference for future R-based regulatory submissions from different sponsors. Additional agency feedback will be shared in future communications. For any future questions, you may contact the pilot 3 team here: https://rconsortium.github.io/submissions-pilot3-adam/main/index.html.

The working group also began working on a pilot 4 project to explore the use of novel technologies such as Linux containers and WebAssembly software to bundle a Shiny application into a self-contained package in order to facilitate a smoother process for transferring and executing the application. Stay tuned for more about pilot 4 in the future.

For past announcements on pilot 1 and pilot 2, see below.

Announcement of the R Consortium R submission pilot 1:

Announcement of the R Consortium R submission pilot 2, an R-based test submission with a Shiny component:

https://www.r-consortium.org/blog/2022/12/07/update-successful-r-based-package-submission-with-shiny-component-to-fda

About the R Consortium R Submissions Working Group

The R Consortium R Submissions Working Group is focused on improving practices for R-based clinical trial regulatory submissions.

To bring an experimental clinical product to market, electronic submission of data, computer programs, and relevant documentation is required by health authority agencies from different countries. In the past, submissions have been mainly based on the SAS language. 

In recent years, the use of open source languages, especially the R language, has become very popular in the pharmaceutical industry and research institutions. Although the health authorities accept submissions based on open source programming languages, sponsors may be hesitant to conduct submissions using open source languages due to a lack of working examples.

Therefore, the R Consortium R Submissions Working Group aims to provide R-based submission examples and to identify potential gaps during submission of these example packages. All materials, including submission examples and communications, are publicly available on the R Consortium GitHub page: https://github.com/RConsortium.

The R Consortium R Submissions Working Group includes members from more than 10 pharmaceutical companies, as well as regulatory agencies. More details of the working group can be found at: https://rconsortium.github.io/submissions-wg/.

The R Consortium R Submissions Working Group is open to anyone who is interested in joining. If interested, please contact Joseph Rickert at joseph.rickert@gmail.com

R Validation Hub’s {riskassessment} Application – Mini Series Part 2

By Blog

The R Validation Hub – a working group established within the R Consortium to support the adoption of R within a biopharmaceutical regulatory setting – held a two-part mini-series about their {riskmetric} package and {riskassessment} application. 

The full talk is available here. Part 1 is available here.

In the second part of the mini-series, the team explained in depth how the {riskassessment} application helps those making “package inclusion” requests for GxP environments, which means the application empowers users to assess package risks themselves before making an IT request. It arms them with the criteria they need to show a package meets (or fails to meet) their organization’s unique set of requirements. 

The highlight of the talk was covering upgrades and improvements made to the application. 

Here’s a breakdown of what’s new:

  • Valuable Enhancements: Aesthetic & functional enhancements were made to the ‘Report Builder’ and ‘Database Viewer.’
  • In-depth Analysis: The app now boasts enhanced support for analyzing package dependencies.
  • Tailored Customizations: More organizational-level adjustments, including a configuration file for a bespoke experience.
  • Admin Capabilities: Admin users now have the power to modify roles and privileges. This ensures a seamless workflow by determining who should partake in the review processes.
  • Explore with Ease: A new feature allows users to delve into the source contents of a package through a file browser, making exploration straightforward and comprehensive.

The R Validation Hub team also shared a sneak peek of some exhilarating features, such as {riskscore}; there’s also more in store for package exploration within the app.

A Special Note on GSK’s Contributions

GSK collaborators have generously contributed code that enhances the user experience. This new feature will enable users to delve deeper into exported functions. Imagine perusing function-level source code, documentation, and tests in one unified and easily navigable user interface. Thanks to GSK, this will soon be a reality!