R Community Explorer

August 12, 2019 / August 13th, 2019 / Blog

by Ben Ubah, Claudia Vitolo and Rick Pack

Introduction

One of the most important qualities of the R language is its thriving community. The R community has a reputation for being particularly friendly, welcoming and cohesive, which has aided R's adoption and expansion. R user groups have accordingly flourished, especially in recent years.

In this year’s Google Summer of Code program, the proposal “Data-Driven Exploration of the R Community” was selected. The project’s developer, Ben Ubah, thanks the project’s mentors, Claudia Vitolo and Rick Pack, for their contributions.

The primary motivation for this project was the need to have a consistent, data-driven, automated dashboard that provides a broad overview of global R User Groups and R-Ladies Groups.

The R Consortium and other stakeholders have invested in community expansion and sustenance initiatives such as R-Ladies, the R User Group Support (RUGS) program, Event Sponsorship, RCDI-WG and SatRdays. These promote the learning and adoption of R in many under-represented regions. They have also significantly enhanced community engagement.

As the R community has grown, no way appears to have emerged to track the inception and activity of its user groups worldwide. Is there a way to find out which regions require more representation? How do we recognize the organizers who work hard to run the events that sustain user groups? How do we easily locate and recognize the most active groups and perhaps learn from their successes? Could we somehow ascertain the impact of the initiatives set by the R Consortium and others on a global scale? Could there be a unified platform dedicated to exploring the R community in an open-ended, curiosity-driven fashion? These were the thoughts that inspired this project.

While this project is in its infancy, we have started seeing some encouraging results after the first coding phase of Google Summer of Code. We would like to share what we have achieved so far, and we welcome your feedback.

R-Ladies Groups

Since the R Consortium first funded the R-Ladies initiative, its chapters and members have spread across the globe. Perhaps partly as a result of consistent leadership and funding, R-Ladies groups are mostly managed on meetup.com and share a common naming convention. This makes it quite easy to find them on meetup.com and explore their data through the Meetup API.

Chart showing Growth of R-Ladies Groups over the years

In the first phase of Google Summer of Code, this project explored a way to track R-Ladies groups globally through the Meetup API, using the meetupr package developed by R-Ladies.

This exploration was intended to be completely data-driven and automated, yet rendered through a static dashboard hosted on GitHub Pages. R-Ladies already have a Shiny dashboard, which can only run on a Shiny Server. Inspired by that dashboard, we developed one with some useful differences: faster loading, additional aesthetic features such as thematic coloring, and additional tabular displays, charts and counts.

What Has Been Achieved

For the R-Ladies dashboard, the following were achieved:

  1. Extracted R-Ladies chapters from Meetup.com using the meetupr package
  2. Improved the existing find_groups() and get_events() functions in meetupr to meet our requirements
  3. Transformed the data from Meetup to required formats
  4. Persisted the data on GitHub
  5. Developed a static HTML dashboard interface based on an open-source Bootstrap template
  6. Rendered the persisted data via the dashboard interface
  7. Automated the process
  8. Deployed it via GitHub Pages

The Tools We Used

To accomplish the above, we used a mix of the tools listed below:

  1. R, RStudio and the following packages: meetupr, curl, jsonlite and leafletR
  2. JavaScript and the following libraries: jquery.js, d3.js, echarts.js, leaflet.js and lodash.js
  3. Gentelella Admin Dashboard Bootstrap HTML template
  4. Travis CI to build the project, execute R scripts and bash commands
  5. Bash commands to call R scripts and commit modified files to GitHub

How We Achieved it

  1. We used the meetupr package to retrieve R-Ladies Groups from meetup.com with an R script.
  2. We further analyzed this data and computed several summaries from it. We used the leafletR package to transform our data frame to GeoJSON, and used this GeoJSON file to create a leaflet map using leaflet.js. In this map, R-Ladies groups are divided into three categories, each with its own marker color: Active (purple), Inactive (dark-purple), and Unbegun (orange). Active groups have had an event in the past 180 days or have an upcoming event. Inactive groups have not had an event in the past 180 days and do not have an upcoming event. Unbegun groups have never had an event and have none planned.
  3. We persisted all data and our summaries in CSV / JSON files. After each Travis build, the data and our summaries are updated straight from the Meetup API.
  4. We wrote bash commands to run our R scripts and commit updated CSV / JSON files to GitHub after every Travis build.
  5. We set up Travis cron jobs to build this project daily and update our data.
  6. We then customized the Gentelella Admin Dashboard Bootstrap HTML template to our requirements.
  7. We rendered our summaries via widgets on this dashboard, and used JavaScript libraries to perform other, simpler summaries and to produce maps, charts and tables.
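
The three status rules described in step 2 can be written as a small function. The following is a minimal sketch in JavaScript (the language of the dashboard's front end), not the project's actual code. The function name classifyGroup and the field names lastEventDate and upcomingEvents are hypothetical, and the sketch assumes that a group with no past events but with a scheduled event counts as Active.

```javascript
// Illustrative sketch only. The field names (lastEventDate, upcomingEvents)
// are hypothetical and are not taken from the project's data files.
const DAY_MS = 24 * 60 * 60 * 1000;
const ACTIVE_WINDOW_DAYS = 180;

// Returns "Active", "Inactive" or "Unbegun" for one group.
//   group.lastEventDate  : ISO date string of the most recent past event, or null
//   group.upcomingEvents : number of scheduled future events
function classifyGroup(group, now = new Date()) {
  // Any upcoming event makes the group Active.
  if (group.upcomingEvents > 0) {
    return "Active";
  }
  // No past event and nothing planned: the group has not started yet.
  if (group.lastEventDate === null) {
    return "Unbegun";
  }
  // Otherwise the age of the most recent event decides.
  const daysSinceLast =
    (now.getTime() - new Date(group.lastEventDate).getTime()) / DAY_MS;
  return daysSinceLast <= ACTIVE_WINDOW_DAYS ? "Active" : "Inactive";
}

// Example with a fixed reference date so the result does not depend on today.
const referenceDate = new Date("2019-08-12T00:00:00Z");
console.log(
  classifyGroup(
    { lastEventDate: "2019-06-01T00:00:00Z", upcomingEvents: 0 },
    referenceDate
  )
); // Active
```

Passing the reference date as a parameter keeps the rule easy to check against known cases, which matters when the classification is recomputed on every daily build.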

The Result

In the end, we have an open-source dashboard for R-Ladies whose data is updated daily, yet which is built as a static site and hosted via GitHub Pages. This could be seen as another approach to building information dashboards with R as a back-end technology, keeping data processing separate from data presentation.

At the time of writing, there are 165 R-Ladies chapters with 50,000+ members across 47 countries and 162 cities, with more than 1,580 past events and many upcoming. 71% of R-Ladies chapters are active, 13% are inactive, and 16% are unbegun. Unbegun groups have members but have not started organizing events yet. We observe that new members join the R-Ladies community daily.

The pop-up markers in the leaflet map display important information about each R-Ladies chapter including a link to the group’s webpage, number of events, status, inactive months, and how to become an organizer for inactive/unbegun groups.
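
To make the link between the GeoJSON file and the pop-ups concrete, here is a rough sketch of how group records might be turned into GeoJSON features whose properties feed a pop-up. It is not the project's code (the project uses the leafletR package in R for the conversion). The function names toFeatureCollection and popupHtml, the property names, and the sample records are all hypothetical.

```javascript
// Illustrative sketch only. Property names (name, url, status, pastEvents,
// lat, lon) are hypothetical and are not the project's actual schema.

// Converts an array of group records into a GeoJSON FeatureCollection.
// Records without numeric coordinates are skipped.
function toFeatureCollection(groups) {
  const features = groups
    .filter((g) => Number.isFinite(g.lat) && Number.isFinite(g.lon))
    .map((g) => ({
      type: "Feature",
      // GeoJSON orders coordinates as [longitude, latitude].
      geometry: { type: "Point", coordinates: [g.lon, g.lat] },
      properties: {
        name: g.name,
        url: g.url,
        status: g.status,
        pastEvents: g.pastEvents,
      },
    }));
  return { type: "FeatureCollection", features };
}

// Builds the HTML shown in a marker pop-up from a feature's properties.
function popupHtml(p) {
  return (
    `<a href="${p.url}">${p.name}</a><br>` +
    `Status: ${p.status}<br>Past events: ${p.pastEvents}`
  );
}

// Made-up sample records; the second one has no coordinates and is dropped.
const sampleGroups = [
  { name: "Example Group A", url: "https://example.org/a", status: "Active",
    pastEvents: 12, lat: 51.5, lon: -0.12 },
  { name: "Example Group B", url: "https://example.org/b", status: "Unbegun",
    pastEvents: 0, lat: null, lon: null },
];
const collection = toFeatureCollection(sampleGroups);
console.log(JSON.stringify(collection, null, 2));
```

In the browser, a collection like this could be handed to Leaflet's L.geoJSON with an onEachFeature callback that calls layer.bindPopup(popupHtml(feature.properties)).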

Feedback

We are just starting this project and hope to expand its reach far beyond its current state. We would love to hear from you if you have any ideas or find issues. Feel free to Follow / Star the project at its GitHub repo: https://github.com/benubah/r-community-explorer/

Next

We have started working on general R user groups and plan to report our progress soon, along with some lessons we have learned.