Building the R Community in Southern Africa

By Blog

By Heather Turner, Chair of Forwards, the R Foundation taskforce for underrepresented groups in the R Community

In this post I will give the background to the Forwards Southern Africa 2020 project, for which we are running a crowd-funding campaign until February 5, 2020.

On March 6-7, 2020, Johannesburg will host the fourth satRday to be held in South Africa. satRdays are community-led, regional conferences that support collaboration, networking and innovation within the R community. They were initiated by an R Consortium-funded project that ran pilot events in Budapest and Cape Town in 2016/2017. The conference series has been expanding around the world since then, with ten events in 2019.

For Joburg satRday 2020, I was invited to be a keynote speaker. As chair of Forwards, the R Foundation taskforce for underrepresented groups, I saw this as an opportunity to create an initiative focused on building the R Community in Southern Africa.

A first step was to offer a workshop on R package development, using the materials developed under the R Consortium project, Forwards Workshops for Women and Girls. This project ran package development workshops for women in New Zealand, Budapest and Chicago. Since there are still some funds left in the grant, we are able to offer some scholarships to women in Africa to attend the Joburg workshop and satRday. Women with visa-free access to South Africa may apply; the deadline for applications is midnight SAST, January 31.

The next step was to look beyond South Africa, to neighbouring countries. The following map shows cities in Africa with R-Ladies groups (purple), R User Groups (blue) or both (blue-grey):

The AfricaR consortium that took off at the start of 2019 has really helped to support the R community across Africa and has led to the founding of several R User Groups, as well as the first satRday in East Africa (Kampala 2019) and the first satRday in West Africa, which will take place in Abidjan on February 1, 2020. In Southern Africa, there are strong R User Groups and R-Ladies groups in both Cape Town and Johannesburg, but the R Community is only just starting to go beyond South Africa, with the establishment of Eswatini useRs last year.

UPDATE: The Abidjan satRday event was a big success! Here’s a photo of the full group. Videos of the talks should be available online soon.

The Forwards Southern Africa Project aims to build on this foundation, by organizing free workshops and meetups in collaboration with local partners in Eswatini, Botswana and Namibia. This project is also supported by the WhyR Foundation and AfricaR. The details of the events are still being finalised, but the planned itinerary is as follows:

Windhoek, Namibia (March 4, 2020, TBC)

In partnership with the Department of Statistics and Population Studies, University of Namibia:

  • Introduction to R for data analysis workshop (1 day)
  • Launch event of the first R User Group in Namibia

Manzini, Eswatini (March 11-12, 2020)

In partnership with the recently established Eswatini useR group. Registration is open for this two-day event, which includes:

  • Introduction to R for data analysis workshop (1 day)
  • Data visualization workshop (1/2 day)
  • Meetup including talk on the R community and resources available for newcomers

Gaborone, Botswana (March 14, 2020)

In partnership with WiMLDS Gaborone and PyData Botswana:

  • Introduction to R workshop (1/2 day)

All these events can be supported via the crowdfunder, where further updates will be posted. Updates will also be shared on the Forwards Twitter account.

Uniting Local R Users in Spain – Users Murcia R (UMUR)

By Blog

By Aurora González-Vidal (president) and Antonio Maurandi (vice-president), Users Murcia R (UMUR)

UMUR (Users Murcia R) is an association whose first official act was organizing the X National Spanish meeting of R users in Murcia (2018), which marked a turning point for this annual meeting. We brought two amazing speakers: François Husson, who joined us in person, and Julia Silge, who participated by video call. Since then, we have held meetings every other month (workshops, talks…) with attendance of 35-45 people. We are trying not only to unite local people but also to give them the chance to meet the people they look up to in the R community and to share in the R spirit. Recently we also had the opportunity to meet Max Kuhn at the XI National meeting of R users, held in Madrid (2019).

Our secret, as a young organization achieving high participation, is that we are made up of at least two small groups, plus individuals, that were each independently “spreading the word of R” in their own environments.

The first informal group is called 00Rteam. We are based in academia and teach a large number of R courses aimed at specific audiences at the university level: PhD students (writing scientific papers with R Markdown, introduction to R and RStudio, data tabulation in R, hypothesis testing in R, multivariate analysis in R), teachers (machine learning with R) and administrative staff (R4U).

Another informal group is made up of mathematicians and economists from the Faculty of Economics and Business, who are working on integrating R into their daily teaching. They use R and RStudio to create interactive teaching materials, exploring packages like rmarkdown, shiny, swirl and exams for teaching Statistics on the courses offered by the faculty.

Apart from that, our board of directors includes professionals who are actively spreading R in engineering businesses and banks, who have personal blogs, and who have authored manuals about R. Having this mix of interdisciplinary and enthusiastic people in charge of the association has attracted a pool of interested people, which is bringing a lot of joy and knowledge exchange to Murcia.

An important characteristic of UMUR is that the number of women on the board is higher than the number of men. We are sensitive to gender equality, and we want to be an example of parity in the technology space showing that we are diverse in many ways. We think this fact is the key to success. 

Signed: Aurora González-Vidal (president) and Antonio Maurandi (vice-president)

XI Conference of R Users (Madrid, Spain, Nov 14-16) Welcomes Over 200 Attendees

By Blog, Events

Thank you to Carlos Ortega, Principal Data Scientist, Teradata, for providing this summary and pictures from the conference

The XI Conference of R Users (XI Jornadas de Usuarios de R), held November 14 – 16 in Madrid, Spain, was organized by the Asociación Comunidad R Hispano. The ambitious program and the invited international speakers drew a large turnout of more than 200 attendees. The conference was split across two venues, Repsol (a Spanish oil and gas company) and UNED (the Spanish distance learning university), highlighting the university-business combination that has been one of the key factors in its success.

On Thursday, November 14, the opening ceremony was held at the Repsol Campus auditorium and attended by Emilio López Cano (president of the Asociación Comunidad R Hispano), Julio Gonzalo (deputy vice chancellor for research at UNED), Enrique Dameno (Director of Digitalization and Integrated Customer Management of Repsol), and Teresa García (Repsol).

Max Kuhn (RStudio) gave a lecture on “Modeling in the Tidyverse.” The round table that followed, “R in business,” covered the crucial role of data scientists in solving problems in diverse areas. Raúl Vaquerizo (Pont Group), Noelia Ruiz (Mutua Madrileña), Jorge Ayuso (Telefónica España), Enrique Lasso (Repsol) and Carlos Ortega (Teradata) participated in the round table.

On the 15th and 16th, at the School of Education of the UNED, an extensive and vibrant program took place, with workshops, communications sessions, “lightning sessions,” poster sessions, round tables and invited talks. Bernd Bischl (University of Munich) gave a lecture on mlr3, Jo-Fai Chow (H2O.ai) presented “Automatic and explainable machine learning in R,” and Max Kuhn gave a workshop on “Designing R modeling packages.”

Following the multidisciplinary philosophy of using R to handle any kind of data, communications sessions dealt with applications in genetics, data analysis, model and project management, society and culture, surveys and education, medicine and veterinary science, and economics and business. In addition to these thematic sessions, the “lightning sessions” dealt with many different topics.

A round table on Data Journalism closed the conference. It was moderated by Leonardo Hansa (R-Hispano), with Virginia Peón (Indigitall), Alba Martín (Newtral), Antonio Delgado (Datadista) and Carmen Aguilar (Sky News) participating. The discussion highlighted the importance of treating data in an appropriate and honest way, so that the information that reaches the public is truthful.

In the closing ceremony, the prize for the Best Young Work of the Conference was announced, which went to Rocío Aznar Gimeno (Technological Institute of Aragon) for the work “Multilevel mixed models: An application of the lme4 library to estimate the fetal weight percentile in twin pregnancies.”

Sessions Available

Many of the sessions were streamed and recorded. They are accessible through the UNED Channel (Canal UNED): https://canal.uned.es/series/5dc3f7d05578f252041fc22d

R Consortium Infrastructure Steering Committee Chair Wins 2019 COPSS Presidents’ Award

By Announcement, Blog

Congratulations to our very own Hadley Wickham, Infrastructure Steering Committee Chairperson, for winning the 2019 COPSS Presidents’ Award, sometimes called the “Nobel Prize of Statistics.” The award is given to a person under the age of 41, in recognition of outstanding contributions to the profession of statistics. According to Wikipedia, the COPSS Presidents’ Award and the International Prize in Statistics are considered the two highest awards in Statistics.

The award citation recognized Wickham’s “influential work in statistical computing, visualization, graphics, and data analysis” including “making statistical thinking and computing accessible to a large audience.”

In previous years, the award has primarily recognized theoretical contributions to statistics. This year is the first time it has been awarded for practical application.

Hadley is Chief Scientist at RStudio, a Platinum member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. Skill with statistics runs in the family: his sister is an Assistant Professor of Statistics at Oregon State University.

Hadley builds tools – both computational and cognitive – to make data science easier, faster, and more fun. His work includes packages for data science (a pioneering suite of tools for R known as the “Tidyverse,” including ggplot2, dplyr, tidyr, purrr, and readr) and for principled software development (roxygen2, testthat, devtools). He is also a writer, educator, and speaker promoting the use of R for data science. Learn more on his website, http://hadley.nz.

Congratulations, Hadley!

Data-Driven Tracking and Discovery of R Consortium Activities

By Blog

by Benaiah Ubah

R is a fast-growing language for statistical computing and graphics, backed by a powerfully inclusive community of users and developers. The R community received a significant boost when some enterprises came together to establish the R Consortium in 2015. Since then, the R Consortium has clearly proven its purpose by operating transparently and in an unbiased manner, supporting the R Foundation, infrastructure that broadly affects the R community, tools that enhance the R software, R user-groups, events and diversity on a global scale. R Consortium’s top level projects (R-Hub, R-Ladies, the RUGS program, Events sponsorship, and the R Community Diversity and Inclusion program), working groups and other ISC funded projects highlight the significance of R Consortium’s involvement as a major supporter of several critical developments around R in recent times.

To further enhance transparency, measure impact and achieve even greater community inclusiveness, in Fall 2018 the R Consortium funded a new data-driven initiative to provide a way for the R community to discover and track its activities over the years. This infrastructure is dedicated to curating and rendering R Consortium activities via dashboards using open-source technologies. All data and code are available at this GitHub repository, which is primarily maintained by me.

For a start, the ISC approved the development of dashboards that highlight R Consortium’s accomplishments, with a focus on ISC Funded Projects, the RUGS program and the Events/Marketing program. I am delighted to report that this initial scope has been successfully covered and the corresponding milestones delivered. The next iteration of development will include more aspects of R Consortium’s activities that have broad impact on the larger R community. The following sections of this article present the reasons why a data-driven initiative is useful for tracking R Consortium’s activities, the deliverables for this project, its benefits and future directions.

Why a data-driven initiative to track R Consortium activities?

1. In the past 5 years, R Consortium has supported many R initiatives that encompass user-groups, events, diversity, technical infrastructure, documentation, developing teaching materials, working groups, etc. But how could the impact of these initiatives be measured in numbers over the years? How could the global distribution of activities like the user-group and event support programs be ascertained?

2. ISC funded projects (both completed and ongoing) are usually curated on a single web page.  This initiative provides a way for searching for these projects by year, grant-cycle, status, primary investigator, etc. 

3. Before embarking on this project, there was no way of ascertaining the distribution of funding across work-products.  A data-driven infrastructure will help those without experience applying for ISC grants, by giving them an overview of work-products and cash-grant ranges that have received more funding over time.

4. R Consortium’s decision makers may find a data-driven initiative helpful in planning future programs and packages.

5. Prospective members contemplating joining the R Consortium could easily find and understand its past accomplishments in a broad, transparent, insightful, and aggregated manner.

6. Finally, comparing R Consortium’s mission statement with its accomplishments from a data-driven perspective is something that the R Foundation, the global R community, and present and future members of the R Consortium would like to track and provide feedback on over time, for the long-term growth and stability of the R ecosystem.

Project Deliverable

We now present to the R community a suite of dashboard pages that render the corresponding R Consortium activities in a data-driven manner:

  1. ISC funded projects dashboard
  2. R User Group Support program dashboard
  3. Events / Marketing program dashboard
  4. A landing dashboard page that summarizes details from other dashboard pages for enhanced user experience.
  5. A GitHub repository to find all code and data for this infrastructure.

Benefits

  1. ISC projects dashboard: Easily find ISC projects, with enough information to contact project owners for those thinking of contributing to projects. Find the most popular work-products and cash-grant ranges for those without experience applying for grants.
  2. RUGS program dashboard: Understand the global distribution of funded user-groups and their funding-level distribution. Find information about these groups and how to get in touch with those within your reach.
  3. Events / Marketing dashboard: Understand the global distribution of sponsored events.
  4. Landing dashboard: Find aggregated summaries around all of ISC projects, RUGS program, Events/Marketing program and the R-Ladies project.

Future Directions

It would be interesting to extend the dashboards on a running basis to more R Consortium activities, such as working groups and ISC projects that have observable global impact.

Join R Consortium

If you are an enterprise that benefits from using the R environment, please consider joining R Consortium to make the R ecosystem a better one.

Acknowledgments

I appreciate the contributions that came from John Mertic, Hadley Wickham and Joseph Rickert especially at the initial phases of this idea.

Get Funded by the R Consortium – Call for Proposals Open Now!

By Blog

Strengthen the R community with Your Project

The R Consortium is committed to supporting the R community by funding projects that create important infrastructure and fortify long term stability for the R Community. The R Consortium’s Infrastructure Steering Committee (ISC) has developed a grant program that looks to help the broader R community.

The Call for Proposals opens today, September 13, 2019, and runs for a full month, through October 14, 2019.

This is the fourth year of funding, and over $1,000,000 has been given out in sponsorships and grants.

We encourage you to apply, even those without experience applying for grants.

Apply now!

In this round, the ISC is looking for projects that:

  • Are likely to have a broad impact on the R community.
  • Have a focused scope. Simple is better than over-ambitious. Larger projects can often be broken up into smaller steps.

The process for submitting a proposal has been updated annually to ensure that it is as smooth as possible. Full details on proposal requirements, examples of previous projects, suggestions for what to avoid, and more, are included here.

If you have any questions about the proposals or submission process, please write to proposal@r-consortium.org.

Apply now!


R Community Explorer – R User Groups

By Blog

By Ben Ubah, Claudia Vitolo and Rick Pack

We recently announced an R-Ladies focused open-source dynamic dashboard built using R and JavaScript. That work has now been extended to encompass all R user groups organized through Meetup.com. You can find this new dashboard at this link and its code here.

The R user group support program and the R-Ladies project are featured in two out of three top-level R Consortium projects.

How We Identified R User Groups on Meetup

Identifying all R user groups on Meetup.com required more effort than identifying R-Ladies groups. While R-Ladies groups are centrally created and their names follow a standard convention, the names of other R user groups are more difficult to predict.

We extended Curtis Kephart’s technique for using string matching to retrieve upcoming R events to:

  • Match, among all data science groups on Meetup (7,700+), those with strings like “r user”, “r-user”, “r-lab”, “phillyr”, “rug”, “bioconductor” and “r-data” in their Meetup URL names. We then performed a second round of string matching to search for strings like “programming-in-r”, “r-programming-”, “-using-r”, “r-language”, and “r-project-for-statistical” in the groups’ topics field.
  • Retrieve all user groups that mention “r-project-for-statistical-computing” in their topics separately.
  • Retrieve all R-Ladies groups separately, which was necessary to avoid missing some groups.
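
The matching rounds above can be sketched in a few lines. This is only an illustration, written in Python rather than the R code the project actually used. The substrings come from the list above, but the group records, the field names `urlname` and `topics`, and the helper names are hypothetical. It also assumes the second round adds groups to the first (a union), which the description above does not spell out.

```python
# Substrings quoted in the post. Everything else here is illustrative.
URLNAME_PATTERNS = ["r user", "r-user", "r-lab", "phillyr", "rug", "bioconductor", "r-data"]
TOPIC_PATTERNS = ["programming-in-r", "r-programming-", "-using-r",
                  "r-language", "r-project-for-statistical"]


def matches_any(text, patterns):
    """Return True if any pattern occurs as a substring of text (case-insensitive)."""
    text = text.lower()
    return any(pattern in text for pattern in patterns)


def is_r_user_group(group):
    """First round: URL name. Second round: topic slugs.

    Assumption: a group is kept if either round matches.
    """
    if matches_any(group["urlname"], URLNAME_PATTERNS):
        return True
    return any(matches_any(topic, TOPIC_PATTERNS) for topic in group["topics"])


# Hypothetical records standing in for data retrieved from Meetup.
groups = [
    {"urlname": "phillyr", "topics": ["data-analytics"]},
    {"urlname": "data-science-lagos", "topics": ["r-project-for-statistical-computing"]},
    {"urlname": "python-berlin", "topics": ["python"]},
]

r_groups = [g["urlname"] for g in groups if is_r_user_group(g)]
```

Plain substring matching is deliberately loose: a short pattern like “rug” will also hit unrelated URL names, which is one reason a topic-based round and a separate R-Ladies retrieval are useful alongside it.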

Procedure

For this dashboard, the following procedure was followed:

  1. We used the meetupr package to extract R user groups from Meetup.com
  2. Improved the existing find_groups() and get_events() functions in meetupr to meet our requirements and switched from the defunct Meetup API keys to the OAuth 2.0 authentication system. This switch was quite complicated and will be discussed further in another article.
  3. Transformed the data retrieved from Meetup via meetupr from data frames to JSON, GeoJSON and CSV
  4. Stored the data by committing the JSON/GeoJSON/CSV files to the GitHub repository of the project.
  5. Developed a static HTML dashboard interface based on an open-source Bootstrap template
  6. Rendered the stored data via the dashboard interface
  7. Automated the process of extracting R user groups, data transformation and storage.
  8. Deployed the dashboard via GitHub Pages
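
To make step 3 more concrete, here is a minimal sketch of turning tabular group records into GeoJSON and CSV. The project did this in R with jsonlite and leafletR; the Python below is a stand-in for illustration only, and the sample records, field names and function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical rows standing in for a data frame of groups retrieved from Meetup.
groups = [
    {"name": "Example R Users", "city": "Lagos", "lat": 6.45, "lon": 3.39, "members": 120},
    {"name": "Sample useR Group", "city": "Madrid", "lat": 40.42, "lon": -3.7, "members": 85},
]


def to_geojson(records):
    """Build a GeoJSON FeatureCollection of points.

    GeoJSON coordinates are ordered [longitude, latitude].
    """
    features = []
    for record in records:
        properties = {k: v for k, v in record.items() if k not in ("lat", "lon")}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [record["lon"], record["lat"]]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


def to_csv(records):
    """Serialize the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


geojson_text = json.dumps(to_geojson(groups), indent=2)
csv_text = to_csv(groups)
```

Files produced this way are plain text, so they can be committed to a repository and read directly by a static page, which is what allows the dashboard to be served from GitHub Pages without a server-side component.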

The Tools We Used

Combining R (for data-analysis) and JavaScript (for data-presentation) is at the heart of this project as this combination offers great flexibility with automation and deployment.

We used a mix of these tools to develop the dashboard:

  1. R, RStudio and the following packages: meetupr, curl, jsonlite and leafletR
  2. JavaScript and the following libraries: jquery.js, d3.js, echarts.js, leaflet.js, leaflet-markercluster.js and lodash.js
  3. Gentelella Admin Dashboard Bootstrap HTML template
  4. Travis CI to automatically build the project, execute R scripts and bash commands
  5. Bash commands to call R scripts and commit modified files to GitHub

Acknowledgments

We appreciate Curtis Kephart (RStudio) for contributing code that helped us with ideas on identifying R user groups on Meetup.

We also thank the authors of the meetupr package for their excellent work. Special thanks to Jenny Bryan, Erin LeDell, and Greg Sutcliffe for their help over the last month with implementing the requirements for the new Meetup OAuth 2.0 authentication system.

New R Consortium Blog Guidelines

By Blog

The R Consortium is posting new blog guidelines to help facilitate posts from members, ISC grants recipients and the community at large. Please review and send in your ideas!

R Consortium Blog Overview

The R Consortium blog will serve as a channel for the members, ISC grant recipients and the community at large to broadcast to a wide audience how their work and engagement is growing opportunities for the R language for data science and statistical computing.

This may include summaries of how leading institutions, companies and developers are using, developing and advancing R. 

Those involved with developing, maintaining, distributing, and using R software are encouraged to contribute to the blog. 

Guest posts from the R Consortium community at large or projects funded by the ISC that enhance R and support users are welcomed. Updates about R-related conferences (including useR!), meetings (including SatRDays and RLadies), local user groups worldwide, new working groups or programs for R language certification and training are of interest. Other topics would certainly be considered, but it should be something of interest to the broader R community. 

Accepted blog posts are at the sole discretion of the R Consortium.

Quality

We are looking for posts that teach and give value to our community. Blogs should include the meta-narrative that “R is a fast-growing language for statistical computing and graphics” and “the R Consortium supports the worldwide community of users, maintainers and developers of R software.”

Guest posts must be vendor neutral, though they may mention vendors involved in specific deployment or adoption paths, their hosting of or speaking at an in-person event, or other indications of meaningful participation in the community. A post shouldn’t feel like an advertisement for your product, services or company, though. Your post must be your own content, but it can be published elsewhere on the Internet with permission from that website. All content should have a byline (preferably by a company engineer) and be published under Creative Commons with Attribution, so you’re welcome to re-publish on your own blog.

The most interesting posts are those that teach or show how to do something in a way others may not have thought of. Good blog posts show hurdles that were encountered and explain how they were overcome (not that everything is rainbows and unicorns). When showing upstreaming of a patch fixing an issue for others, link back to the GitHub issue, so readers can follow along. We don’t avoid critical commentary or broad issues, but approach them with sensitivity, professionalism and tact in a way that is beneficial and positive for the community. It would be helpful to the R Consortium to discuss how to choose between different technologies and how to accommodate different legacy issues and cloud platforms.

Be interesting and inspiring! 

Promotion

Your blog will be shared on R Consortium’s Twitter channel. Please feel free to retweet or share. Don’t forget to share your work on your own social channels and favorite news aggregator sites. Suggested sites: Twitter, LinkedIn, Reddit, Hacker News, DZone, TechBeacon. Plus industry sites like: https://www.r-bloggers.com/about/, rweekly.org and reddit.com/r/Rlanguage.

How to submit for consideration

Please submit the blog post or a brief summary and the topic of the post to R-marketing@lists.r-consortium.org (with the Subject line: “Proposed Blog: BLOG TITLE”) for consideration. The PR team will review your submission in a timely manner and either give the green light to draft the entire article or provide feedback on next steps. If you are submitting an article or presentation that already exists, please send it in its entirety with a note on the express permission from the owner of the content. Once your submission has been approved, it will be added to our blog publishing calendar and a publish date will be provided, so you may plan to promote accordingly through your personal and company social media channels. Blog posts should be no longer than 1,000 words and no shorter than 300 words. Diagrams, code examples or photos are strongly encouraged.

$50,000 in New Grants Approved

By Blog

The R Consortium actively supports new projects to help R development both technically and organizationally. Improving R infrastructure and building for long term stability are key goals of the R Consortium. These types of support cannot be matched by individual companies. 

The newest three projects that have been awarded grants have been announced. Congratulations to R-global, R ecosystem for meta-research, and R Community Collaboratives. These ambitious projects cover two technical areas – focusing on geographical coordinates and evidence synthesis – as well as resources and support to facilitate on-the-ground organization of community R events.

In total, over $50,000 in new grants were approved.

More projects will be funded soon. Is your R project one of them? See below for more information on applying for funding.

R-global: analysing spatial data globally

Edzer Pebesma (edzer.pebesma@uni-muenster.de)

https://github.com/r-spatial/global/

Currently, a number of R spatial functions assume that coordinates are two-dimensional, taken from a “flat” space, and may or may not work for geographical (long/lat) coordinates, depicting points on a globe. This project will try to make such functions more robust and helpful for the case of geographical coordinates. It will reconsider the concept of a bounding box, and build an interface to the S2 geometry library (http://s2geometry.io/), which powers several modern systems that assume geographic coordinates.

Expanding the ‘metaverse’; an R ecosystem for meta-research

Martin Westgate (martin.westgate@anu.edu.au)

https://rmetaverse.github.io

Evidence synthesis is the process of identifying, collating and summarizing primary scientific research to provide reliable, transparent summaries such as systematic reviews and meta-analyses. Despite their importance for linking research with policy, however, evidence synthesis projects are often time-consuming, expensive, and difficult to update. Open and reproducible workflows would help address these problems, but these workflows are poorly supported by the current package environment, preventing access by new users and hindering uptake of the well-developed suite of statistical tools for meta-analysis in R. The metaverse project will integrate and expand tools to support evidence synthesis and meta-research in R; suggest flexible workflows to complete these projects in a straightforward and open manner; and provide a collector package allowing easy access to these developments for new and experienced users.

R Community Collaboratives

Angela Li (angela@angelalidata.com)

https://github.com/unconf-toolbox

Previously known as the Unconf Toolbox, R Community Collaboratives provide resources and support to facilitate on-the-ground organization of community events. These events engage individuals in the R community through in-person collaboration on open source projects. R Collabs emphasize learning and mentorship, encouraging R users to become R developers. They are inspired by the unconference organized by rOpenSci, but are designed to encourage local organizers to put on events for their own community. To do so, this project develops useful technical and logistical infrastructure for R Collab organizers. These include a website template, an organizing handbook, and a project dashboard for reporting out.

Join the Grant Program!

Strengthening the R community by improving infrastructure and building for long term stability is one of the primary focuses of the R Consortium. To achieve this, the R Consortium’s Infrastructure Steering Committee (ISC) has developed a grant program to fund development of projects that broadly help the R community.

Everyone is encouraged to apply, regardless of experience or expertise!

For a description of the types of projects that are being funded, examples of previous projects, and more, please see our information here: https://www.r-consortium.org/projects/call-for-proposals

R Community Explorer

By Blog

by Ben Ubah, Claudia Vitolo and Rick Pack

Introduction

One of the most important qualities of the R Language is its thriving community. The R community has a reputation for being particularly friendly, welcoming and cohesive, which has enhanced its adoption and expansion. R user groups have accordingly flourished, especially in recent years.

In this year’s Google Summer of Code program, the proposal, “Data-Driven Exploration of the R Community” was selected. For this, the project’s developer, Ben Ubah, thanks the project’s mentors, Claudia Vitolo and Rick Pack for their contributions.

The primary motivation for this project was the need to have a consistent, data-driven, automated dashboard that provides a broad overview of global R User Groups and R-Ladies Groups.

The R Consortium and other stakeholders have invested in community expansion and sustenance initiatives like R-Ladies, the R User Group Support (RUGS) program, Event Sponsorship, RCDI-WG and SatRdays. These promote the learning and adoption of R in many under-represented regions. They have also significantly enhanced community engagement.

As the R community has grown, no way appears to have emerged to track the inception and activity of its user groups globally. Is there a way to find out which regions require more representation? How do we recognize the organizers who put a lot of effort into running the events that sustain user groups? How do we easily locate and recognize the most active groups and perhaps learn from their successes? Could we somehow ascertain the impact of the initiatives set by the R Consortium and others on a global scale? Could there be a unified platform dedicated to exploring the R community in an open-ended curiosity-driven fashion? These were the thoughts that inspired this project.

While this project is in its infancy, we have started seeing some encouraging results after the first coding phase of Google Summer of Code. We hope to share with you what we have achieved so far, and we welcome your feedback, if you are so inclined.

R-Ladies Groups

Since the R Consortium first funded the R-Ladies initiative, there has been a sporadic diffusion of their chapters and members globally. Perhaps partially as a result of having a consistent leadership composition and funding, R-Ladies groups are mostly managed on meetup.com, and share a common naming convention. This makes it quite easy to find them on meetup.com and explore their data from the meetup API.

Chart showing Growth of R-Ladies Groups over the years

In the first phase of Google Summer of Code, this project explored a way to track R-Ladies Groups globally from the meetup API, using the meetupr package developed by R-Ladies.

This exploration was intended to be completely data-driven and automated, yet rendered via a static dashboard hosted on GitHub Pages. R-Ladies already have a Shiny dashboard, which can only run on a Shiny Server. Inspired by that dashboard, we developed one with some useful differences, such as faster loading, additional aesthetic features like thematic coloring, and additional tabular displays, charts and counts.

What Has Been Achieved

For the R-Ladies dashboard, the following were achieved:

  1. Used the meetupr package to extract R-Ladies chapters from Meetup.com
  2. Improved the existing find_groups() and get_events() functions in meetupr to meet our requirements
  3. Transformed the data from Meetup to the required formats
  4. Persisted the data on GitHub
  5. Developed a static HTML dashboard interface based on an open-source Bootstrap template
  6. Rendered the persisted data via the dashboard interface
  7. Automated the process
  8. Deployed it via GitHub Pages

The Tools We Used

To accomplish the above, we used a mix of the tools listed below:

  1. R, RStudio and the following packages: meetupr, curl, jsonlite and leafletR
  2. JavaScript and the following libraries: jquery.js, d3.js, echarts.js, leaflet.js and lodash.js
  3. Gentelella Admin Dashboard Bootstrap HTML template
  4. Travis CI to build the project, execute R scripts and bash commands
  5. Bash commands to call R scripts and commit modified files to GitHub

How We Achieved it

  1. We used the meetupr package to retrieve R-Ladies Groups from meetup.com with an R script.
  2. We further analyzed this data and computed several summaries from it. We used the leafletR package to transform our data frame to GeoJSON, and used this GeoJSON file to create a leaflet map with leaflet.js. In this map, R-Ladies groups are separated into three categories, each with its own marker color: Active (purple), Inactive (dark-purple), and Unbegun (orange). Active groups have had an event in the past 180 days or have an upcoming event. Inactive groups have held events before, but none in the past 180 days, and have no upcoming event. Unbegun groups have never had an event and have none planned.
  3. We persisted all data and our summaries in CSV / JSON files. After each Travis build, the data and our summaries are updated straight from the Meetup API.
  4. We wrote bash commands to run our R scripts, and commit updated CSV / JSON files to GitHub after every Travis build.
  5. We set up Travis cron jobs to build this project daily and update our data.
  6. We then customized the Gentelella Admin Dashboard Bootstrap HTML template to our requirements.
  7. We rendered our summaries via widgets on this dashboard, and used JavaScript libraries to perform other, simpler summaries and to produce maps, charts and tables.
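
The three statuses described in step 2 can be sketched as a small function. This is only an illustration in Python, not the project's R code: the function name, its arguments and the choice to count an event held exactly 180 days ago as recent are all assumptions.

```python
from datetime import date, timedelta
from typing import Optional

# Window taken from the text: an event in the past 180 days keeps a group active.
ACTIVE_WINDOW = timedelta(days=180)


def classify_group(last_event: Optional[date], has_upcoming: bool, today: date) -> str:
    """Return 'Active', 'Inactive' or 'Unbegun' for one group.

    last_event   -- date of the most recent past event, or None if there was never one
    has_upcoming -- True if at least one future event is scheduled
    today        -- reference date (passed in so results are reproducible)
    """
    if has_upcoming:
        # An upcoming event makes a group active, whatever its history.
        return "Active"
    if last_event is None:
        # No past events and nothing planned.
        return "Unbegun"
    if today - last_event <= ACTIVE_WINDOW:
        # Assumption: an event exactly 180 days ago still counts as recent.
        return "Active"
    return "Inactive"


if __name__ == "__main__":
    # Hypothetical groups, for illustration only.
    today = date(2019, 7, 1)
    print(classify_group(date(2019, 5, 20), False, today))
    print(classify_group(date(2018, 3, 1), False, today))
    print(classify_group(None, False, today))
```

Passing the reference date in explicitly, rather than reading the system clock inside the function, makes a daily automated build easier to test.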

The Result

In the end, we have an open-source dynamic dashboard for R-Ladies that is updated daily but is built as a static site and hosted via GitHub Pages. This could be seen as another approach to building information dashboards with R as a back-end technology, keeping data processing separate from data presentation.

At the time of writing, there are 165 R-Ladies chapters with 50,000+ members across 47 countries and 162 cities, with more than 1,580 past events and many upcoming ones. 71% of R-Ladies chapters are active, 13% are inactive, and 16% are unbegun. Unbegun groups have members but have not started organizing events yet. We also observe that new members join the R-Ladies community daily.
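
Percentages like these can be derived from a list of per-chapter statuses. The sketch below is written in Python purely for illustration and is not the project's code. The function name is hypothetical and the sample list is made up, so it does not reproduce the figures above.

```python
from collections import Counter


def status_shares(statuses):
    """Return the share of each status as a whole-number percentage."""
    total = len(statuses)
    if total == 0:
        return {}
    counts = Counter(statuses)
    # Each share is rounded independently, so the results may not sum to exactly 100.
    return {status: round(100 * n / total) for status, n in counts.items()}


if __name__ == "__main__":
    # Made-up sample of ten chapters, for illustration only.
    sample = ["Active"] * 7 + ["Inactive"] * 1 + ["Unbegun"] * 2
    print(status_shares(sample))
```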

The pop-up markers in the leaflet map display important information about each R-Ladies chapter including a link to the group’s webpage, number of events, status, inactive months, and how to become an organizer for inactive/unbegun groups.
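
To give a sense of the data behind such pop-ups, here is a minimal sketch of turning tabular chapter records into a GeoJSON FeatureCollection whose properties a map library can show in a pop-up. The project did this step in R with leafletR. This Python version is only an illustration, and the function name, field names and sample record are all hypothetical.

```python
import json


def groups_to_geojson(groups):
    """Convert a list of group records (dicts) into a GeoJSON FeatureCollection."""
    features = []
    for g in groups:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [g["lon"], g["lat"]]},
            # Everything in "properties" is available to the pop-up.
            "properties": {
                "name": g["name"],
                "url": g["url"],
                "past_events": g["past_events"],
                "status": g["status"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    sample = [{
        "name": "Example Chapter",
        "url": "https://example.org/example-chapter",
        "lat": -26.2,
        "lon": 28.0,
        "past_events": 12,
        "status": "Active",
    }]
    print(json.dumps(groups_to_geojson(sample), indent=2))
```

Keeping the status and event count in each feature's properties means the static page can style markers and fill pop-ups without any server-side code.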

Feedback

We are just starting this project and hope to expand its reach far beyond its current state. We would love to hear from you if you have any ideas or find issues. Feel free to Follow / Star the project at its GitHub repo: https://github.com/benubah/r-community-explorer/

Next

We have started working on general R user groups and plan to report our progress soon, along with some lessons we have learned.