March 2020 ISC call for proposals – Now Open!

By Announcement, Blog

The March 2020 ISC Call for Proposals is now open. Once again, we are looking for ambitious projects that will contribute to the infrastructure of the R ecosystem and benefit large sections of the R community.  Our goal is to stimulate creativity and help you turn good ideas into tangible benefits. 

It is very likely that everyone who reads this post will be reorganizing aspects of their everyday lives to cope with the challenge of the Covid-19 virus. Accordingly, we are suggesting a theme for this call for proposals: What can we do to improve the R infrastructure for locating, accessing, cleaning and reporting on data related to the epidemic that will be useful now and in the future?

In the recently published post COVID-19 epidemiology with R, researcher Tim Churches highlights some of the challenges of acquiring accurate “real time” data. These include locating sources; writing code to scrape Wikipedia, a site whose structure may change every time it is updated; digging out data embedded in multiple languages; and providing mechanisms for researchers to store data, share code and exchange ideas.

But don’t be constrained by the theme. There is other work that needs to be done and we want to hear about ideas that we may be able to facilitate.

As always, “Think Big” but structure your proposal with intermediate milestones. The ISC is not likely to fund proposals that ask for large initial cash grants. We tend to be conservative with initial grants, preferring projects structured in such a way that significant initial milestones can be achieved with modest amounts of cash.

As with any proposed project, the more detailed and credible the project plan, and the better the track record of the project team, the higher the likelihood of receiving funding. Please be sure that your proposal includes measurable objectives, intermediate milestones, a list of all team members who will be contributing work and a detailed accounting of how the grant money will be spent.

To submit a proposal for ISC funding, read the Call for Proposals page and submit a self-contained PDF using the online form. You should receive confirmation within 24 hours.

The deadline for submitting a proposal is midnight, April 2, 2020.

R Consortium Welcomes New Member ThinkR, R Language and Data Science Engineering Company

By Announcement, Blog

Services include R consulting, development, and training; ThinkR contributes to multiple R open source projects including golem, a framework for building robust Shiny apps

SAN FRANCISCO, March 3, 2020 – The R Consortium, a Linux Foundation project supporting the R Foundation and R community, today announced that ThinkR has joined the R Consortium as a Silver Member. ThinkR provides R engineering, training, and consulting, and is based in France. 

“We provide R Language infrastructure, engineering and training to our clients, and at the same time we believe it is important to give back to the R community by participating in open source projects, holding meetups and training, and promoting R in many ways. Joining the R Consortium will help us to expand our support for R even more, and allow us to work toward building better R infrastructure that helps R developers and our customers,” said Diane Beldame, CEO, ThinkR. “Joining the R Consortium will allow us to better support and promote the R community and that is a big benefit for our clients.”

ThinkR developers devote part of their time to the R and data science communities. This includes supporting various R packages on GitHub, holding meetups and conferences connected to R, posting development tips on the ThinkR blog, and answering questions on Stack Overflow and in Slack communities.

“We are excited to welcome ThinkR to the R Consortium. ThinkR is on the front lines of providing R to industries in ways that immediately contribute to their customers’ success,” said Joseph Rickert, RStudio’s R Community Ambassador and R Consortium Board Chair. “At the same time, ThinkR contributes to the R community with open source projects and much more, and we’re very pleased they will be involved in moving the R Consortium forward.”

ThinkR has clients in a wide range of industries including public institutions, pharmaceuticals, energy, banking, electronics manufacturing, research, and more.

ThinkR Resources

About The R Consortium 

The R Consortium is a 501(c)6 nonprofit organization and Linux Foundation project dedicated to the support and growth of the R user community. The R Consortium provides support to the R Foundation and to the greater R Community for projects that assist R package developers, provide documentation and training, facilitate the growth of the R Community and promote the use of the R language. For more information about R Consortium, please visit: http://www.r-consortium.org.

About Linux Foundation 

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation projects like Linux, Kubernetes, Node.js and more are considered critical to the development of the world’s most important infrastructure. Its development methodology leverages established best practices and addresses the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org

# # #

R Community Explorer – Google Summer of Code Projects

By Blog

By Benaiah Ubah, Claudia Vitolo and Rick Pack

Introduction

Google Summer of Code (GSoC) is an annual 3-month open-source software development (coding) program that provides a platform for mentors and students (mentees) to collaborate on open source projects. This article highlights our accomplishments in the final coding phase of the 2019 GSoC project: “Data-Driven Exploration of the R Community”. The first part of the project explored R-Ladies chapters, the second part explored all R user groups available through Meetup.com and in the last phase we explored Google Summer of Code projects under the R Project over the past 12 years.

What We Achieved

1. Aggregating all R-GSoC projects into a CSV file that lists the names of students, mentors, and projects, then computing summaries for students, mentors, and projects and storing them in JSON format

2. Assigning each of the 215 R-GSoC projects to one work-product category: Package, Infrastructure, Data, Database, GUI, Visualization, Documentation or Application

3. Updating the names of students and mentors to maintain consistency – some names are abbreviated, some are just Google user names and others appear differently across projects

4. Charting work-product distribution using grouping functions from d3.js and charting functions from echarts.js

5. Building a dashboard using tools similar to those described in our earlier article

6. Creating a word-cloud from the projects’ topics using d3.js and d3-layout.cloud.js libraries, and charting the top 20 frequent words
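Steps 1, 3 and 6 above can be sketched in a few lines. The project itself used CSV and JSON files with d3.js and echarts.js for the charts; the Python below is only an illustration of the same idea, and the records, the alias table, the stop-word list and the function names are all hypothetical, since the article does not publish its data layout.

```python
import json
import re
from collections import Counter

# Hypothetical records; the real CSV columns are not given in the article.
PROJECTS = [
    {"title": "Interactive data visualization package", "student": "A. Student"},
    {"title": "Package for biodiversity data", "student": "Ada Student"},
    {"title": "Faster data import", "student": "bob_dev"},
]

# Hypothetical alias table: abbreviated names and user names -> one spelling.
ALIASES = {"A. Student": "Ada Student", "bob_dev": "Bob Developer"}

# Tiny illustrative stop-word list.
STOP_WORDS = {"for", "the", "of", "and", "in"}


def canonical_name(name):
    """Map a name variant to its canonical spelling (step 3)."""
    name = name.strip()
    return ALIASES.get(name, name)


def top_words(titles, n=20):
    """Count the n most frequent non-stop words in project titles (step 6)."""
    words = []
    for title in titles:
        words += [w for w in re.findall(r"[a-z]+", title.lower())
                  if w not in STOP_WORDS]
    return Counter(words).most_common(n)


# Step 1: compute a summary and serialize it as JSON.
students = sorted({canonical_name(p["student"]) for p in PROJECTS})
summary = {
    "projects": len(PROJECTS),
    "students": students,
    "top_words": top_words(p["title"] for p in PROJECTS),
}
print(json.dumps(summary, indent=2))
```

Keeping the alias table as plain data makes the manual name clean-up reviewable: each merged spelling is one visible entry rather than a rule hidden in code.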

While you may not read about R-Google Summer of Code (R-GSoC) activities every day via blog posts and Twitter, many important R contributions have emerged from R-GSoC activities. Example past R-GSoC projects include enhancements of Toby Dylan Hocking’s animint [animated interactive plots] package and statistical modeling R packages like the Stan-using BayesHMM.

This is a screenshot from our R Community Explorer’s “Past GSoC R Projects” section:

Our previous articles discussed our celebration of R-Ladies and the general R User Group community through open-source dashboards that highlight the growth, geographical distribution, and activity of the R community on Meetup. We hope that applying a similar approach to R-GSoC projects will encourage more R-GSoC proposals, increase consideration of prior projects, and attract more participants to the R ecosystem.

Dashboard Summaries

The following summaries are displayed on the dashboard:

+ Most active mentors

+ Students returning as mentors

+ Students returning for another GSoC

+ Counts and averages of projects, students and mentors

+ Count of projects co-mentored by former GSoC students

+ Work-product distribution

The dashboard can be found at this link: https://benubah.github.io/r-community-explorer/gsoc.html
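Several of the summaries listed above reduce to simple set and counting operations over project records. The following is a minimal sketch, not the dashboard's actual code: the records are invented, and the rule that a "former student" is anyone whose first student year precedes the project year is an assumption.

```python
from collections import Counter

# Hypothetical records: (year, student, mentors). Names are invented.
PROJECTS = [
    (2015, "Ana", ["Max", "Lee"]),
    (2016, "Ana", ["Max", "Kim"]),
    (2017, "Ben", ["Ana", "Lee"]),
    (2018, "Cy", ["Ben", "Max"]),
]

# First year in which each person took part as a student.
first_year = {}
for year, student, _ in PROJECTS:
    first_year[student] = min(year, first_year.get(student, year))


def is_former_student(name, year):
    """Assumed rule: the person was a student in an earlier year."""
    return name in first_year and first_year[name] < year


# Students returning as mentors.
returned_as_mentor = {m for year, _, mentors in PROJECTS
                      for m in mentors if is_former_student(m, year)}

# Students returning for another GSoC.
projects_per_student = Counter(student for _, student, _ in PROJECTS)
repeat_students = {s for s, n in projects_per_student.items() if n > 1}

# Count of projects co-mentored by former GSoC students.
co_mentored = sum(1 for year, _, mentors in PROJECTS
                  if any(is_former_student(m, year) for m in mentors))

# Most active mentors and the average number of mentors per project.
mentor_counts = Counter(m for _, _, mentors in PROJECTS for m in mentors)
avg_mentors = sum(len(mentors) for _, _, mentors in PROJECTS) / len(PROJECTS)

print(sorted(returned_as_mentor), sorted(repeat_students), co_mentored)
print(mentor_counts.most_common(1), avg_mentors)
```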

A Few Highlights

Google has funded 215 R projects, carried out by 189 students and 202 mentors, in the past 12 years of GSoC. That is a significant number of projects, thanks to Google’s generosity in giving the R Project adequate GSoC slots each year.

The word-cloud and bar chart of top 20 words on the dashboard show that in the past 12 years of Google Summer of Code under R, data analysis, package development/enhancement and biodiversity applications have been the most popular. Modeling, interactive visualization, optimization and performance improvement have also taken top positions within GSoC projects.

Since 2013, the number of mentors each year has been at least double the number of participating students. This results from a policy of the R org admins, who require at least two mentors for each project in order to reduce student failure rates and improve mentor availability throughout the program.

25 Google Summer of Code students (13% of all students) under the R Project have returned as mentors, and they have co-mentored about 72 projects (33% of all projects) in the past 12 years.
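The percentages quoted above follow directly from the totals reported in this section. A quick check, using only figures from the article (the variable names are ours):

```python
# Totals reported in the article.
projects, students, mentors = 215, 189, 202
returning_students = 25
co_mentored_projects = 72

# Share of students who came back as mentors, in whole percent.
share_students = round(100 * returning_students / students)

# Share of projects co-mentored by former students, in whole percent.
share_projects = round(100 * co_mentored_projects / projects)

print(f"{share_students}% of students, {share_projects}% of projects")
```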

Future Directions

The R Project participated in the Google Code-In contest for the first time in 2019, and we will be glad to explore the resulting data and report our findings. More generally, we hope that aggregating and reporting activities around both popular and lesser-known aspects of the R language will bring greater visibility to the hard work of many contributors, highlight opportunities around Google programs, and continue to give the global R community a feel for the popularity of R over the years.

Building the R Community in Southern Africa

By Blog

By Heather Turner, Chair of Forwards, the R Foundation taskforce for underrepresented groups in the R Community

In this post I will give the background to the Forwards Southern Africa 2020 project, for which we are running a crowd-funding campaign until February 5, 2020.

On March 6-7, 2020, Johannesburg will host the fourth satRday to be held in South Africa. satRdays are community-led regional conferences that support collaboration, networking and innovation within the R community. They were initiated by an R Consortium funded project that ran pilot events in Budapest and Cape Town in 2016/2017. The conference series has been expanding around the world since then, with ten events in 2019.

For Joburg satRday 2020, I was invited to be a keynote speaker. As chair of Forwards, the R Foundation taskforce for underrepresented groups, I saw this as an opportunity to create an initiative focused on building the R Community in Southern Africa.

A first step was to offer a workshop on R package development, using the materials developed under the R Consortium project, Forwards Workshops for Women and Girls. This project ran package development workshops for women in New Zealand, Budapest and Chicago. Since there are still some funds left in the grant, we are able to offer some scholarships to women in Africa to attend the Joburg workshop and satRday. Women with visa-free access to South Africa may apply; the deadline for applications is midnight SAST, January 31.

The next step was to look beyond South Africa, to neighbouring countries. The following map shows cities in Africa with R-Ladies groups (purple), R User Groups (blue) or both (blue-grey):

The AfricaR consortium that took off at the start of 2019 has really helped to support the R community across Africa and has led to the founding of several R User Groups, as well as the first satRday in East Africa (Kampala 2019) and the first satRday in West Africa, which will take place in Abidjan on February 1, 2020. In Southern Africa, there are strong R User Groups and R-Ladies groups in both Cape Town and Johannesburg, but the R Community is only just starting to go beyond South Africa, with the establishment of Eswatini useRs last year.

UPDATE: The Abidjan satRday event was a big success! Here’s a photo of the full group. Videos of the talks should be available online soon.

The Forwards Southern Africa Project aims to build on this foundation, by organizing free workshops and meetups in collaboration with local partners in Eswatini, Botswana and Namibia. This project is also supported by the WhyR Foundation and AfricaR. The details of the events are still being finalised, but the planned itinerary is as follows:

Windhoek, Namibia (March 4, 2020, TBC)

In partnership with the Department of Statistics and Population Studies, University of Namibia:

  • Introduction to R for data analysis workshop (1 day)
  • Launch event of the first R User Group in Namibia

Manzini, Eswatini (March 11-12, 2020)

In partnership with the recently established Eswatini useR group. Registration is open for this two-day event, which includes:

  • Introduction to R for data analysis workshop (1 day)
  • Data visualization workshop (1/2 day)
  • Meetup including talk on the R community and resources available for newcomers

Gaborone, Botswana (March 14, 2020)

In partnership with WiMLDS Gaborone and PyData Botswana:

  • Introduction to R workshop (1/2 day)

All these events can be supported via the crowdfunder where further updates will be posted. Updates will also be shared on the Forwards Twitter.

R Consortium Simplifies Membership Structure to Increase Opportunities for Silver Level Members

By Announcement

Two membership levels now available: Platinum and Silver

SAN FRANCISCO, January 29, 2020 – The R Consortium, a Linux Foundation project supporting the R Foundation and R community, today announced a shift in its membership structure, allowing increased opportunities for Silver Level members. The new approach simplifies the membership structure to streamline the R Consortium organization, recruit more members, and attract expertise to technical committees and working groups. It is aimed at organizations that are interested in participating in R Consortium governance and helping steer direction of R language infrastructure development and events participation around the globe.

“We hope that simplifying the R Consortium membership structure will enable even more companies that benefit from the R language to join the R Community and help expand R’s unique contributions to statistical computing and open source data science,” said Joseph Rickert, R Community Ambassador, RStudio, and R Consortium Board Chair.

Full membership details, including benefits and pricing, are available here: https://www.r-consortium.org/about/join

“The R Consortium is strengthening the R community by improving infrastructure and building for long term stability. The more voices involved, the broader the support becomes. We wanted to explicitly reach out to companies and organizations that want to become members and care about the future of R. With this simplified structure, it is easier than ever to join our technical committees and working groups,” said Hadley Wickham, Infrastructure Steering Committee Chair, R Consortium. “The R community continues to grow and expand, and the R Consortium is making sure we accommodate that. We are already pleased with the increase in new membership inquiries. 2020 will be an exciting year for R development.”

How R Consortium membership helps support the R Community:

  • Supports operations of critical infrastructure that sustains the R ecosystem, anticipating challenges and planning contingencies 
  • Funds open-source community-driven projects that widely impact the R Community
  • Provides opportunities to form working relationships that transcend company affiliations
  • Enables better identification and recruitment of data scientists working in R
  • Builds freely available infrastructure, leveraged by citizen science, expanding participation

Uniting Local R Users in Spain – Users Murcia R (UMUR)

By Blog

By Aurora González-Vidal (president) and Antonio Maurandi (vice-president), Users Murcia R (UMUR)

UMUR (Users Murcia R) is an association whose first official act was the organization of the X National Spanish meeting of R users in Murcia (2018), which marked a turning point for this annual meeting. We brought two amazing speakers: François Husson, who joined us in person, and Julia Silge, who participated by video call. Since then, we have been holding meetings every other month (workshops, talks…) with attendance of 35-45 people. We are trying not only to unite local people but also to give them the chance to meet leading figures in the R community and to share in the R spirit. Recently we also had the opportunity to meet Max Kuhn at the XI National meeting of R users, held in Madrid (2019).

Our secret, as a young organization achieving high participation, is that we are made up of at least two small groups, plus individuals, who were each independently “spreading the word of R” in their own environments.

The first informal group is called 00Rteam. We are based in academia and teach a large number of R courses aimed at specific audiences at the university level: PhD students (writing scientific papers with R Markdown, introduction to R and RStudio, data tabulation in R, hypothesis testing in R, multivariate analysis in R), teachers (machine learning with R) and administrative staff (R4U).

Another informal group is made up of mathematicians and economists from the Faculty of Economics and Business who are working on integrating R into their daily teaching. They use R and RStudio to create interactive teaching materials, exploring packages like rmarkdown, shiny, swirl and exams for teaching statistics on the courses offered by the faculty.

Apart from that, our board of directors includes professionals who are actively spreading R in engineering businesses and banks and who have personal blogs and have authored manuals about R. This mix of interdisciplinary and enthusiastic people being in charge of an association has been able to attract a pool of interested people that is bringing us a lot of joy and knowledge interchange in Murcia. 

An important characteristic of UMUR is that the number of women on the board is higher than the number of men. We are sensitive to gender equality, and we want to be an example of parity in the technology space showing that we are diverse in many ways. We think this fact is the key to success. 

Signed: Aurora González-Vidal (president) and Antonio Maurandi (vice-president)

XI Conference of R Users (Madrid, Spain, Nov 14-16) Welcomes Over 200 Attendees

By Blog, Events

Thank you to Carlos Ortega, Principal Data Scientist, Teradata, for providing this summary and pictures from the conference

The XI Conference of R Users (XI Jornadas de Usuarios de R), held November 14 – 16 in Madrid, Spain, was organized by the Asociación Comunidad R Hispano. The ambitious program and the invited international speakers drew a large turnout of more than 200 attendees. The conference was split across two venues, Repsol (a Spanish oil and gas company) and UNED (the Spanish distance learning university), highlighting the university-business combination that has been one of the key factors in the success of the conference.

On Thursday, November 14, the opening ceremony was held at the Repsol Campus auditorium and attended by Emilio López Cano (president of the Asociación Comunidad R Hispano), Julio Gonzalo (deputy vice chancellor for research at UNED), Enrique Dameno (Director of Digitalization and Integrated Customer Management of Repsol), and Teresa García (Repsol).

Max Kuhn (RStudio) gave a lecture on “Modeling in the Tidyverse.” Afterwards, the round table “R in business” covered the crucial role of data scientists in solving problems in diverse areas. Raúl Vaquerizo (Pont Group), Noelia Ruiz (Mutua Madrileña), Jorge Ayuso (Telefónica España), Enrique Lasso (Repsol) and Carlos Ortega (Teradata) participated in the round table.

On the 15th and 16th, at the School of Education of the UNED, an extensive and vibrant program took place, with workshops, communications sessions, “lightning sessions,” poster sessions, round tables and invited talks. Bernd Bischl (University of Munich) gave a lecture on mlr3, Jo-Fai Chow (H2O.ai) presented “Automatic and explainable machine learning in R,” and Max Kuhn gave a workshop on “Designing R modeling packages.”

Following the multidisciplinary philosophy of using R to handle any kind of data, the communications sessions dealt with applications in genetics, data analysis, model and project management, society and culture, surveys and education, medicine and veterinary science, and economics and business. In addition to these themed sessions, the “lightning sessions” dealt with many different topics.

A round table on Data Journalism closed the conference. It was moderated by Leonardo Hansa (R-Hispano), with Virginia Peón (Indigitall), Alba Martín (Newtral), Antonio Delgado (Datadista) and Carmen Aguilar (Sky News) participating. The discussion highlighted the importance of treating data in an appropriate and honest way, so that the information that reaches the public is truthful.

In the closing ceremony, the prize for the Best Young Work of the Conference was announced, which went to Rocío Aznar Gimeno (Technological Institute of Aragon) for the work “Multilevel mixed models: An application of the lme4 library to estimate the fetal weight percentile in twin pregnancies.”

Sessions Available

Many of the sessions were streamed and recorded. They are accessible through the UNED Channel (Canal UNED): https://canal.uned.es/series/5dc3f7d05578f252041fc22d

R Consortium Infrastructure Steering Committee Chair Wins 2019 COPSS Presidents’ Award

By Announcement, Blog

Congratulations to our very own Hadley Wickham, Infrastructure Steering Committee Chairperson, for winning the COPSS Presidents’ Award, often called the “Nobel Prize of Statistics.” The award is given to a person under the age of 41, in recognition of outstanding contributions to the profession of statistics. According to Wikipedia, the COPSS Presidents’ Award and the International Prize in Statistics are considered the two highest awards in statistics.

The award citation recognized Wickham’s “influential work in statistical computing, visualization, graphics, and data analysis” including “making statistical thinking and computing accessible to a large audience.”

In previous years, the award has primarily recognized theoretical contributions to statistics. This year is the first time it has been awarded for practical application.

Hadley is Chief Scientist at RStudio, a Platinum member of the R Consortium, and Adjunct Professor at Stanford University and the University of Auckland. Skill with statistics runs in the family: his sister is an Assistant Professor of Statistics at Oregon State University.

Hadley builds tools – both computational and cognitive – to make data science easier, faster, and more fun. His work includes packages for data science (the pioneering suite of R tools known as the “Tidyverse,” including ggplot2, dplyr, tidyr, purrr, and readr) and for principled software development (roxygen2, testthat, devtools). He is also a writer, educator, and speaker promoting the use of R for data science. Learn more on his website, http://hadley.nz.

Congratulations, Hadley!

Data-Driven Tracking and Discovery of R Consortium Activities

By Blog

by Benaiah Ubah

R is a fast-growing language for statistical computing and graphics backed by a powerfully inclusive community of users and developers. The R community received a significant boost when several enterprises came together to establish the R Consortium in 2015. Since then, the R Consortium has proven its purpose by operating transparently and in an unbiased manner, supporting the R Foundation, infrastructure that broadly affects the R community, tools that enhance the R software, R user groups, events and diversity on a global scale. R Consortium’s top-level projects (R-Hub, R-Ladies, the RUGS program, Events sponsorship, and the R Community Diversity and Inclusion program), working groups and other ISC-funded projects highlight the significance of R Consortium’s involvement as a major supporter of several critical developments around R in recent times.

To further enhance transparency, measure impact and achieve even greater community inclusiveness, the R Consortium funded a new data-driven initiative in Fall 2018 to provide a way for the R community to discover and track its activities over the years. This infrastructure is dedicated to curating and rendering R Consortium activities via dashboards using open-source technologies. All data and code are available at this GitHub repository, which is primarily maintained by me.

For a start, the ISC approved the development of dashboards that highlight R Consortium’s accomplishments, with a focus on ISC Funded Projects, the RUGS program and the Events/Marketing program. I am delighted to report that this initial scope has been covered and the corresponding milestones delivered. The next iteration of development will include more aspects of R Consortium’s activities that have a broad impact on the larger R community. The following sections of this article present the reasons why a data-driven initiative is useful for tracking R Consortium’s activities, the deliverables for this project, its benefits and future directions.

Why a data-driven initiative to track R Consortium activities?

1. In the past 5 years, R Consortium has supported many R initiatives that encompass user groups, events, diversity, technical infrastructure, documentation, teaching materials, working groups, etc. But how can the impact of these initiatives be measured in numbers over the years? How can the global distribution of activities like the user-group and event support programs be ascertained?

2. ISC funded projects (both completed and ongoing) are usually curated on a single web page.  This initiative provides a way for searching for these projects by year, grant-cycle, status, primary investigator, etc. 

3. Before embarking on this project, there was no way of ascertaining the distribution of funding across work-products.  A data-driven infrastructure will help those without experience applying for ISC grants, by giving them an overview of work-products and cash-grant ranges that have received more funding over time.

4. R Consortium’s decision makers may find a data-driven initiative helpful in planning future programs and packages.

5. Prospective members contemplating joining the R Consortium can easily find and understand its past accomplishments in a broad, transparent, insightful, and aggregated manner.

6. Finally, comparing R Consortium’s mission statement with its accomplishments from a data-driven perspective is something that the R Foundation, the global R community, and present and future members of the R Consortium would like to track and provide feedback on over time, for the long-term growth and stability of the R ecosystem.
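Points 2 and 3 above describe two basic operations on project data: filtering by attributes and totaling funding per work-product. As a rough illustration only, assuming a flat list of records (every title, field name and amount below is made up and does not reflect real ISC grants), they might look like this:

```python
from collections import defaultdict

# Hypothetical records: every title, field name and amount is made up.
PROJECTS = [
    {"title": "Project A", "year": 2017, "cycle": "Spring",
     "status": "completed", "work_product": "Package", "grant_usd": 10000},
    {"title": "Project B", "year": 2018, "cycle": "Fall",
     "status": "ongoing", "work_product": "Infrastructure", "grant_usd": 25000},
    {"title": "Project C", "year": 2018, "cycle": "Spring",
     "status": "completed", "work_product": "Package", "grant_usd": 8000},
]


def find_projects(projects, **criteria):
    """Return the projects whose fields equal every given criterion."""
    return [p for p in projects
            if all(p.get(field) == value for field, value in criteria.items())]


def funding_by_work_product(projects):
    """Total the grant amounts for each work-product category."""
    totals = defaultdict(int)
    for p in projects:
        totals[p["work_product"]] += p["grant_usd"]
    return dict(totals)


completed_2018 = find_projects(PROJECTS, year=2018, status="completed")
totals = funding_by_work_product(PROJECTS)
print([p["title"] for p in completed_2018])
print(totals)
```

Passing the search criteria as keyword arguments keeps one function usable for any combination of year, grant cycle, status or investigator without a separate function per field.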

Project Deliverables

We now present to the R community a suite of dashboard pages that render the corresponding R Consortium activities in a data-driven manner:

  1. ISC funded projects dashboard
  2. R User Group Support program dashboard
  3. Events / Marketing program dashboard
  4. A landing dashboard page that summarizes details from other dashboard pages for enhanced user experience.
  5. A GitHub repository to find all code and data for this infrastructure.

Benefits

  • ISC projects dashboard: Easily find ISC projects, with enough information to contact project owners if you are thinking of contributing. Find the most popular work-products and cash-grant ranges if you have no experience applying for grants.
  • RUGS program dashboard: Understand the global distribution of funded user groups and their funding-level distribution. Find information about these groups and how to get in touch with those within your reach.
  • Events / Marketing dashboard: Understand the global distribution of sponsored events.
  • Landing dashboard: Find aggregated summaries of ISC projects, the RUGS program, the Events/Marketing program and the R-Ladies project.

Future Directions

It would be interesting to explore more of R Consortium activities like working groups, and ISC projects that have observable global impact on a running basis.

Join R Consortium

If you are an enterprise that benefits from using the R environment, please consider joining R Consortium to make the R ecosystem a better one.

Acknowledgments

I appreciate the contributions that came from John Mertic, Hadley Wickham and Joseph Rickert especially at the initial phases of this idea.

Get Funded by the R Consortium – Call for Proposals Open Now!

By Blog

Strengthen the R community with Your Project

The R Consortium is committed to supporting the R community by funding projects that create important infrastructure and fortify long term stability for the R Community. The R Consortium’s Infrastructure Steering Committee (ISC) has developed a grant program that looks to help the broader R community.

The Call for Proposals opens today, September 13, 2019, and runs for a full month, through October 14, 2019.

This is the fourth year of funding, and over $1,000,000 has been given out in sponsorships and grants.

We encourage you to apply, even if you have no experience applying for grants.

Apply now!

In this round, the ISC is looking for projects that:

  • Are likely to have a broad impact on the R community.
  • Have a focused scope. Simple is better than over-ambitious. Larger projects can often be broken up into smaller steps.

The process for submitting a proposal has been updated annually to ensure that it is as smooth as possible. Full details on proposal requirements, examples of previous projects, suggestions for what to avoid, and more, are included here.

If you have any questions about proposals or the submission process, please write to proposal@r-consortium.org
