May 29

Announcing the R Consortium ISC Funded Project grant recipients for Spring 2018

By John Mertic Announcement, Blog, News

The R Consortium supports the R Community through investments in sustainable infrastructure, community programs and collaborative projects. Through the Funded Project Program, now in its fourth year, the R Consortium has invested more than $650,000 USD in over 30 projects that impact the more than 2 million R users worldwide.

We are pleased to announce the Spring 2018 grant recipients. We will provide updates on these projects throughout the year. Congratulations to all grant recipients. We look forward to our session at useR!2018 this July, where many of our funded projects will showcase their work and share tips for leveraging the grant program to drive open collaboration.

Maintaining DBI

Grantee: Kirill Müller

DBI, R’s database interface, is a set of methods declared in the DBI R package. Communication with the database is implemented by DBI backends, packages that import DBI and implement its methods. A common interface is helpful for both users and backend implementers.

The Maintaining DBI Project, which follows up on two previous projects supported by the R Consortium, will provide ongoing maintenance and support for DBI, the DBItest test suite, and the three backends to open-source databases (RSQLite, RMariaDB and RPostgres).

Ongoing infrastructural development for R on Windows and MacOS

Grantee: Jeroen Ooms

The majority of R users rely on precompiled installers and binary packages for Windows and MacOS that are made available through CRAN. This project seeks to improve and maintain tools for providing such binaries. On Windows we will upgrade the Rtools compiler toolchain, and provide up-to-date Windows builds for the many external C/C++ libraries used by CRAN packages. For MacOS we will expand the R-Hub homebrew-cran with formulas that are needed by CRAN packages but not available from upstream homebrew-core. Eventually, we want to lay the foundation for a reproducible build system that is low maintenance, automated as much as possible, and which could be used by CRAN and other R package repositories.

Developing Tools and Templates for Teaching Materials

Grantee: François Michonneau

The first-class implementation of literate programming in R is one of the reasons for its success. While the seamless integration of code and text made possible by Sweave, knitr, and R Markdown was designed for writing reproducible reports and documentation, it has also enabled the creation of teaching materials that combine text, code examples, exercises and solutions. However, while people creating lessons in R Markdown are familiar with R, they often do not have a background in education or UX design. Therefore, they must not only assemble curriculum, but also find a way to present the content effectively and accessibly to both learners and instructors. As the model of open source development is being adapted to the creation of open educational resources, the difficulty of sharing materials due to a lack of consistency in their construction hinders the collaborative development of these resources.

This project will develop an R package that will facilitate the development of consistent teaching resources. It will encourage the use of tools and lesson structure that support and improve learning. By providing the technical framework for developing quality teaching materials, we seek to encourage collaborative lesson development by letting authors focus on the content rather than the formatting, while providing a more consistent experience for the learners.

PSI application for collaboration to create online R package validation repository

Grantee: Lyn Taylor (on behalf of PSI AIMS SIG)

The documentation available for R packages currently varies widely. The Statisticians in the Pharmaceutical Industry (PSI) Application and Implementation of Methodologies in Statistics (AIMS) Special Interest Group (SIG) will collaborate with the R Consortium and representatives from pharmaceutical companies to set up an online repository/web portal where regulatory-standard validation for R packages can be submitted and stored for free use. Companies (or individual R users) would still be responsible for making their own assessment of whether the validation is suitable for their own use; however, the online repository would serve as a portal for sharing existing regulatory-standard validation documentation.

A unified platform for missing values methods and workflows

Grantees: Julie Josse and Nicholas Tierney

The objective is to create a reference platform on the theme of missing data management and to bring contributors together. The platform will list the existing packages, the available literature, and the tutorials that show how to analyze data with missing values. New work on the subject can be easily integrated, and we will create examples of analysis workflows with missing data. Anyone who would like to contribute to this exciting project can contact us.

histoRicalg: Preserving and Transferring Algorithmic Knowledge

Grantee: John C Nash

Many of the algorithms making up the numerical building-blocks of R were developed several decades ago, particularly in Fortran. Some were translated into C for use by R. Only a modest proportion of R users today are fluent in these languages, and many original authors are no longer active. Yet some of these codes may have bugs or need adjustment for new system capabilities. The histoRicalg project aims to document and test such codes that are still part of R, possibly creating all-R reference codes, hopefully by teaming older and younger workers so knowledge can be shared for the future. Our initial task is to establish a Working Group on Algorithms Used in R and add material to a website/wiki.

Interested workers are invited to contact John Nash.

Proposal to Create an R Consortium Working Group Focused on US Census Data

Grantee: Ari Lamstein

The Proposal to Create an R Consortium Working Group Focused on US Census Data aims to make life easier for R programmers who work with data from the US Census Bureau. It will create a working group where R users working with census data can cooperate under the guidance of the Census Bureau. Additionally, it will publish a guide to working with Census data in R that aims to help R programmers a) select packages that meet their needs and b) navigate the various data sets that the Census Bureau publishes.

Apr 19

Wanted: Your input on the next generation of R-Hub

By josephrickert Announcement, Blog, R Consortium Project

R-Hub, which was originally conceived as a useful tool for R package developers to build and test R packages on a variety of platforms, was the first project funded by the R Consortium. The initial version was released in June 2016. Now that the capabilities of R-Hub have progressed well beyond the proof of concept stage, the R Consortium is looking for ideas from the R community on how we can make it even more useful for R users.

We would like to know how you think we could improve existing functionality and what new features you would like to see. So far, we have come up with the following list of future goals for R-Hub. We welcome comments and suggestions:

Enable organizations to deploy repositories and build infrastructure locally for use in controlled corporate environments.
Provide a system to manage source code, builds, and binary packages in a repository that offers confidence and trust to R users.
Enable end-users to use packages with confidence by providing tools to assess code pedigree, license, quality, security, and package maintenance for individual packages.
Encourage and enable package developers to provide metadata for their packages to help end users discover packages.
Provide package authors and maintainers a broad testing matrix that works on multiple architectures, operating systems, and R runtime engines.
Provide package developers with feedback required to assess and ensure broad compatibility for their packages.

We would very much appreciate comments on this vision for future development along with your assessment of the current system, including your answers to such questions as:

What value does R-Hub provide you today?
What does R-Hub not do well?
What other aspects of package development should R-Hub add?
How could R-Hub best serve the corporate package development, deployment, and management process?
Is there anything that CRAN isn’t providing that you would like to have?

Please send your comments to the following email address: isc@r-consortium.com

Note that you may try R-Hub here.

Apr 16

What’s new with R Consortium funded projects in Q1 2018

By John Mertic Blog, R Consortium Project

In an effort to provide greater transparency with respect to R Consortium activities, the ISC is initiating a process to provide quarterly updates for all R Consortium funded projects. The following is our update for Q1 2018.

Quantities for R

The r-quantities project has reached the first milestone with the design and implementation of an initial working prototype, which can be downloaded and tested from GitHub. Further details about the integration process that was necessary for the units and errors packages, as well as the next steps, were published in r-spatial.

stars: Scalable, spatiotemporal tidy arrays for R

The last full update was in November 2017. Recent activity includes work on merging datasets. Check out the project progress on GitHub.

Interactive data manipulation in mapview

The project is waiting for Barret Schloerke and the RStudio team to finish updating the leaflet package to leafletjs 1.3.1, which will enable major updates to mapedit. Once this is done, the project will update mapedit accordingly and add new features in response to the leaflet update.

Refactoring and updating the SWIG R module

Planning documents are available on our website.

R User Group Support program

So far this year, the R User Group Support program has disbursed nearly $27,000 in grants to 60 user groups and 9 small conferences. The option to participate in the R Consortium’s meetup.com PRO account has proved to be a very popular benefit. 40 groups have elected to participate so far. You can keep up with the activities of these groups on our Meetup page.

The RUGs program will run through September 30th. Look here for details on how to participate.

Establishing DBI

The “Establishing DBI” project is about to be completed. Schema support in DBI is perhaps the most exciting news. Almost all packages have been updated on CRAN; a few final technicalities need to be resolved. Expect a blog post soon on the new project page.

Joint profiling of native and R code

Unfortunately, pprof (and therefore also gprofiler) were not accepted by CRAN due to missing Go binaries on the build machines. Nevertheless, there has been some adoption by the community: for instance, one user was able to use joint profiling to understand a performance problem in the tidyselect package. Work on the project will resume soon. This will include adding support for OS X and adapting the packages so that they will be accepted by CRAN.

R Documentation Task Force

This project still needs help with implementing methods. To join, send an email to Andrew dot Redd at hsc dot utah dot edu, expressing your interests, skills or expertise as they relate to R documentation. Also email if you have ideas or concerns but do not wish to play an active role.

Conference Management System for R Consortium Supported Conferences

The project has completed a thorough evaluation of different open source solutions for managing R conferences, and is now compiling a report to facilitate next steps.

SatRdays

The second SatRday conference was recently held in Cape Town. New conferences are being scheduled.

R-Ladies

R-Ladies expanded by 20 new groups (7 in the US, 4 in Latin America, 4 in Europe, 4 in Africa and 1 in Asia) in the first quarter of 2018, increasing to more than 90 R-Ladies chapters worldwide.

Apr 12

Package Licensing: Would the R Community like some help? Feedback from the trenches

By John Mertic Blog, R Consortium Project

Editor’s Note: This post comes from Mark Hornick, who leads the Code Coverage WG and serves on the Board of Directors.

In the Fall of 2017, the R Consortium surveyed the R Community to understand opportunities, concerns, and issues facing the community. Taking into account that feedback, the R Consortium recently surveyed package authors and maintainers on a number of topics surrounding R package licensing. Questions revolved around motivations for choice of license, comfort level in understanding license meaning and implications, importance of corporate adoption of R, and whether guidance on licensing from the R Consortium would be valuable.

While a significant number of people in the R community report that they understand and intentionally choose the license(s) they apply to their package software, a much larger group is unclear about which license to choose and what the implications of that choice are. These implications affect not only the individual package, but the R Community and corporate, government, and academic users of those packages as well. Of roughly 7400 invitations to complete the survey, the R Consortium received more than 1100 responses – a response rate over 14%.
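The response rate quoted above follows directly from the two counts. The short Python sketch below reproduces the arithmetic. The function name `response_rate` is hypothetical, and because the post gives the counts only as "roughly 7400" and "more than 1100", the result is an approximation rather than an exact survey statistic.

```python
def response_rate(responses: int, invitations: int) -> float:
    """Return the fraction of invitations that produced a response."""
    if invitations <= 0:
        raise ValueError("invitations must be positive")
    return responses / invitations


# Approximate counts taken from the post.
rate = response_rate(responses=1100, invitations=7400)
print(f"{rate:.1%}")  # prints "14.9%"
```

With these rounded inputs the rate comes out just under 15%, consistent with the "over 14%" stated in the post.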

In this blog post, we summarize that feedback and offer next steps that the R Consortium and R Community may take based on this feedback.

Who responded to the survey

Of respondents, 42% are relatively new to R package development with 3 or fewer years of experience, 31% have 4-6 years, and 27% have more than 7 years of experience. As the following table shows, the majority of package authors have been working with R for less than 6 years and writing up to 5 packages.

The largest subgroup of responders (44%) have produced one package over their career. However, 39% of responders have not pushed a package to CRAN over the past year.

The most popular license used among respondents is ‘GPL-3’ at 35% with ‘GPL-2 | GPL-3’ a close second at 34%, ‘GPL-2’ next at 24%, and ‘MIT’ at 21%. However, there is a mix of other licenses cited, including LGPL, BSD, Apache2, and Creative Commons, among others.

What do I want others to be able to do with my package?

When it comes to open source software, there are many ways to think about how software could be used. For example, you may want everyone to be able to freely use your software by its API, but have concerns about what happens if the underlying code is modified – derivative works. On the other hand, you may want to impose licensing requirements on the software that uses your software as well, e.g., software that uses my package must be licensed in the same way as my package. The license choice can significantly affect how and whether a given package can be used in corporate, academic, or government environments.

From the survey, 60% of respondents want other software developers to be able to use their package(s) without imposing license requirements on the software that uses their package (via API), with only 15% disagreeing.

The majority of respondents were neutral as to whether they wanted to ensure that software using their package(s) must apply the same license that they chose, with 29% agreeing and 19% disagreeing.

As expected, respondents want to ensure that derivative works of their package(s) remain open source, with 74% agreeing. However, only 25% agree that derivative works should require the same license as the package used.

How do you choose a license for your package?

In the survey, we asked which factors contribute to the choice of package license. Sixteen percent of respondents indicated license choice defaulted to the license of dependent packages, whether used exclusively through their API or if they borrowed code or header definitions. A sizeable 65% indicate that it is a conscious choice based on their understanding of open source and other license terms. But this is tempered by responses described in the next section regarding comfort level with understanding open source licenses.

The open comments section for this question revealed more details, e.g., some respondents consult websites, blogs, and books for license recommendations, or get advice from package reviewers. Some respondents admit they haven’t thought deeply about the choice of license and don’t understand the differences between licenses since the choices and legalese can be overwhelming. Some use what other respected package authors have used (without necessarily understanding why a given license was chosen for such a package) or as determined by corporate or government dictates or requirements. Yet other respondents indicated making an arbitrary or random choice since R package submission requires that some license must be chosen.

The open comments also highlighted some potential misconceptions, such as the belief that if a package author chooses GPL-2 for their package, they are unable to change that to a more permissive license later. The ability to change a license depends on multiple factors, e.g., licenses of dependent packages or lifted code, whether all authors give their consent, etc. Some respondents state they want licensing that enables more users of their code rather than fewer. Others see GPL as a way to ensure commercial usage of their packages occurs fairly. Some respondents choose BSD as it provides the most freedom to package users.

Open Source License knowledge

For the R Consortium to understand whether resources should be applied to the problem of licensing, we asked package developers about the level of understanding of open source licenses. While 12% outright stated they do not feel comfortable interpreting or applying open source licenses, 62% find license details and options confusing – even if they understand the basic premise of open source licenses.

Only 23% felt confident in choosing the right open source license(s) for their packages, while about 1% claim to have access to Legal Counsel to guide their choice of open source licenses. Another 1% claim to have sufficient legal background to choose the appropriate license(s) for their packages.

While licensing is important when trying to use software in corporate settings, only 24% of respondents consider the license of an R package important in determining whether or not they use it – 35% are neutral and 40% don’t think it’s important.

A majority (56%) of respondents believe that corporate adoption of R technology (engine and packages) is important for the R Community – 36% are neutral while 8% feel corporate adoption is not important. Consistent with this, 56% of respondents feel the R ecosystem should make it easy for corporate use of R – 37% are neutral and 6% disagree.

Tools and Guidance

As open source communities and technology continue to evolve, there are more tools available to assist with license choice. For example, code scanning tools exist in other open source communities to identify potential licensing issues. While following the advice of such tools is optional, most if not all developers want to “do the right thing” with respect to licensing. Testament to this is that over 71% of respondents indicated they would welcome the availability of a license scanning tool to flag package license issues – only 3% disagreed.

With the objective of enabling package developers to make an informed choice of license, respondents were asked if they would like the R Consortium to provide guidance on open source license choices and implications. Over 89% indicated they would. One respondent put it best: “I want whatever is best for making sure the CRAN community thrives in the long-term.” This is the intent of the R Consortium as well.

The R Consortium thanks the respondents to this survey for taking the time to share their experience, concerns, and needs. As a next step, the R Consortium will work with the R Community to provide best practices for good “license hygiene.” If you would like to be part of this activity, please reach out to the R Consortium by responding to this post.

Mar 27

R Consortium welcomes R-Ladies as a top level project

By John Mertic Announcement, Blog, News, R Consortium Project

In 2016, R-Ladies started their effort for a global expansion, with help from the R Consortium. Back then there were only 4 active chapters (San Francisco, Taipei, Twin Cities and London) and the goal was to expand to 5-10 cities within the next year. The enthusiasm within the R community for local R-Ladies chapters far exceeded any possible expectations! As of March 2018, the organization has over 90 chapters and almost 19,000 members.

There are R-Ladies chapters in 45 countries around the globe, with many chapters hosting monthly events.

With this fast growth, it became apparent to the R Consortium ISC that this project needs long term investment for success. The diversified voice of R-Ladies, speaking not only as a group representing gender minorities in tech but also a group attracting new R users, aligned with the R Consortium’s defined Code of Conduct and its desires for building a more diverse and inclusive R community.

R Consortium ISC is pleased to announce and welcome R-Ladies as a top level project. R-Ladies has shown a big commitment within the R community and becoming a top level project will provide them a longer term budget cycle (3 years instead of 1 year) to support their community. R-Ladies will also have a voting seat on the ISC (represented by Gabriela de Queiroz).

We invite all in the R community to congratulate R-Ladies on this milestone, and look forward to ensuring they have the infrastructure and funding to bring more diversity to the R community.

Feb 22

Announcing the second round of ISC Funded Projects for 2017

By John Mertic Announcement, Blog, News, R Consortium Project

The R Consortium ISC is pleased to announce that the projects listed below were funded under the 2017 edition of the ISC Funded Projects program. This program, which provides financial support for projects that enhance the infrastructure of the R ecosystem or which benefit large segments of the R Community, has awarded $500,000 USD in grants to date. The Spring 2018 call for proposals is now open and will continue to accept proposals until midnight PST on April 1, 2018. Learn more about the program and how to apply for funding for your project.

Quantities for R

Proposed by Iñaki Ucar

The ‘units’ package has become the reference for quantity calculus in R, with a wide and welcoming response from the R community. Along the same lines, the ‘errors’ package integrates and automatises error propagation and printing for R vectors. A significant fraction of R users, both practitioners and researchers, use R to analyse measurements, and would benefit from a joint processing of quantity values with errors.

This project not only aims at orchestrating units and errors in a new data type, but will also extend the existing frameworks (compatibility with base R as well as other frameworks such as the tidyverse) and standardise how to import/export data with units and errors.
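To make the idea of a value that carries both a unit and an error concrete, here is a minimal sketch in Python. It is not the API of the units, errors, or quantities packages. The `Measurement` class and its fields are hypothetical, and the propagation rules assume independent errors combined to first order, which is a common textbook simplification.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical quantity: a value, its standard error, and a unit label."""
    value: float
    error: float
    unit: str

    def __add__(self, other: "Measurement") -> "Measurement":
        # Addition only makes sense for identical units in this sketch
        # (no unit conversion is attempted).
        if self.unit != other.unit:
            raise ValueError(f"cannot add {self.unit} and {other.unit}")
        # Independent absolute errors add in quadrature.
        return Measurement(self.value + other.value,
                           math.hypot(self.error, other.error),
                           self.unit)

    def __mul__(self, other: "Measurement") -> "Measurement":
        # First-order propagation for a product z = x * y:
        # sigma_z = sqrt((y * sigma_x)^2 + (x * sigma_y)^2)
        error = math.hypot(other.value * self.error,
                           self.value * other.error)
        return Measurement(self.value * other.value,
                           error,
                           f"{self.unit}*{other.unit}")


length = Measurement(3.0, 0.3, "m")
width = Measurement(4.0, 0.4, "m")
print(length + width)   # perimeter-style sum, still in metres
print(length * width)   # area, with a combined unit label
```

Keeping the unit and the error on the same object is what lets arithmetic update both at once, which is the kind of joint processing the project description refers to.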

Refactoring and updating the SWIG R module

Proposed by Richard Beare

The Simplified Wrapper and Interface Generator (SWIG) is a tool for automatically generating interface code between interpreters, including R, and a C or C++ library. The R module needs to be updated to support modern developments in R and the rest of SWIG. This project aims to make the R module conform to the recommended SWIG standards and thus ensure that there is support for R in the future. We hope that this project will be the first step in allowing SWIG-generated R code to use reference classes.

Future Minimal API: Specification with Backend Conformance Test Suite

Proposed by Henrik Bengtsson

The objective of the Future Framework implemented in the future package is to simplify how parallel and distributed processing is conducted in R. This project aims to provide a formal Future API specification and provide a test framework for validating the conformance of existing (e.g. future.batchtools and future.callr) and to-come third-party parallel backends to the Future framework.

An Earth data processing backend for testing and evaluating stars

Proposed by Edzer Pebesma

The stars project enables the processing of Earth imagery data that is held on servers, without the need to download it to a local hard drive. This project will (i) create software to run a back-end, (ii) develop scripts and tutorials that explain how such a data server and processing backend can be set up, and (iii) create an instance of such a backend in the AWS cloud that can be used for testing and evaluation purposes.

Jan 31

R Consortium Call For Proposals: February 2018

By josephrickert Announcement, Blog, News, R Consortium Project

by Joseph Rickert

The first ISC Call for Proposals for 2018 is now open. We are looking for ambitious projects that will contribute to the infrastructure of the R ecosystem and benefit large sections of the R community. However, we are not likely to fund proposals that ask for large initial cash grants. The ISC tends to be conservative with initial grants, preferring projects structured in such a way that significant initial milestones can be achieved with modest amounts of cash.

As with any proposed project, the more detailed and credible the project plan, and the better the track record of the project team, the higher the likelihood of receiving funding. Please be sure that your proposal includes measurable objectives, intermediate milestones, a list of all team members who will be contributing work and a detailed accounting of how the grant money will be spent.

But, most importantly – don’t let this talk of large projects dampen your enthusiasm! We are looking for projects with impact, regardless of their size. With this call for proposals, we are hoping to stimulate creativity and help turn good ideas into tangible benefits. Look around your corner of the R Community: what needs doing, and how can the R Consortium help?

Please do not submit proposals to sponsor conferences, workshops or meetups. These requests should be sent directly to the R Consortium’s R User Group and Small Conference Support Program.

To submit a proposal for ISC funding, read the Call for Proposals page and submit a self-contained pdf using the online form. You should receive confirmation within 24 hours.

The deadline for submitting a proposal is midnight PST, Sunday April 1, 2018.

Jan 31

The 2018 R Consortium R User Group Support Program is Underway.

By josephrickert Announcement, Blog, Events, News, R Consortium Project

In just one year, the R Consortium, through the R User Group Support program, sponsored 76 R user groups and 3 small conferences with cash grants totaling just under $30,000. This program aligns with the R Consortium mission of fostering the continued growth of the R community and the data science ecosystem, and has already helped bring more people to using R and contributing to the community.

Coming off a successful 2017, we are pleased to announce the opening of the 2018 program today. While the structure of the 2018 program is similar to last year’s program with the multiple levels of support, we have enhanced the program based on feedback from last year’s funded user groups.

Complimentary Meetup.com Pro Account

After a year of supporting user groups, we’ve found that the primary cost for each group is having a page on meetup.com or their own website (though the majority prefer the meetup.com platform). This leaves fewer funds available for things like meetup space, food, or even swag, and thus puts more of a burden on the group leaders to attract people to the group.

This year we’ve leveraged our relationship with the Linux Foundation, and will now provide each user group a complimentary meetup.com Pro account. This not only removes one cost concern for group leaders, but will also better enable us to promote user groups through the many features the platform provides for groups. For all the details of the program, eligibility requirements for the three levels of user group grants, the schedule of grants and the details of signing up for the meetup.com Pro account, please see the R Consortium’s R User Group Support Program webpage.

Small Conference Support

We’ve also seen an increase in the number of smaller, regionally focused R conferences happening around the world. Grassroots events like these are critical for sustainability in the R community, but need financial support and community awareness to be successful. Several reached out last year, and we provided funding from excess funds in the program, with great results.

Because these events align perfectly with the mission of the R User Group Support program, we’re formally expanding it this year to provide cash grants in the $500 to $1,000 range to continue to encourage small, R-focused conferences and meetings organized by non-profit or volunteer groups around the world. You can find out more about this new piece of the program on the R Consortium’s R User Group Support Program webpage.

Length of Program

R Consortium will begin taking applications for both R User Group Support and Small Conference Support today. Applications will be accepted through September 30, 2018.

Apply to the 2018 RC RUGS program by filling out this form. You can email us at rugs@r-consortium.org with any questions around the program.

Jan 17

Recap of rOpenSci’s ozunconf – October 2017 in Melbourne

By John Mertic Blog, Events

The R Consortium was happy to be a sponsor of rOpenSci’s ozunconf last October in Melbourne. You can read about the “unconference” on rOpenSci’s blog and follow some of the projects begun at the event here.

Through the RUGS program, R Consortium was honored to be a sponsor for this event. If you have a smaller event you would like support for, stay tuned for the official program announcement in early 2018.

Dec 22

2018 R Consortium Silver member representatives for Board and ISC

By John Mertic Announcement, Blog, News, R Consortium Project

Per the R Consortium ByLaws and ISC Charter, the Silver Member class is entitled to elect individuals representative of the Silver Member class for a term starting January 1, 2018 through December 31, 2018 as follows:

1 representative to the ISC
1 Silver Member Board Director per every 7 Silver Members, subject to provisions 4.2 and 4.3(d) of the R Consortium ByLaws. This means the Silver Member class can elect up to 2 Board Directors representing the class.

These elections ran during the month of November 2017, with 3 nominees for Silver Member Board Director and 3 nominees for the Silver Member ISC representative.

I am pleased to announce those elected by the Silver member class to serve on the Board of Directors and ISC effective 1/1/2018 through 12/31/2018.

Silver Member ISC representative

Dirk Eddelbuettel – Ketchum Trading

Silver Member Board Directors

Mark Hornick – Oracle
Chester Ismay – DataCamp

Please join me in congratulating each of the elected representatives.

We would also like to share a big thank you to the outgoing Silver Member Board Director Richard Pugh of Mango Solutions. His guidance and leadership within the R Consortium have made a huge impact on its current success.