All Posts By

John Mertic

Package Licensing: Would the R Community like some help? Feedback from the trenches

By | Blog, R Consortium Project

Editor’s Note: This post comes from Mark Hornick, who leads the Code Coverage WG and serves on the Board of Directors

In the Fall of 2017, the R Consortium surveyed the R Community to understand opportunities, concerns, and issues facing the community. Taking into account that feedback, the R Consortium recently surveyed package authors and maintainers on a number of topics surrounding R package licensing. Questions revolved around motivations for choice of license, comfort level in understanding license meaning and implications, importance of corporate adoption of R, and whether guidance on licensing from the R Consortium would be valuable.

While there are a significant number of people in the R community who respond they understand and intentionally choose the license(s) they apply to their package software, a much larger group are unclear about which license to choose and what the implications of that choice are. These implications affect not only the individual package, but the R Community and corporate, government, and academic users of those packages as well.  Of roughly 7400 invitations to complete the survey, the R Consortium received more than 1100 responses – a response rate over 14%.

In this blog post, we summarize that feedback and offer next steps that the R Consortium and R Community may take based on this feedback.

Who responded to the survey

Of respondents, 42% are relatively new to R package development with 3 or fewer years of experience, 31% have 4-6 years, and 27% have more than 7 years of experience. As the following table shows, the majority of package authors have been working with R for less than 6 years and writing up to 5 packages.

The largest subgroup of responders (44%) have produced one package over their career. However, 39% of responders have not pushed a package to CRAN over the past year.

The most popular license used among respondents is ‘GPL-3’ at 35% with ‘GPL-2 | GPL-3’ a close second at 34%, ‘GPL-2’ next at 24%, and ‘MIT’ at 21%. However, there are a mix of other licenses cited, including LGPL, BSD, Apache2, and Creative Commons, among others.

What do I want others to be able to do with my package?

When it comes to open source software, there are many ways to think about how software could be used. For example, you may want everyone to be able to freely use your software by its API, but have concerns about what happens if the underlying code is modified – derivative works. On the other hand, you may want to impose licensing requirements on the software that uses your software as well, e.g., software that uses my package must be licensed in the same way as my package. The license choice can significantly affect how and whether a given package can be used in corporate, academic, or government environments.

From the survey, 60% of respondents want other software developers to be able to use their package(s) without imposing license requirements on the software that uses their package (via API), with only 15% disagreeing.

The majority of respondents were neutral as to whether they wanted to ensure that software using their package(s) must apply the same license that they chose, with 29% agreeing and 19% disagreeing.

As expected, respondents want to ensure that derivative works of their package(s) remain open source, with 74% agreeing. However, only 25% agree that derivative works should require the same license as the package used.

How do you choose a license for your package?

In the survey, we asked which factors contribute to the choice of package license.  Sixteen percent of respondents indicated license choice defaulted to the license of dependent packages, whether used exclusively through their API or if they borrowed code or header definitions. A sizeable 65% indicate that it is a conscious choice based on their understanding of open source and other license terms. But this is tempered by responses described in the next section regarding comfort level with understanding open source licenses.

The open comments section for this question revealed more details, e.g., some respondents consult websites, blogs, and books for license recommendations, or get advice from package reviewers. Some respondents admit they haven’t thought deeply about the choice of license and don’t understand the differences between licenses since the choices and legalese can be overwhelming. Some use what other respected package authors have used (without necessarily understanding why a given license was chosen for such a package) or as determined by corporate or government dictates or requirements. Yet other respondents indicated making an arbitrary or random choice since R package submission requires that some license must be chosen.

The open comments also highlighted some potential misconceptions, such as if a package author chooses GPL-2 for their package, they are unable to change that to a more permissive license later. The ability to change a license depends on multiple factors, e.g., licenses of dependent packages or lifted code, whether all authors give their consent, etc. Some respondents state they want licensing that enables more users of their code rather than fewer. Others see GPL as a way to ensure commercial usage of their packages occurs fairly. Some respondents choose BSD as it provides most freedom to package users.

 

Open Source License knowledge

For the R Consortium to understand whether resources should be applied to the problem of licensing, we asked package developers about the level of understanding of open source licenses. While 12% outright stated they do not feel comfortable interpreting or applying open source licenses, 62% find license details and options confusing – even if they understand the basic premise of open source licenses.

Only 23% felt confident in choosing the right open source license(s) for their packages, while about 1% claim to have access to Legal Counsel to guide their choice of open source licenses. Another 1% claim to have sufficient legal background to choose the appropriate licenses(s) for their packages.

While licensing is important when trying to use software in corporate settings, only 24% of respondents consider the license of an R package important in determining whether or not they use it – 35% are neutral and 40% don’t think it’s important.

A majority (56%) of respondents believe that corporate adoption of R technology (engine and packages) is important for the R Community – 36% are neutral while 8% feel corporate adoption is not important. Consistent with this, 56% of respondents feel the R ecosystem should make it easy for corporate use of R – 37% are neutral and 6% disagreeing.

Tools and Guidance

As open source communities and technology continues to evolve, there are more tools available to assist with license choice. For example, code scanning tools exist in other open source communities to identify potential licensing issues. While following the advice of such tools is optional, most if not all developers want to “do the right thing” with respect to licensing. Testament to this is that over 71% of respondents indicated they would welcome the availability of a license scanning tool to flag package license issues – only 3% disagreed.

With the objective to enable package developers to make an informed choice of license, respondents were asked if they would like the R Consortium to provide guidance on open source license choices and implications. Over 89% indicated they would. One respondent put it best “I want whatever is best for making sure the CRAN community thrives in the long-term.” This is the intent of the R Consortium as well.

The R Consortium thanks the respondents to this survey for taking the time to share their experience, concerns, and needs. As a next step, the R Consortium will work with the R Community to provide best practices for good “license hygiene.” If you would like to be part of this activity, please reach out to the R Consortium by responding to this post.

R Consortium welcomes R-Ladies as a top level project

By | Announcement, Blog, News, R Consortium Project

In 2016, R-Ladies started their effort for a global expansion, with the help from the R-Consortium. Back then there were only 4 active chapters (San Francisco, Taipei, Twin Cities  and London) and the goal was to expand to 5-10 cities within the next year. The enthusiasm within the R community for local R-Ladies chapters far exceeded any possible expectations! As of March 2018, the organization has over 90 chapters and almost 19,000 members.

There are R-Ladies chapters in 45 countries around the globe, with many chapters hosting monthly events.  

With this fast growth, it became apparent to the R Consortium ISC that this project needs long term investment for success. The diversified voice of R-Ladies, speaking not only as a group representing gender minorities in tech but also a group attracting new R users, aligned with the R Consortium’s defined Code of Conduct and its desires for building a more diverse and inclusive R community.

R Consortium ISC is pleased to announce and welcome R-Ladies as a top level project. R-Ladies has shown a big commitment within the R community and becoming a top level project will provide them a longer term budget cycle (3 years instead of 1 year) to support their community. R-Ladies will also have a voting seat on the ISC (represented by Gabriela de Queiroz).

We invite all in the R community to congratulate R-Ladies on this milestone, and look forward to ensuring they have the infrastructure and funding to bring more diversity to the R community.

Announcing the second round of ISC Funded Projects for 2017

By | Announcement, Blog, News, R Consortium Project

The R Consortium ISC is pleased to announce that the projects listed below were funded under the 2017 edition of the ISC Funded Projects program. This program, which provides financial support for projects that enhance the infrastructure of the R ecosystem or which benefit large segments of the R Community, has awarded $500,000 USD in grants to date. The Spring 2018 call for proposals is now open and will continue to accept proposals until midnight PST on April 1, 2018.  Learn more about the program and how to apply for funding for your project.

Quantities for R

Proposed by Iñaki Ucar

The ‘units’ package has become the reference for quantity calculus in R, with a wide and welcoming response from the R community. Along the same lines, the ‘errors’ package integrates and automatises error propagation and printing for R vectors. A significant fraction of R users, both practitioners and researchers, use R to analyse measurements, and would benefit from a joint processing of quantity values with errors.

This project not only aims at orchestrating units and errors in a new data type, but will also extend the existing frameworks (compatibility with base R as well as other frameworks such as the tidyverse) and standardise how to import/export data with units and errors.

Refactoring and updating the SWIG R module

Proposed by Richard Beare

The Simplified Wrapper and Interface Generator (SWIG) is a tool for automatically generating interface code between interpreters, including R, and a C or C++ library. The R module needs to be updated to support modern developments in R and the rest of SWIG. This project aims to make the R module conform to the recommended SWIG standards and thus ensure that there is support for R in the future. We hope that this project will be the first step in allowing SWIG generated R code using reference classes.

Future Minimal API: Specification with Backend Conformance Test Suite

Proposed by Henrik Bengtsson

The objective of the Future Framework implemented in the future package is to simplify how parallel and distributed processing is conducted in R. This project aims to provide a formal Future API specification and provide a test framework for validating the conformance of existing (e.g. future.batchtools and future.callr) and to-come third-party parallel backends to the Future framework.

An Earth data processing backend for testing and evaluating stars

Proposed by Edzer Pebesma

The stars project enables the processing Earth imagery data that is held on servers, without the need to download it to local hard drive. This project will (i) create software to run a back-end, (ii) develop scripts and tutorials that explain how such a data server and processing backend can be set up, and (iii) create an instance of such a backend in the AWS cloud that can be used for testing and evaluation purposes.

Recap of rOpenSci’s ozunconf – October 2017 in Melbourne

By | Blog, Events

The R Consortium was happy to be a sponsor rOpenSci’s ozunconf last October in Melbourne. You can read about the “unconference” on rOpenSci’s blog and follow some of the projects begun at the event here.

Through the RUGS program, R Consortium was honored to be a sponsor for this event. If you have an smaller event you would like support for, stay tuned for the official program announcement in early 2018.

2018 R Consortium Silver member representatives for Board and ISC

By | Announcement, Blog, News, R Consortium Project

Per the R Consortium ByLaws and ISC Charter, the Silver Member class is entitled to elect individuals representative of the Silver Member class for a term starting January 1, 2018 through December 31, 2018 as follows:

  • 1 representative to the ISC
  • 1 Silver Member Board Director per every 7 Silver Members, subject to provisions 4.2 and 4.3(d) of the R Consortium ByLaws. This means the Silver Member class can elect up to 2 Board Directors representing the class.

These elections ran during the month of November 2017, with 3 nominees for Silver Member Board Director and 3 nominees for the Silver Member ISC representative.

I am pleased to announce those elected by the Silver member class to serve on the Board of Directors and ISC effective 1/1/2018 through 12/31/2018.

Silver Member ISC representative

Silver Member Board Directors
Please join me in congratulating each of the elected representatives.
We would also like the share a big thank you to the outgoing Silver Member Board Director Richard Pugh of Mango Solutions. His guidance and leadership within the R Consortium have made a huge impact on its current success.