CII Best Practices Badge for R Packages – responding to concerns

Our last post, “Should R Consortium Recommend CII Best Practices Badge for R Packages: Latest Survey Results,” summarized results from the CII Best Practices survey conducted this summer. A goal of the CII Best Practices program is to help improve open source software quality. Respondents shared several concerns to which David Wheeler, project lead for the Core Infrastructure Initiative (CII) with the Linux Foundation, and I wanted to respond.

Let’s dive in…

Concern #1: Does the CII Badge have the “correct” or “best” set of criteria?

The CII Badge criteria are the best general-purpose OSS project criteria that we, the OSS community, have developed to date. The CII Badge criteria were developed based on the experience and recommendations of many experts, previous criteria developed by various organizations, and the examination of real-world successful OSS projects. No doubt the criteria could be improved further, but the badge criteria are themselves open source and can be improved using the same process as any OSS code: simply propose changes for review!

Concern #2: Achieving a badge does not necessarily mean a given package is well-designed or well-implemented.

Many of the CII criteria can help push projects toward better-designed and better-implemented packages. At the “passing” level, the CII criteria include these requirements:

[warnings] requires enabling compiler warning flags or similar
[static_analysis] requires the use of at least one static code analysis tool (assuming one exists)
[test] requires a test suite, which often nudges people towards better design and implementation
[test_policy] requires that you keep adding tests, especially as new functionality comes online
[know_secure_design] requires that at least one primary developer know how to design secure software. The best practices site explains what this means under the “details” of this criterion. In summary, at least one primary developer must understand the 8 principles of Saltzer and Schroeder (as explained by the CII Best Practices site) and must also know to (1) limit the attack surface and (2) perform input validation with whitelists. Software can be badly designed by knowledgeable people, but software is much more likely to be designed and implemented well if developers know the basics.
[know_common_errors] requires that at least one of the project’s primary developers know the common kinds of errors that lead to vulnerabilities, along with at least one method to counter or mitigate each of them
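
To make “input validation with whitelists” from [know_secure_design] a little more concrete, here is a minimal sketch in Python. It is not taken from the CII criteria; the function name and the naming rule are hypothetical and chosen only for illustration. The idea is to accept input only when it matches a pattern known to be safe, rather than trying to list every bad input.

```python
import re

# Hypothetical rule, assumed for this example only: a name starts with an
# ASCII letter, continues with letters, digits, or dots, and is at most
# 64 characters long.
_ALLOWED_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.]{0,63}")


def is_valid_name(value: str) -> bool:
    """Return True only if the whole string matches the allowed pattern."""
    # fullmatch() requires the entire string to match, so anything outside
    # the allowed pattern (slashes, spaces, trailing newlines) is rejected.
    return _ALLOWED_NAME.fullmatch(value) is not None
```

With this rule, a name such as "data.table" is accepted, while "../etc/passwd" and the empty string are rejected because they fall outside the allowed pattern.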

Higher badge levels (“silver” and “gold”) add further requirements.

It’s true that a badge doesn’t guarantee that a package is well-designed or implemented by some measure, but part of the problem is that it’s difficult to unambiguously determine if something is well-designed or well-implemented. Much depends on the purpose of the package! So instead, many criteria focus on enabling mass peer review and managing improvements, so that problems are more likely to be detected and corrected.

In short, software normally undergoes change over time. Instead of requiring that a project be perfect at one point in time, we focus on criteria that will help projects continuously improve over the long run.

Concern #3: How does the CII help to ensure the validity of self-certification, e.g., through automated tools?

We use automated tools and reject some answers that are clearly false. We require that replies be public and that there be URLs for some answers; that makes it easy for anyone to check answers. In the worst case, we can override false answers, though in practice we’ve almost never found that necessary.

Concern #4: Even if every R package had a badge, the issue of finding a needed package among over 12K packages remains.

Finding a desired package or the “best” one for a given task is largely orthogonal to improving package quality, though the two can be related. The badging process can help, because one of the criteria is “The project website MUST succinctly describe what the software does (what problem does it solve?)”. Search engines are much more effective at finding relevant packages once that kind of information is available. Separately, if it is important to you or your organization to use packages that state adherence to the CII criteria, the search space may be significantly reduced – at least as a starting point.

Concern #5: Can the CII criteria be streamlined to reflect only the needs of R packages, including those that are more data and documentation than code?

Our current primary approach for streamlining is to automate criteria. That said, if you have a specific idea for streamlining things further, please file an issue on GitHub.

Concern #6: Will automated tools be available for performing at least parts of the assessment, e.g., as found in R’s devtools?

We already use automated tools to assist in completing the form. We’d rather not require people to install tools to fill in information, because that would be a barrier for some. If there are tools we aren’t using and should use, let us know!

Concern #7: A badge program could penalize developers who do not have time, money, or skills to meet the criteria, making their packages less desirable if they do not achieve a badge.

We’ve worked hard to make the badge “passing” criteria doable for single-person projects. Daniel Stenberg is the author and maintainer of cURL and libcurl, and he’s been especially influential in ensuring that the “passing” badge is doable for single-person projects. If you have no tests, cannot automatically build your software (even though it requires building), or have never run a static analysis tool of any kind, then there is some work… but it’s better for users if these are addressed.

The top “gold” level requires multiple people in a project, e.g., because the project MUST have a “bus factor” of 2 or more. That can be a challenge for developers, but it’s a big advantage for users – users would much rather depend on software where a single death doesn’t suddenly mean that there’s no one to update the software. No one is required to get the gold level, however, and there are many ways to resolve this.

Concern #8: Introducing more process comes with additional burdens for package developers, perhaps reducing overall ecosystem participation.

We’ve done our best to minimize the risk from additional burdens. We automate some answers, and that helps. We reduce the risk of duplicated evaluation processes by having a single set of criteria for all OSS. Perhaps most importantly: the criteria were developed by examining real-world successful projects, so they require actions that other projects are already doing and finding helpful.

Also keep in mind that getting a CII best practices badge is optional – a package author can decide whether the benefits of adhering to the CII criteria outweigh the costs.

Concern #9: Is there a way to distinguish tests for validating statistical software numerical computations and statistical properties?

Sure. Naming conventions for tests are a common way to distinguish types of tests; you can also put different kinds of tests in different directories. From the badge perspective, we don’t focus on that distinction. For “passing” the key is that your project must have a general policy that as major new functionality is added to the software produced by the project, tests of that functionality should be added to an automated test suite. Passing doesn’t require a perfect test suite; instead, we require that you have a test suite and that you’re committed to improving it. Since OSS is visible to the user community, a potential user may want to examine the type and quality of tests performed. The higher-level badges do require better test suites, as you might expect.
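
As a rough illustration of the naming-convention approach, here is a small sketch. It is written in Python rather than R only to keep it self-contained, and the `test_numeric_` and `test_statprop_` prefixes and the `sample_variance` helper are hypothetical names, not something the badge criteria prescribe. The first test checks a numerical computation against a hand-computed value; the second checks a statistical property (approximate unbiasedness) by simulation.

```python
import math
import random
import statistics


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def test_numeric_variance_matches_known_value():
    # Numerical check: compare against a hand-computed result within a
    # tolerance. Data 1..5 has mean 3 and squared deviations summing to 10,
    # so the expected value is 10 / 4 = 2.5.
    assert math.isclose(sample_variance([1, 2, 3, 4, 5]), 2.5, rel_tol=1e-12)


def test_statprop_variance_is_approximately_unbiased():
    # Statistical-property check: averaged over many simulated samples, the
    # estimator should land near the true variance (1.0 for a standard normal).
    rng = random.Random(12345)  # fixed seed keeps the test reproducible
    estimates = [
        sample_variance([rng.gauss(0.0, 1.0) for _ in range(20)])
        for _ in range(5000)
    ]
    assert abs(statistics.fmean(estimates) - 1.0) < 0.05
```

With prefixes like these, a test runner that can filter by name is able to run one kind of test without the other; in pytest, for example, `-k numeric` would select only the first test. The same split could equally be expressed with separate directories.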

We continue to receive valuable comments through the survey and are pleased to report that more R package authors are choosing to participate in the CII, as evidenced by the surge in new R CII project entries.