“A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution” by COVID-19 Data Hub developer Emanuele Guidotti was published in Scientific Data on March 28, 2022, and is available online at https://www.nature.com/articles/s41597-022-01245-1 (DOI 10.1038/s41597-022-01245-1).
The R Consortium is proud to be a sponsor of the COVID-19 Data Hub. We believe that the need for accessible, organized, official COVID-19 case data will persist for some time into the future, and that the COVID-19 Data Hub is a serious contribution to science and public health.
— Joseph B. Rickert, Chair R Consortium Board of Directors
This database provides daily time-series of COVID-19 cases, deaths, recovered people, tests, vaccinations, and hospitalizations for more than 230 countries, 760 regions, and 12,000 lower-level administrative divisions. The geographical entities are associated with identifiers so that they can be matched with hydrometeorological, geospatial, and mobility data. The database includes policy measures at the national and, when available, sub-national levels. The data acquisition pipeline is open-source and fully automated. As most governments revise the data retrospectively, the database always updates the complete time-series to mirror the original source. Vintage data, immutable snapshots of the data taken each day, are provided to ensure research reproducibility. The latest data are updated on an hourly basis, and the vintage data have been available since April 14, 2020. All the data are available as CSV files or in SQLite format. By unifying access to the data, this work makes it possible to study the pandemic on a global scale with high resolution, taking into account within-country variations, nonpharmaceutical interventions, and environmental and exogenous variables.
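To make the distinction between the latest data and the vintage snapshots concrete, here is a minimal Python sketch of how a researcher might detect retrospective revisions by comparing two CSV snapshots. It is an illustration only: the column names (`id`, `date`, `confirmed`), the place identifier `AAA`, and every value in the two toy snapshots are assumptions invented for this example, not the actual schema or contents of the COVID-19 Data Hub files.

```python
import csv
import io

# Hypothetical toy snapshots. The column names and all values are invented
# for illustration and do not come from the COVID-19 Data Hub.
VINTAGE_CSV = """id,date,confirmed
AAA,2020-04-12,100
AAA,2020-04-13,120
AAA,2020-04-14,150
"""

LATEST_CSV = """id,date,confirmed
AAA,2020-04-12,100
AAA,2020-04-13,125
AAA,2020-04-14,150
AAA,2020-04-15,180
"""


def load_series(text):
    """Parse CSV text into a {(id, date): confirmed} mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["id"], row["date"]): int(row["confirmed"]) for row in reader}


def find_revisions(vintage, latest):
    """Return entries present in both snapshots whose values differ."""
    return {
        key: (old, latest[key])
        for key, old in vintage.items()
        if key in latest and latest[key] != old
    }


vintage = load_series(VINTAGE_CSV)
latest = load_series(LATEST_CSV)
revisions = find_revisions(vintage, latest)

# Report each value that the source changed after the vintage snapshot.
for (place, date), (old, new) in sorted(revisions.items()):
    print(f"{place} {date}: {old} -> {new}")
```

In this toy case the only revised entry is the one for 2020-04-13. An analysis pinned to a vintage snapshot keeps using the value recorded that day, while one run against the latest data picks up the revision, which is why immutable snapshots matter for reproducibility.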