The Open Data Barometer draws on over 14,000 data points, captured as quantitative scores and backed by qualitative source information.
The data is made available under a Creative Commons Attribution 4.0 License, and we encourage you to explore, re-use and remix the data.
Please cite any use of the data as: World Wide Web Foundation, Open Data Barometer Global Report (Second Edition), 2015, and include a link to http://www.opendatabarometer.org.
Details of the questions addressed by researchers, the scoring thresholds applied during research and review, and information on the research process can be found in the Web Index and Technical Survey research handbooks.
For comparison, updated 2013 datasets have also been prepared. These use the same variable names and incorporate 2-digit ISO country codes, since some country labels changed between years as a result of the Web Index production process:
Labels and details of each of the variables in the Rankings and Survey files are provided in:
In addition, for this second edition we are releasing the main qualitative source information supplied by researchers. This information was collected to justify and validate the quantitative scores given, and is not intended as a comprehensive review in response to each question.
We are continuing to explore ways to improve the provision of qualitative data alongside the Open Data Barometer, but hope this year’s initial release is a useful resource for other researchers.
A number of the analysis scripts (R) used in compiling the Open Data Barometer are available on GitHub.
The datasets survey employed conditional logic, meaning that some justification fields were hidden depending on the answer to the related question. However, when an answer changed, any data already entered in the now-hidden fields was not deleted, so some text in these fields may not represent the final judgements made about a dataset.
The overall dataset scores are also based on conditional logic, designed to score countries on the basis of machine-readable data. However, where machine-readable data was not available, researchers were encouraged to complete questions with respect to the best data they could locate. For this reason, simply summing scores, or searching fields on the assumption that their values describe machine-readable data, will not produce valid results. Instead, to understand the justifications behind a given dataset score, fields need to be filtered on the basis of the values in ODB-2014-Datasets-Scored.csv, as sketched below.
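To illustrate this filtering step, here is a minimal R sketch (R being the language of our published analysis scripts). Only ODB-2014-Datasets-Scored.csv is named in this release; the survey file name, key columns, and score column used below are assumptions for illustration and should be checked against the variable description files listed above.

```r
# Minimal sketch of the filtering step described above. File and column
# names other than ODB-2014-Datasets-Scored.csv are illustrative
# assumptions; consult the variable description files for the real schema.

scored <- read.csv("ODB-2014-Datasets-Scored.csv", stringsAsFactors = FALSE)
survey <- read.csv("ODB-2014-Datasets.csv", stringsAsFactors = FALSE)  # assumed file name

# Join survey responses to the scored file on (assumed) country and
# dataset key columns.
merged <- merge(survey, scored, by = c("ISO2", "Dataset"))

# Only read a justification field once the corresponding value in the
# scored file confirms it applies: here, keep machine-readable
# justifications only where the (assumed) score column indicates
# machine-readable data was actually found.
machine_readable <- merged[merged$Machine.readable == 1, ]
```

The same approach extends to the other per-dataset fields: consult the scored value first, then read the matching justification text, rather than taking any populated field at face value.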
Due to the design of the survey platform, and the way some researchers and reviewers used it, additional justification information was in some cases captured in comments exchanged between researchers and reviewers rather than in the public justifications field. This will be addressed in future versions of the platform; for this edition, however, the resources required to extract those comments into the public justification field mean we cannot guarantee that the justification texts represent the full evidence on which scores were assigned in every case.