Introduction

The core idea behind Open Government Data (OGD) is a simple one: public data should be a shared resource. It is valuable not only to the government departments that collect it, but also to citizens, entrepreneurs and other parts of the public sector.

However, moving from the idea of OGD to its implementation takes dedicated and sustained policy attention. Achieving widespread impact from OGD release relies not only upon the supply of high-quality data, but also upon the capacity of users to work with data, and the ability of government to engage proactively with, and respond to, those users.

In our complex world, securing government accountability, coordinating action to improve society, and bootstrapping new business ideas can all benefit from access to government data. Yet far too often, access to data, and the skills to use it, are unequally distributed, and there are unnecessary technical and legal restrictions that prevent data re-use. Calls for a data revolution are placing renewed attention on ensuring the collection and management of high-quality data around the world through strengthened statistical capacity, and are driving a focus on the use of new ‘big data’ resources in policy making. Against this backdrop, questions concerning who has access to data, and whether citizens have the capability and freedoms to create, access and analyse data about their own communities and concerns, are ever more important if we are to secure a fair balance of power in our societies.

The Open Data Barometer

This report brings together the results of expert survey research, technical assessments of data supply, and secondary data, in order to contribute to deeper understandings of the global landscape of open data. Specifically, the report scores countries on:

Readiness to secure benefits from open data: including the legal, political, economic, social, organisational, and technical foundations that can support the supply and use of open data.
Implementation of open data practice, measured through the availability of data across 15 key categories, and the adoption, for those datasets, of the common practices set out in the Open Definition and the Open Government Data Principles.
Impacts of open data, measured through media and academic mentions of data use and impact.

This second edition of the Open Data Barometer replicates the core methodology used in the 2013 edition of the report, while drawing on updated research inputs covering the 2013-2014 period, and adding nine new countries to the sample. The methodological annex describes minor adjustments between the first and second editions. Repeating the methodology used in the first edition of the ODB allows for comparisons to be made between the 2013 and 2014 data, and supports both an assessment of global and local trends, as well as the development of key learnings to improve future open data measurement activities. As the open data field — and with it the Open Data Barometer — continues to develop in future years, we will increasingly draw upon the common assessment framework for open data, developed by the Web Foundation, the GovLab, and other partners, and will place greater emphasis on evidence of open data impact and use (as an important mediating variable between readiness, and data availability and impact).

The following sections of this report present selected statistics and commentary based on our data collection, as well as offering a composite ranking of countries. However, this report is just one part of the Open Data Barometer. By providing the underlying data gathered during the project we encourage other advocates, scholars and practitioners to draw upon it to ask further research questions, and to refine shared understanding of how to achieve positive impacts from open data.

Defining open data

The last year has witnessed growing concern, and confusion, about the boundaries between personal or private data, and open data. Public trust in government data handling has been undermined as citizens have grown more aware of the ways in which surveillance agencies and corporations have abused their personal data, or have seen mistakes made by government in publishing inappropriately anonymised data¹. Meanwhile, as governments have sought to make better use of the records they hold on individual citizens, or to engage with big data, they have often clouded the distinction between ‘data sharing’ (where there can still be restrictions on who can use the data, and what for), and ‘open data’, which should be accessible for anyone to re-use for any purpose. It is important therefore to draw clear definitions and distinctions.

When we discuss open data in this report, we are discussing data that is:

Accessible - Proactively published, and available free of charge.
Machine-readable - Published in file formats and structures that allow computers to extract and process the data for easy sorting, filtering and content searching.
Re-usable - Available under legal regimes or explicit terms that place a minimum of restrictions on how the data may be used; at most, the publisher can specify how the source should be acknowledged.

These principles are conventionally operationalised by checking whether data is online, in specified file formats, and provided with explicit license terms. In assessing whether datasets qualify as “open data” we follow this approach, but we also collect other important variables about the timeliness, sustainability, and discoverability of datasets, recognising that there are important social, technical, and legal aspects of openness.
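The conventional check described above can be sketched in a few lines of Python. This is an illustration only, not the Barometer's assessment instrument: the `Dataset` fields, the `qualifies_as_open` function, and the list of machine-readable formats are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Illustrative set of machine-readable formats (an assumption for this
# sketch, not the list of formats used in the Barometer's assessments).
MACHINE_READABLE_FORMATS = {"csv", "json", "xml"}


@dataclass
class Dataset:
    """Hypothetical record of what an assessor found for one dataset."""
    is_online: bool
    free_of_charge: bool
    file_format: str
    has_open_license: bool


def qualifies_as_open(dataset: Dataset) -> bool:
    """Return True only if all three criteria hold together."""
    accessible = dataset.is_online and dataset.free_of_charge
    machine_readable = dataset.file_format.lower() in MACHINE_READABLE_FORMATS
    reusable = dataset.has_open_license
    return accessible and machine_readable and reusable
```

Under these assumptions, a budget published online as a free PDF with an open licence would not qualify, because it fails the machine-readability test even though it meets the other two criteria.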

Private data and public records

By definition, open data should not include private data. Private data should have a limited distribution; any restrictions on distribution go against the re-usability terms of open data. In general, this means that the records government holds on individuals should not be made available as open data unless these records are understood to be part of the public record. For example, the names of company directors may be part of the public record, and so could be released as open data. Providing public records as open data, including records that contain information about individuals, does not invalidate other obligations on potential users of the data to abide by existing legal frameworks for data protection. This highlights the importance of linking open data regulations and laws designed to increase transparency, with privacy protection laws and frameworks that can restrict certain abusive uses of the data. Even with these frameworks in place, there are some datasets where the risk of the data being re-identified, or personal information contained within it abused, is such that it cannot be ‘open by default’.

The Open Data Barometer explicitly surveys the existence of data protection laws in each country, and considers their existence and strength as a component of open data readiness.

Key facts: methodology

The Open Data Barometer is based upon three kinds of data:

A peer reviewed expert survey carried out between May and September 2014, which asked researchers to provide a score from 0–10 in response to a range of questions about open data contexts, policy, implementation and impacts. Scores were normalised (using z-scores) prior to inclusion in the Barometer.
A detailed dataset survey completed by a team of technical experts. These assessments were based on a 10-point checklist, completed for 15 kinds of data in each country, which touched on issues of data availability, format, license, timeliness and discoverability. Initial source information for locating datasets, and the agencies responsible for their production, were provided by the expert survey, and then validated and expanded upon by the technical experts. Validation was carried out between August and October 2014, and incorporates evidence up until the end of October 2014. Each answer in the 10-point checklist is supported by qualitative information and detailed hyperlinks. Checklist responses are combined in a weighted aggregation to provide a 0–100 score for each dataset. These are presented in their original form to allow comparison between datasets, and are averaged to give a dataset implementation sub-index. This sub-index is normalised (using z-scores) prior to inclusion in the overall Barometer calculations.
Secondary data selected to complement our expert survey data. This is used in the readiness section of the Barometer, and is taken from the World Economic Forum, United Nations e-Government Survey and Freedom House. The data is normalised (using z-scores) prior to inclusion in the Barometer.
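The two calculations mentioned in the list above, weighted aggregation of checklist answers into a 0–100 dataset score and z-score normalisation, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, question keys and weights are hypothetical (the actual weights are not given in this section), and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, pstdev


def dataset_score(answers: dict, weights: dict) -> float:
    """Combine yes/no checklist answers into a 0-100 weighted score.

    `weights` maps each checklist question to its weight; the values
    used in any example are hypothetical, not the Barometer's own.
    """
    total = sum(weights.values())
    earned = sum(w for question, w in weights.items()
                 if answers.get(question, False))
    return 100.0 * earned / total


def z_scores(values: list) -> list:
    """Normalise values to mean 0 and standard deviation 1.

    Assumes the population standard deviation; returns zeros when
    every value is identical, to avoid dividing by zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

For instance, with hypothetical weights of 1, 1 and 2 for three questions, a dataset that satisfies the first and third would score 75 out of 100. Normalising each component in this way puts survey scores, dataset sub-index values and secondary indicators on a common scale before they are combined.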

The list of countries included in the 2014 Barometer is based upon the Web Index sample, which was designed to represent a broad range of different regions, political systems and levels of development, and as such there should be no selection bias in the sample towards countries with OGD policies.

You can read more about the detailed research process in the methodology section.

Footnotes

¹ For example, New York provided GPS logs of taxi journeys in response to a Freedom of Information Law request, but failed to adequately anonymise the data, allowing the journeys and identities of drivers to be extracted from it.