More UK Open Data Moves - and Why That Makes Sense

In striking contrast with its disappointing performance in terms of supporting open source, the UK government continues to take huge strides in the world of open data. Details about its latest moves are contained in this document [.pdf] that came out of the recent 2011 Autumn Statement.

Here's the Overview:

The internet has evolved to change the way we live, work and manage business processes within increasingly global marketplaces. Open Data is the next phase of this ICT revolution, enabling new systems, processes, products and markets to emerge, and supporting a whole raft of complementary innovations across the economy. The potential prize is considerable. A recent report estimated the current total direct and indirect economic value of public sector information at €140 billion per year for the EU27 (Vickery/EU Commission, 2011). This suggests that similar information in the UK is already worth in the region of £16 billion a year.

The document then lists the various data sets that will be released. It's an impressive roll call of key databases, including some from healthcare, prescription, transport, Land Registry, Companies House, and a welcome release of core data from the Met Office. If you find the details in the government document a bit hard to digest, Leigh Dodds has put together an excellent online spreadsheet listing what datasets are due to be released when.

That's all great news, as is the creation of a new Open Data Institute, also announced in the Autumn Statement:

Establish a world-leading Open Data Institute (ODI) to innovate, exploit and research the opportunities for the UK created by the Government's Open Data policy.

Co-directed by Professor Sir Tim Berners-Lee and Professor Nigel Shadbolt and involving business and academic institutions, the ODI will be based in Shoreditch in East London and focus on (i) business innovation (ii) commercialisation (iii) developing Web standards to support the Open Data Agenda (iv) world leading research (v) a UK national training centre, and (vi) providing expert advice for Government.

In view of the commercial and social priority of Open Data, the Government is to commit up to £10m over five years with match funding from industry and academic centres to support the Open Data Institute through the Technology Strategy Board. The implementation plan will be published by April 2012.

These actions are predicated on the belief that opening up data in this way can produce all kinds of economic benefits (a view I naturally share). There's already some research work that supports that view, as does this new Australian study [.pdf]. It's called "Costs and Benefits of Data Provision", and looks closely at the economic rationale for freely releasing public sector information (PSI):

It is clear from the case studies presented that even the subset of benefits that can be measured outweigh the costs of making PSI more freely and openly available. It is also clear that it is not simply about access prices, but also about the transaction costs involved. Standardised and unrestrictive licensing, such as Creative Commons, and data standards are crucial in enabling access that is truly open (i.e. free, immediate and unrestricted).

That last point about licensing and data standards is important. The report makes this very interesting observation about the role of the government in the open data economy:

Information has public good characteristics (i.e. being non-rivalrous and non-excludable), as one person's consumption of a piece of information does not prevent others from consuming it and it is difficult to prevent information spreading to others. While information can be made more or less excludable through intellectual property rights, such as copyright, it is still difficult to stop people sharing information. In general, the private sector will tend to under-produce such goods as it is difficult to realise the full value. It is this that justifies public sector supply of information (Nilsen 2007; 2009). Indeed, Stiglitz et al. (2000) concluded that the theoretical underpinnings of the private versus public trade-off shifts as the economy moves toward a digital one, with a larger public role in the digital economy.

It summarises the benefits of releasing public data as follows:

the direct private cost-benefits of new users and uses, and more intensive use by existing users; the privately captured spillover cost-benefits of additional use for other businesses and agencies; and the public spillover cost-benefits of additional use. These can arise in the form of new activities, businesses and industries (e.g. Weather derivatives, Geospatial Services, etc.); increased efficiency of existing activities, businesses and industries (e.g. optimisation of crop planting and harvesting, mining exploration and extraction, etc.); and public good aspects (e.g. cheaper food, enhanced safety while travelling, etc.). Such impacts are very difficult to measure.

Finally, it has a very interesting section devoted to the release of publicly-funded research, including this comprehensive list of the benefits there:

Open access to, and sharing of, data from publicly funded research offer many research and educational advantages over a closed, proprietary system that places high barriers to both access and subsequent re-use. Open access to such data:

• Reinforces open scientific and scholarly inquiry;
• Encourages diversity of analysis and opinion;
• Promotes new research and new types of research;
• Enables the application of automated knowledge discovery tools online;
• Allows the verification of previous results;
• Makes possible the testing of new or alternative hypotheses and methods of analysis;
• Establishes a broader base set of data than any one researcher can hope to collect, thereby providing a greater baseline of factual information for the research community;
• Supports studies on data collection methods and measurement;
• Facilitates the education of new researchers;
• Enables the exploration of topics not envisioned by the initial investigators;
• Permits the creation of new data sets, information, and knowledge when data from multiple sources are combined;
• Helps transfer factual information to, and promote development and capacity building in, developing countries;
• Promotes interdisciplinary, inter-sectoral, inter-institutional, and international research; and
• Helps to maximize the research potential of new digital technologies and networks, thereby providing greater returns from the public investment in data collection and research.

The report concludes:

Studies of the costs and benefits associated with open access to publications and data arising from publicly funded research suggest that the benefits can be significant and the costs relatively small. The return on investment in curation and open accessibility is all the greater when the data are long-lived, as the returns are recurring during the useful life of the data, although the time lag between investment and return (cost and benefit) can be substantial. What this study demonstrates is that the direct and measurable benefits of making PSI available freely and unrestrictedly typically outweigh the costs. When one adds the longer-term benefits that we cannot fully measure, cannot even foresee, the case for open access appears to be strong.

All in all, this is a valuable addition to the literature in this area, and confirms the growing momentum around the world for the free release of public data – and the wisdom of the UK government's latest moves.

Follow me @glynmoody on Twitter, and on Google+