The progress made by the open data movement is pretty extraordinary. A few years ago, data was something that only statisticians cared about, but today it is one of the most vibrant areas of exploration and innovation. I think that's in part because of open source's example of how opening things up allows people to experiment and make progress faster than keeping everything locked down.
But have we reached the stage where we can assert a right to open public data? That's the central claim of an interesting new report from Policy Exchange, entitled "A Right to Data: Fulfilling the promise of open public data in the UK." Here's part of the summary:
A piece of data or content is open if anyone is free to access, use, reuse and redistribute it – subject only, at most, to the requirement to attribute and share-alike.
The business of government has always involved quantities of data. For centuries almost all of this public data has been closely guarded by the state. Governments provided access to data on a need-to-know basis and, for the most part, citizens didn't need to know.
The balance is, however, starting to shift. Advances in information and communication technologies mean that, for the first time in human history, it is technologically feasible for every citizen to have access to every piece of data or content generated by their government. And as this same technology drives fundamental changes in our economy and society, it is becoming clear that, with the right protections, opening up public data will deliver considerable benefits.
In this research note we review the state of open data policy in the UK. We find that the direct cost to the Exchequer of giving away key datasets like maps and postcode data may be far lower than sometimes thought – perhaps in the region of £50 million a year. The potential benefits are notoriously difficult to quantify, but are likely to be orders of magnitude greater.
The report makes three recommendations: that the UK government should enshrine a right to public data in legislation; that responsibility for open data rest at board level in every public sector body; and that every public sector body define its public task and associated data requirements.
The report is refreshingly short – just 28 pages – which makes it easy to digest. Along the way, it has some thoughtful analysis of the underlying dynamics of the open data world. Here, for instance, is a good summary of the economics of abundant public data:
Data has many interesting economic characteristics. Two factors are especially important for this discussion. First, public data typically involves high fixed costs – so high, in fact, that it would not normally make sense for multiple bodies to collect or create the same data. Second, digital data can typically be duplicated at very low cost – so low, in fact, that in many cases zero is a good approximation for the marginal cost of data provision.
Taken together, these two factors have significant implications for any government that is considering reselling public data.
The cost structure described above implies that public data provision is a natural monopoly. So if the state seeks to optimise the direct financial return from data provision, it will set a price for users high enough to (a) recover all of its costs and (b) maximise supernormal profits. This may be very profitable indeed for the public bodies concerned. But it imposes a deadweight loss on the economy, as too many users are priced out of the market.
From a public policy perspective, a superior approach is to set price equal to marginal cost. This eliminates deadweight loss and ensures allocative efficiency. The problem, of course, is that in our case it would also leave the data provider running at a loss. In these circumstances there are two broad ways through. The first is for the state to cover the fixed costs of data production, funded out of general taxation and accepted, like other fixed costs in the public sector, as part of the general financing requirement for executing the public task. The second is to find a way to levy a minimally-distortive charge on data users (often done through average-cost pricing, but preferably by implementing some form of Ramsey pricing to focus charges on inelastic activities).
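The deadweight-loss argument in that passage can be made concrete with a toy model. The sketch below uses a linear demand curve and hypothetical numbers of my own choosing (none of these figures come from the report); it simply shows how a profit-maximising price for near-zero-marginal-cost data prices out half the potential users, and how marginal-cost pricing eliminates that loss at the expense of leaving fixed costs uncovered.

```python
# Toy model: linear inverse demand p(q) = a - b*q, with marginal cost c ~ 0,
# as for duplicating digital data. All numbers are illustrative only.
a, b, c = 100.0, 1.0, 0.0   # demand intercept, demand slope, marginal cost

# Monopoly: pick the quantity where marginal revenue (a - 2*b*q) equals c.
q_mon = (a - c) / (2 * b)
p_mon = a - b * q_mon        # the profit-maximising price

# Marginal-cost pricing: serve every user who values the data above c.
q_mc = (a - c) / b

# Deadweight loss of monopoly pricing: the surplus triangle lost from
# users priced out between q_mon and q_mc.
dwl = 0.5 * (p_mon - c) * (q_mc - q_mon)

print(q_mon, p_mon, q_mc, dwl)   # 50.0 50.0 100.0 1250.0
```

With these numbers the monopoly serves only half the users who value the data above its (zero) marginal cost, which is the "too many users are priced out" point; the catch, as the report notes, is that a zero price recovers none of the fixed costs, so they must come from taxation or a minimally-distortive charge elsewhere.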
Similarly, this is a good explanation of why opening up data is the best way of improving both government departments and businesses:
Many of the applications of public data will deliver incremental improvements to existing business processes, products and services. Some applications will be truly disruptive, where the firm in question brings an entirely new proposition to market. Successful, radical innovations may be one in 100 (or maybe even fewer). The best way to promote them is to roll the dice as many times as possible. In an economy where information is increasingly important, free public data is an important precondition for this to happen.
As for the main recommendation, it's good to see the report come out in favour of truly open data – no restrictions on commercial use, but allowing for attribution and share-alike licences (that is, those closest to the GNU GPL):
This should require that all non-personal data collected or created to support the day-to-day business of government be made open: easy to access and free at the point of delivery, without restriction on use or reuse. Important protections for personal data, national security and Ministerial advice should be incorporated into this legislation, to provide clarity on where open data ends.
That's a key point: the boundaries of open data are crucial, and need to be agreed from the start rather than defined on an ad-hoc basis, which will inevitably lead to hoarding.
The report also calls for commercial activities built on top of public data to be moved into the private sector:
any activity based on leveraging public data to develop commercial products or services should, ultimately, be spun out. This will ensure that the entities undertaking this activity are exposed to the full rigour of market competition. Where sound business models underpin strong commercial propositions, these new private entities will thrive and customers will benefit from businesses that have a strong incentive to meet their (changing) needs.
The taxpayer may even enjoy a windfall gain as they are privatised, and the remaining slimmed down public sector data owners will be able to run leaner operations with fewer sales, marketing and commercial staff (the Ordnance Survey accounts, for example, show around 150 people employed in sales and marketing).
How to liberate the data produced by the Ordnance Survey has always been one of the key issues for opening up government data. Here's what the report suggests:
In most cases the revenue at stake from the pure resale of basic public data is small, perhaps at or below the £1 million per annum mark for the Met Office, Land Registry, Companies House and DVLA. These figures are just a fraction of one per cent of the total cost of operations, and should be manageably absorbed into general public funding for these bodies.
The most frequently cited exception is the Ordnance Survey, where commercial and consumer sales in 2009-10 totalled around £39 million (of which about £10 million was revenue from paper maps). As outlined earlier in this note, there are two broad approaches to closing the gap whilst maintaining marginal cost pricing.
The first is to close the gap by extending the public funding already provided to the organisation. For the Ordnance Survey, trading revenues (from both public and private customers) are currently around £113 million, so with no other changes this would mean increasing the £74 million public funding component by around 50 per cent to keep the organisation's finances in balance. The second is to close the gap by switching charges onto activities where the price elasticity of demand is lowest. This would not entirely remove costs from the private sector, but would aim to shift them onto a less distortionary basis. As a result, fees and charges would be less of a barrier for those wanting to make use of the data.
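The arithmetic behind the first option is worth checking against the report's own figures. The sketch below takes the £74 million public funding component and the roughly £39 million of commercial and consumer sales that would be forgone under free data (my reading of which revenue is at risk), and confirms that replacing the latter means raising the former by about half, as the report says:

```python
# Figures from the report (GBP), 2009-10 Ordnance Survey accounts.
public_funding = 74e6        # current public funding component
forgone_sales = 39e6         # commercial and consumer sales given up
                             # if core data becomes free to reuse

# Option 1: extend public funding to cover the forgone revenue.
new_funding = public_funding + forgone_sales
increase_pct = forgone_sales / public_funding * 100

print(round(increase_pct))   # 53 -> "around 50 per cent", as the report says
```

A roughly 53 per cent rise in the public funding component is indeed "around 50 per cent", so the report's headline claim is internally consistent.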
The report makes the following concrete suggestion for how that might be done in the case of the Ordnance Survey:
One potential candidate that merits attention is registration charges, i.e. applying a (statutory) charge every time the master data set needs to be updated due to building or land use changes. One estimate suggests this might level out at an average data charge of around £50 to £100 on each planning application made.
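A rough back-of-envelope check suggests why a per-application charge in that range is plausible. The annual application count below is my own illustrative assumption – the report does not give one – so treat this only as a sanity check of the orders of magnitude involved:

```python
# Hypothetical sanity check of the GBP 50-100 registration charge idea.
# ASSUMPTION: roughly 500,000 planning applications per year; this figure
# is illustrative and does not come from the report.
applications_per_year = 500_000
charge_low, charge_high = 50, 100    # report's suggested per-application range

revenue_low = applications_per_year * charge_low
revenue_high = applications_per_year * charge_high

print(revenue_low, revenue_high)
# 25000000 50000000 -> GBP 25-50m a year, comparable to the ~GBP 39m of
# commercial and consumer sales the Ordnance Survey would give up
```

If the true application volume is anywhere near that order of magnitude, the charge would broadly cover the revenue gap, which is presumably why the report singles it out.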
I think that's the best idea I've seen so far for making the core Ordnance Survey data freely available to all under a minimal licence while still bringing in revenue to offset the costs of producing that high-quality data. It's symptomatic of the insightful yet practical nature of the report, and I recommend that anyone interested in open data – which ought to be anyone in business – read it.