Why We Need Open Source: Three Cautionary Tales

Open Enterprise mostly writes about "obvious" applications of open source – situations where money can be saved, or control regained, by shifting from proprietary to open code. That battle is more or less won: free software is widely recognised as inherently superior in practically all situations, as its rapid uptake across many markets demonstrates. But there are also some circumstances where it may not be so obvious that open source is the solution, because it's not always clear what the problem is.

For example, in the field of economics, there is a well-known paper by Carmen Reinhart and Kenneth Rogoff entitled "Growth in a Time of Debt". The main result is that "median growth rates for countries with public debt over 90 percent of GDP are roughly one percent lower than otherwise; average (mean) growth rates are several percent lower." Needless to say, this has been seized upon and widely cited by those in favour of austerity.

However, as a blog post on the Roosevelt Institute from a few weeks back explained:

In a new paper, "Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff," Thomas Herndon, Michael Ash, and Robert Pollin of the University of Massachusetts, Amherst successfully replicate the results. After trying to replicate the Reinhart-Rogoff results and failing, they reached out to Reinhart and Rogoff and they were willing to share their data spreadsheet. This allowed Herndon et al. to see how Reinhart and Rogoff's data was constructed.

They find that three main issues stand out. First, Reinhart and Rogoff selectively exclude years of high debt and average growth. Second, they use a debatable method to weight the countries. Third, there also appears to be a coding error that excludes high-debt and average-growth countries. All three bias in favor of their result, and without them you don't get their controversial result.

In other words, once the underlying model and its data were available, its errors were soon discovered. That simply wasn't possible with just the results, which people essentially had to take on trust. Here, then, is a clear case where publishing the code – in this case an Excel spreadsheet – would have had a major impact on how things turned out.
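To get a feel for how much two of those issues can matter, here is a minimal Python sketch. The figures are invented purely for illustration and are not taken from the Reinhart-Rogoff spreadsheet; the country labels and function names are hypothetical too. It compares two defensible ways of averaging growth across high-debt countries, and then shows what happens when a range silently stops short of one country.

```python
from statistics import mean

# Hypothetical toy data: growth (% per year) for each high-debt year of four
# made-up countries. These numbers are NOT from Reinhart and Rogoff.
observations = {
    "A": [2.0, 2.5, 3.0, 2.5],  # four high-debt years
    "B": [-7.0],                # a single, very bad, high-debt year
    "C": [1.5, 2.0],
    "D": [3.0, 2.0, 2.5],
}


def mean_by_country(data):
    """Average each country's years first, then average the country means.

    Every country counts equally, however many years it contributes.
    """
    return mean(mean(years) for years in data.values())


def mean_by_country_year(data):
    """Pool all country-years and average once; every year counts equally."""
    return mean(g for years in data.values() for g in years)


def drop_countries(data, excluded):
    """Mimic a spreadsheet range that stops short and leaves out some rows."""
    return {c: years for c, years in data.items() if c not in excluded}


if __name__ == "__main__":
    print("equal weight per country:", mean_by_country(observations))
    print("equal weight per year:   ", mean_by_country_year(observations))
    truncated = drop_countries(observations, {"D"})
    print("per country, D omitted:  ", mean_by_country(truncated))
```

With these made-up numbers, the single bad year for country B drags the per-country average below zero, while the pooled average stays comfortably positive, and omitting country D pushes the per-country figure lower still. Nothing in a published headline number would reveal which choice had been made; the working would.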

Of course, that shouldn't really come as a surprise, since doing everything out in the open is precisely the scientific method: you have to give full details of your techniques and data so that others can check your working. Except that this rarely happens nowadays. That's not because scientists have suddenly turned evil, or because science itself is in decay, but for a tangential reason: much science now uses computers at some point in the analysis of results, and it's very rare for the underlying code to be released, even if the raw data is. That, of course, makes it practically impossible to check how the final results were obtained. As the relatively simple case of the Reinhart and Rogoff spreadsheet shows, that can hide major errors with huge knock-on effects – in this case, affecting economic policy around the world.

That means we need to re-invent science for the digital age, making it a requirement that any newly-written code used in the preparation of results must be published with the raw data used. If we don't, we risk moving into a period of increasingly unverifiable science, hardly a pleasant prospect.

But there's one more domain where the need for open source may not be apparent, and that is government. By that I don't mean that government needs to use free software – although it obviously does, not just for cost reasons, but in order to maintain its independence from vendors – but that the code it writes or has written for it to function must always be released.

I hadn't really thought about this aspect until I came across an interesting comment on Twitter that mentioned how some UK legislation was being turned into government actions using software I'd not come across before, Oracle Policy Automation Solution for Public Sector:

Oracle Policy Automation is a powerful platform to transform complex legislation, regulations, and policy documentation into executable software. It makes it easy for public-sector agencies to service citizens fairly, efficiently, and consistently while maintaining full compliance with laws and regulations. It also allows agencies to give real-time interactive advice about how policies apply to a citizen's or business' specific circumstance, automate very complex government determinations, and to update systems very quickly when laws and policies change.

Oracle Policy Automation software enables public-sector agencies to effectively manage policies by transforming legislation and policy documents into executable and maintainable business rules using the familiar format of Microsoft Word and Excel document formats. Agencies are able to deploy the rules to different service-delivery or processing channels without modification. This means the same rules support Web self service, call center, back office, and financials. The product includes a pre-built Web service for SOA deployments and a pre-built Web questionnaire application.

In fact, in my innocence, I had never even come across the idea of taking legislation and turning it into executable software. Although superficially that seems attractive – law is just a kind of code, so obviously we can just convert it into computer code, right? – in fact it raises some really important issues.

After all, we're talking about interpreting law, which is not always clear in its meaning, and turning it into actions through software. But how do we know that the software interpretation really corresponds to the legal intent? Indeed, how on earth can programmers – with all due respect – pretend to know what legislation actually "means"? There's only one group of people that can do that, and that's judges, whose job is to interpret legislation and define how it should operate in the real world.
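A small Python sketch may make the worry concrete. The rule below is invented for illustration; it is not drawn from any real statute, and it has nothing to do with how Oracle Policy Automation represents rules internally. The function names and the 183-day figure are my own assumptions. The point is only that a phrase as innocent as "six months" admits more than one reasonable coding.

```python
from datetime import date

# Hypothetical rule text, invented for this example:
#   "A claimant qualifies if they have been resident for at least six months."


def qualifies_calendar_months(arrived: date, claimed: date) -> bool:
    """Reading 1: six whole calendar months have elapsed since arrival."""
    months = (claimed.year - arrived.year) * 12 + (claimed.month - arrived.month)
    if claimed.day < arrived.day:
        months -= 1  # the most recent month is not yet complete
    return months >= 6


def qualifies_fixed_days(arrived: date, claimed: date) -> bool:
    """Reading 2: 'six months' taken to mean 183 days (an assumed convention)."""
    return (claimed - arrived).days >= 183


if __name__ == "__main__":
    arrived = date(2013, 1, 1)
    claimed = date(2013, 7, 1)
    print("calendar-month reading:", qualifies_calendar_months(arrived, claimed))
    print("fixed-day reading:     ", qualifies_fixed_days(arrived, claimed))
```

For someone arriving on 1 January 2013 and claiming on 1 July 2013, the first reading says yes and the second says no, because only 181 days have passed. Whichever reading a programmer happens to pick becomes, in effect, the law as applied, and with closed code nobody outside can see which one it was.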

So pretending that task can somehow be carried out by code – be it never so clever – is a recipe for disaster. And that recipe is a thousand times more poisonous when closed source code is used to do that, as it apparently is in the UK, because there is no possibility that anyone can check how the translation has been made.

This takes us back to the situation described above for the austerity paper that turned out to be fundamentally flawed once the inner workings were revealed, and to the growing problem with opaque science. If we really must try to cut corners by automating the process of turning legislation into executables, at the very least both the code produced and the application used to run that code must be open source. That would allow others expert in this field to examine both and check that no gross errors have been made. Even then, it is the courts that must have the final say, but at least operating in the open allows clarifications to be sought from them before egregious errors are made by the executables that purport to implement the law.

The last thing we would want is for people to suffer years of unnecessary misery caused by a coding error in an application that is then used blindly. Unfortunately, that seems to be precisely what has happened with the reckless imposition of austerity around the world, whose theoretical underpinning was little more than a screwed-up Excel spreadsheet....

Follow me @glynmoody on Twitter or identi.ca, and on Google+
