Open Source and Open Research Computation

Free software was inspired in part by the scientific method, but it is only now that science is starting to apply free software's key insights. For example, opening up the source code would imply that scientific papers should be made freely available for anyone to read and use. And yet it is only in the last few years that this open access approach, as it is called, has made significant headway against the prevailing proprietary system, which says that you have to pay – often handsomely – if you want to read a paper.

Beyond open access, which is only about a certain kind of opening up, lies true open science, which implies providing things like raw data alongside polished publications. The growing success of the open data movement – notably with the opening up of government data stores – has provided a handy impetus to this in the world of science.

But there is yet another area of scientific endeavour where openness is rare: ironically, the world of scientific software. You might think that using open source licences would be an obvious thing to do, since it would allow others both to examine and check your working and to build on it. But strangely, many scientists are reluctant to allow this kind of scrutiny or sharing.

Hoping to change that is a new journal called Open Research Computation. Its Editor-in-Chief is Cameron Neylon, probably the leading exponent of open science (and the author of a fine blog called "Science in the Open"). Here's what he wrote there about the new journal:

Computation lies at the heart of all modern research. Whether it is the massive scale of LHC data analysis or the use of Excel to graph a small data set. From the hundreds of thousands of web users that contribute to Galaxy Zoo to the solitary chemist reprocessing an NMR spectrum we rely absolutely on billions of lines of code that we never think to look at. Some of this code is in massive commercial applications used by hundreds of millions of people, well beyond the research community. Sometimes it is a few lines of shell script or Perl that will only ever be used by the one person who wrote it. At both extremes we rely on the code.

And as he notes, the journal is asking a lot of its potential contributors:

The submission criteria for ORC Software Articles are stringent. The source code must be available, on an appropriate public repository under an OSI compliant license. Running code, in the form of executables, or an instance of a service must be made available. Documentation of the code will be expected to a very high standard, consistent with best practice in the language and research domain, and it must cover all public methods and classes. Similarly code testing must be in place covering, by default, 100% of the code. Finally all the claims, use cases, and figures in the paper must have associated with them test data, with examples of both input data and the outputs expected.

That is, the code must not only be open source in the usual sense of the term, but must also come with documentation, tests and test data; ambitious...
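To give a flavour of what those criteria might look like in practice, here is a minimal Python sketch. It is purely illustrative and not taken from the journal: the names `mean`, `Spectrum`, `undocumented_public_names` and `TEST_CASES` are all hypothetical. It shows a documented function and class, a rough check that public names carry documentation, and test data that pairs inputs with expected outputs. It does not measure test coverage, which would need a separate tool.

```python
import inspect


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class Spectrum:
    """A hypothetical container for a list of measured intensities."""

    def __init__(self, intensities):
        self.intensities = list(intensities)

    def peak(self):
        """Return the largest intensity in the spectrum."""
        return max(self.intensities)


def has_doc(obj):
    """Return True if obj has a non-empty docstring of its own."""
    return bool((obj.__doc__ or "").strip())


def undocumented_public_names(namespace):
    """List public functions, classes and methods that lack a docstring."""
    missing = []
    for name, obj in namespace.items():
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        if not has_doc(obj):
            missing.append(name)
        if inspect.isclass(obj):
            # Only methods defined directly on the class are checked here.
            for attr, member in vars(obj).items():
                if attr.startswith("_") or not inspect.isfunction(member):
                    continue
                if not has_doc(member):
                    missing.append(f"{name}.{attr}")
    return sorted(missing)


# Test data: each case pairs an input with the output a paper would claim.
TEST_CASES = [
    ([1, 2, 3], 2.0),
    ([10], 10.0),
    ([0.5, 1.5], 1.0),
]


def run_test_cases():
    """Check every input in TEST_CASES against its expected output."""
    return all(mean(values) == expected for values, expected in TEST_CASES)
```

Calling `undocumented_public_names({"mean": mean, "Spectrum": Spectrum})` returns an empty list when everything public is documented, and `run_test_cases()` confirms that the sample inputs reproduce the stated outputs. A real submission would presumably rely on established documentation and coverage tools rather than a hand-rolled check like this one.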

Here's what the journal's FAQ says about its aims:

Open Research Computation publishes peer reviewed articles that describe the development, capacities, and uses of software designed for use by researchers in any field. Submissions relating to software for use in any area of research are welcome as are articles dealing with algorithms, useful code snippets, as well as large applications or web services, and libraries. Open Research Computation differs from other journals with a software focus in its requirement for the software source code to be made available under an Open Source Initiative compliant license, and in its assessment of the quality of documentation and testing of the software. In addition to articles describing software Open Research Computation also welcomes submissions that review or describe developments relating to software based tools for research. These include, but are not limited to, reviews or proposals for standards, discussion of best practice in research software development, educational and support resources and tools for researchers that develop or use software based tools.

Given its background, and the people behind it, it will come as no surprise to learn that Open Research Computation is open access:

All articles published in Open Research Computation are open access, which means they are freely and universally accessible online, and permanently archived in an internationally recognised open access repository.

If open access is new to you, you might be interested in the following explanation of how Open Research Computation covers the costs of putting together a journal. For just as free software companies need business models, so do open access journals – even purely digital journals cost money to put together:

As the cost of peer reviewing, editing, publishing, maintaining and archiving articles is not recouped through subscription charges, a standard article-processing charge (APC) is levied on all articles that are accepted for publication. The APC is a flat charge, and no additional costs are incurred, for example, by the inclusion of color figures. The current APC for Open Research Computation is £995/$1540/€1155.

In other words, you or your institution pays for you to publish, which means that nobody pays to read. And if you're wondering what happens if you are at an institution in a developing country:

In cases where neither the authors nor their institution or funder are able to pay the APC, a discount or waiver may be granted. There is currently an automatic waiver for authors from low or lower-middle income countries (according to World Bank criteria).

Given the role played by science in the original formulation of free software's key ideas, it's good to see science coming home in this way. Let's hope the journal flourishes and its message about opening up scientific software spreads.

Follow me @glynmoody on Twitter or identi.ca.
