National institutions urgently need to simplify the file formats they use for storing documents, or they will put heritage at risk, the coordinator of a European Union document archiving project has warned.
Adam Farquhar, head of e-architecture at the British Library and director at the Planets project, told ComputerworldUK that rapid changes in IT were "putting our digital heritage at risk" as certain legacy technologies become obsolete and no longer searchable or accessible.
The diversity of digital material, as well as frequent changes in computer technology, presents challenges to archives and libraries as they attempt to provide long-term access to information.
The British Library is a member of the Planets project, which has been set up to address the issue of digital preservation. The project consortium consists of a number of European national libraries, archives, universities and technology manufacturers such as IBM and Microsoft.
Planets estimates that EU member countries produce around five billion documents per year. Of these, around two million documents are held in formats that constitute a long-term preservation risk, according to the Planets consortium.
The British Library is legally responsible for keeping archives of all literary and factual material published in the UK. To date, the library holds over 13 million books, 920,000 journal and newspaper titles, 57 million patents, and three million sound recordings.
The British Library has been an early adopter of digital archiving strategies, but it has faced a number of challenges in dealing with its array of digital documents. “There is a vast amount of digital material dating back to the 1980s in a variety of formats, and we have to ensure it can be found and read,” Farquhar explained.
The British Library’s document archiving efforts have "set a benchmark" that is being followed by a number of other European institutions, he said. One such institution is the National Archives, which is working with Microsoft on a project that will attempt to translate files in obsolete formats into versions that can be read by current and future technology.