Storage acrobatics with PDFs

High-volume storage of PDF documents is a good target for compression by software such as DocuLynx's Mercury.


Adobe PDF documents are used in their millions every day. PDF is the de facto standard for displaying finished-form documents on screen, because a PDF can be viewed on any computer system and not just by the originating application on the operating system and machine type that created it. That popularity means PDFs are stored on disk in huge quantities too. DocuLynx has a software product, Mercury, which can compress PDF files and store them in a greatly reduced space.

DocuLynx is a VAR (value-added reseller) of Mercury Software and made a brief visit to Storage Expo. The Mercury product can parse PDF files. It takes individual PDFs and puts them into an archive, using the JBIG2 scanned-image compression algorithm to do so. The PDF information is kept intact but compressed. It's easy to see the scope for PDF file compression: a significant proportion of a PDF document can be white space or areas of flat colour, for example.
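To illustrate why white space compresses so well, here is a minimal sketch in Python. It does not use JBIG2 and says nothing about how Mercury works internally; it substitutes the general-purpose zlib compressor from the standard library, and the page size, the 5% "ink" fraction and the make_page helper are all assumptions made up for the illustration.

```python
import random
import zlib

# Assumed page geometry: roughly A4 at 300 dpi, 1 bit per pixel.
WIDTH_PX, HEIGHT_PX = 2480, 3508
ROW_BYTES = WIDTH_PX // 8


def make_page(ink_fraction=0.05, seed=0):
    """Build a synthetic 1-bit page: mostly blank, with a band of noisy 'ink' rows.

    Hypothetical helper for this sketch only; real scanned pages look different.
    """
    rng = random.Random(seed)
    page = bytearray(ROW_BYTES * HEIGHT_PX)  # all-zero bits stand for white space
    ink_rows = int(HEIGHT_PX * ink_fraction)
    for row in range(ink_rows):
        start = row * ROW_BYTES
        # Random bytes are a worst case: they barely compress at all.
        page[start:start + ROW_BYTES] = bytes(
            rng.getrandbits(8) for _ in range(ROW_BYTES)
        )
    return bytes(page)


raw = make_page()
packed = zlib.compress(raw, 9)  # zlib is a stand-in here, not JBIG2
ratio = len(raw) / len(packed)

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, ratio {ratio:.1f}:1")
```

Even with a general-purpose compressor, the blank part of the page shrinks to almost nothing, and the compressed size is dominated by the small noisy band. A compressor designed for scanned images would be expected to do better on real text.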

Multiple PDF documents can be archived together, with only one instance of any repeated PDF stored. Storing thousands of PDF documents with Mercury is far more efficient than storing them in native PDF format: you might achieve compression ratios of 2:1 or greater. More than half the information in a PDF document can be duplicated. In principle that duplicated information is redundant, and it can be replaced by pointers to its first instance in the document.
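The idea of keeping one stored instance and pointing every repeat at it can be sketched as follows. This is a generic content-hash deduplication scheme in Python, offered as an assumption about how such an archive could work and not as a description of Mercury; the archive and fetch functions and the sample file names are hypothetical.

```python
import hashlib


def archive(documents):
    """Store each distinct document once and map every name to its stored copy.

    documents: iterable of (name, data) pairs, where data is bytes.
    Returns (store, index): store maps digest -> bytes,
    index maps name -> digest (the 'pointer' to the first instance).
    """
    store = {}
    index = {}
    for name, data in documents:
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first instance
        index[name] = digest
    return store, index


def fetch(store, index, name):
    """Follow the pointer for a name back to the stored bytes."""
    return store[index[name]]


docs = [
    ("jan.pdf", b"%PDF-1.4 statement A"),
    ("feb.pdf", b"%PDF-1.4 statement B"),
    ("jan_copy.pdf", b"%PDF-1.4 statement A"),  # exact repeat of jan.pdf
]
store, index = archive(docs)

print(f"{len(index)} names, {len(store)} stored copies")
```

Here three names resolve to two stored copies, and fetching jan_copy.pdf returns the same bytes as jan.pdf. The same approach could be applied below the document level, to repeated objects such as logos or fonts inside PDFs, although that would need a real PDF parser.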

Mercury's document throughput is 250 PDF pages per second, so it is a high-volume product.
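To put the 250 pages per second figure in perspective, the short Python calculation below converts it into processing time for a backlog and into pages per day. The one-million-page backlog is an invented example, and the sums assume the quoted rate is sustained continuously, which real workloads may not manage.

```python
PAGES_PER_SECOND = 250  # throughput figure quoted above
SECONDS_PER_DAY = 60 * 60 * 24


def processing_time_seconds(total_pages, pages_per_second=PAGES_PER_SECOND):
    """Time to process a backlog at a constant, sustained rate."""
    return total_pages / pages_per_second


def pages_per_day(pages_per_second=PAGES_PER_SECOND):
    """Pages processed in 24 hours at a constant, sustained rate."""
    return pages_per_second * SECONDS_PER_DAY


backlog = 1_000_000  # hypothetical backlog size, chosen for illustration
seconds = processing_time_seconds(backlog)

print(f"{backlog} pages take {seconds:.0f} seconds ({seconds / 60:.1f} minutes)")
print(f"{pages_per_day()} pages per day at a sustained rate")
```

On these assumptions a million-page backlog takes 4,000 seconds, a little over an hour.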

If your organisation creates and handles thousands of PDFs, for example as financial statements of some kind, then Mercury might well be worth looking at.

DocuLynx is a long-established and successful US company with a background in document scanning and storage. It has recently expanded into Europe and has two local partners: CD Team of Henley-on-Thames and Desktop Systems of Ireland. Give them a call if your PDF storage needs are making you perform acrobatics.

