De-duplication reduces data by 20 times or more

Performance overhead can be a major downside with this technology.


As 2007 approaches, many storage managers are working on next year's budget, and one item sure to appear on many shopping lists is a new virtual tape library (VTL). With VTLs rapidly maturing, buyers should take a serious look at one newer feature: data de-duplication.

Data de-duplication significantly increases the amount of data that a VTL can store. Unlike data compression, which stores the same amount of data in a smaller space, data de-duplication identifies identical blocks of data across different backup streams and stores only one copy of each.

VTL vendors that support data de-duplication report that data reductions of 20:1 or greater are possible. While not everyone will see results like this, de-duplication starts to give VTLs the kind of capacity normally found only in tape libraries.

Yet performance overhead is a major downside associated with this technology. Before storing a new block, data de-duplication analyzes each block of data in the backup job to determine whether it matches a block already stored. Performing this check during backups can slow them to the point where they run as slowly as tape backups.
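The block-matching step described above can be illustrated with a toy sketch. It assumes fixed-size blocks and SHA-256 fingerprints purely for illustration; the article does not say how any vendor's product chunks or compares data, and the names `DedupStore`, `BLOCK_SIZE`, and `write_stream` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for this sketch


class DedupStore:
    """Toy in-line de-duplicating store: one copy of each unique block."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes

    def write_stream(self, data: bytes) -> list:
        """Store a backup stream and return its list of block fingerprints."""
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            # This lookup runs for every block during the backup itself,
            # which is where the in-line overhead comes from.
            if fp not in self.blocks:
                self.blocks[fp] = block
            recipe.append(fp)
        return recipe

    def read_stream(self, recipe: list) -> bytes:
        """Rebuild a stream from its fingerprints."""
        return b"".join(self.blocks[fp] for fp in recipe)

    def stored_bytes(self) -> int:
        """Bytes actually held after de-duplication."""
        return sum(len(b) for b in self.blocks.values())
```

In this sketch, two backup streams that share a block cost only one stored copy of it, while each stream can still be rebuilt in full from its list of fingerprints.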

To address this, some vendors offer a post-processing option. In this mode, data is backed up in its native format, and the VTL de-duplicates it only after the backup is complete. Though processing the data post-backup increases the VTL's disk capacity requirements, the performance overhead is moved to off-peak hours.
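The trade-off in post-processing mode can be sketched in the same toy fashion. This is an illustration under stated assumptions, not a description of any vendor's design: the class `PostProcessVTL`, its methods, the fixed 4 KB block size, and the use of SHA-256 are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for this sketch


class PostProcessVTL:
    """Toy model: stage raw backups first, de-duplicate them later."""

    def __init__(self):
        self.staging = {}  # job name -> raw, not yet de-duplicated bytes
        self.blocks = {}   # fingerprint -> unique block bytes
        self.recipes = {}  # job name -> list of block fingerprints

    def backup(self, job: str, data: bytes) -> None:
        """Backup window: write data as-is, with no fingerprint lookups."""
        self.staging[job] = data

    def dedupe_staged(self) -> None:
        """Off-peak pass: fold staged jobs into the block store, free staging."""
        for job in list(self.staging):
            data = self.staging.pop(job)
            recipe = []
            for i in range(0, len(data), BLOCK_SIZE):
                block = data[i:i + BLOCK_SIZE]
                fp = hashlib.sha256(block).hexdigest()
                self.blocks.setdefault(fp, block)  # keep first copy only
                recipe.append(fp)
            self.recipes[job] = recipe

    def disk_used(self) -> int:
        """Total bytes held in staging plus the de-duplicated block store."""
        staged = sum(len(d) for d in self.staging.values())
        stored = sum(len(b) for b in self.blocks.values())
        return staged + stored
```

Calling `disk_used()` before and after `dedupe_staged()` shows the extra capacity the staging area consumes until the off-peak pass has run.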

Many storage managers are anxious to deploy VTLs, and data de-duplication technology is critical if one hopes to eventually replace tape with disk. But given the overhead that data de-duplication introduces, managers should first verify that the VTL they want offers the options they need, so their planned 2007 purchase does not turn into a pumpkin.

Jerome Wendt is the president and lead analyst with DCIG Inc.