The NetApp Storage Efficiency Guide
6 SENSIBLE DISK-TO-DISK DATA BACKUPS
Data backups are a fact of life in enterprise data centers. Like that trip to the dentist, the experience can
sometimes be agonizing, but the alternatives are far worse. The longer you postpone your trip to the dentist
(or attending to your data backups), the more likely it is that a much more painful event will eventually occur.
NetApp helps take the pain out of data backups with SnapVault software.
Storage efficiency again comes into play with SnapVault. First, a complete copy of the primary data is stored
on the SnapVault backup system. This initial, or baseline, transfer is much like a level-zero backup to tape.
Each subsequent backup, however, transfers only the data blocks that have changed since the previous
backup. NetApp uses a variation of Snapshot technology to create these transfers, so only the new data
blocks are sent to the SnapVault system. This means that each subsequent backup copy consumes only an
amount of disk space proportional to the differences between it and the previous backup copy. To the user,
however, each backup session is virtualized to appear as though it were a complete, level-zero backup
copy, greatly simplifying the process of data restoration.
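The idea of transferring only changed blocks while still presenting each backup as a complete copy can be illustrated with a toy model. The sketch below is an illustration only and is not how SnapVault is implemented: the names BackupStore, changed_blocks, backup, and restore are hypothetical, a volume is modeled as a Python dictionary of block number to contents, and deleted blocks are ignored for simplicity.

```python
# Toy model of block-level incremental backup (illustrative assumption only).
# A "volume" is a dict mapping block number -> block contents.

def changed_blocks(previous, current):
    """Return only the blocks whose contents differ from the previous image."""
    return {n: data for n, data in current.items() if previous.get(n) != data}


class BackupStore:
    """Hypothetical backup target that stores one set of changed blocks per backup."""

    def __init__(self):
        self.deltas = []  # deltas[i] holds the blocks transferred by backup i

    def backup(self, volume):
        """Store only the blocks changed since the last backup; return how many were sent."""
        previous = self.restore(len(self.deltas) - 1) if self.deltas else {}
        delta = changed_blocks(previous, volume)
        self.deltas.append(delta)
        return len(delta)

    def restore(self, index):
        """Rebuild the complete image as of backup `index` by layering the deltas."""
        image = {}
        for delta in self.deltas[: index + 1]:
            image.update(delta)
        return image


store = BackupStore()
primary = {0: "a", 1: "b", 2: "c", 3: "d"}
sent_baseline = store.backup(primary)      # baseline: every block is transferred
primary[2] = "C"                           # one block changes on the primary
sent_incremental = store.backup(primary)   # incremental: only the changed block
```

In this model the baseline transfers four blocks and the incremental transfers one, yet restore(0) and restore(1) each return a full four-block image, which mirrors the "appears as a complete backup" behavior described above.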
For example, if SnapVault backed up a 100GB primary data set for the first time, it would consume 100GB
of disk space on the SnapVault system. Over the course of several hours, suppose that users change 10GB
of data on the primary file system. When the next SnapVault backup occurs, SnapVault writes the 10GB of
changes to the SnapVault system and creates a new Snapshot copy. At this point, the SnapVault system
contains two Snapshot copies: one contains an image of the file system as it appeared when the baseline
backup occurred and the other contains an image of the file system as it appeared when the incremental
backup occurred. The copies consume a combined total of 110GB of space on the SnapVault system, but
they are the equivalent of two complete 100GB backup copies. System administrators can refer to any of the
backup instances to retrieve their files and can easily store dozens of backup images in a reduced amount
of space.
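The arithmetic in this example generalizes to any number of incremental backups. The following is a minimal sketch, assuming that the primary data set stays the same size (changes overwrite existing data) and that each incremental consumes exactly the amount of changed data; the function names are hypothetical.

```python
def incremental_space_gb(baseline_gb, changes_gb):
    """Space used by one baseline plus a series of changed-block backups."""
    return baseline_gb + sum(changes_gb)


def full_copy_space_gb(baseline_gb, changes_gb):
    """Space used if every backup were stored as a complete copy instead."""
    return baseline_gb * (1 + len(changes_gb))


# The example from the text: a 100GB baseline followed by 10GB of changes.
used = incremental_space_gb(100, [10])       # 110
equivalent = full_copy_space_gb(100, [10])   # 200
```

With these assumptions, the two backup images occupy 110GB while representing the equivalent of 200GB of full copies.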
As with SnapMirror, data deduplication can be combined with SnapVault for additional savings. The result is a
dramatic reduction in the physical storage requirement for disk-to-disk backups. Benefits include fast and
easy restoration of files from disk and the option to retain backups on disk for longer periods of time because
each subsequent backup requires very little disk space.
USER CASE STUDY 2: NETAPP DEDUPLICATION AND FLEXVOL
A major multimedia company is also a long-time NetApp customer. Among other applications, this company
has three SQL servers with a total capacity of 2TB. These databases are considered essential to operations
because they contain customer billing information. From the main data center, all three SQL databases are
backed up nightly to a FAS270 in a second location. From that location, the three databases are again
backed up to a FAS3050 in a third location for disaster recovery and archiving.
This company wanted to eliminate all tape backups and instead use NetApp for disk-to-disk backup and
disaster recovery. Because of the large database size and the requirement to keep 16 backup copies online
at all times, they were also interested in reducing disk space requirements.
Proof-of-concept testing with NetApp deduplication validated that 40% to 50% volume space savings would
occur consistently when deduplication was performed after the second nightly backup. Once the concept
was proven, an automated script was developed. All database backups are saved to FAS3050 volumes in
pairs. After the second nightly database backup, deduplication is run on the volume and a check is made to
determine the new (reduced) volume space required. The volume is then resized automatically using
FlexVol. This process continues until 8 volumes are created, with a total of 16 database copies. After that
point, on subsequent backups, the first volume is deleted and a 17th volume is created, and so on.
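The rotation described above can be modeled in a few lines. This is a sketch of the scheduling logic only, not the customer's script: it issues no storage commands, the function name run_nightly_backups is hypothetical, and the 2TB copy size and flat 40% deduplication savings are assumptions taken from the figures in this case study. It also assumes that the oldest volume is deleted at the moment a new volume is needed.

```python
from collections import deque

MAX_VOLUMES = 8        # volumes kept online, per the case study
COPIES_PER_VOLUME = 2  # backups are stored in pairs


def run_nightly_backups(nights, copy_size_tb=2.0, dedup_savings=0.4):
    """Simulate the pair-per-volume rotation and return the volumes left online."""
    volumes = deque()
    current = None  # the volume still waiting for its second copy
    for night in range(1, nights + 1):
        if current is None:
            if len(volumes) == MAX_VOLUMES:
                volumes.popleft()  # delete the oldest volume to make room
            current = {"copies": [], "size_tb": 0.0}
            volumes.append(current)
        current["copies"].append(night)
        current["size_tb"] += copy_size_tb
        if len(current["copies"]) == COPIES_PER_VOLUME:
            # After the second copy: deduplicate, then shrink the volume
            # to the reduced size (modeled here as a flat percentage).
            current["size_tb"] *= 1 - dedup_savings
            current = None
    return list(volumes)


after_16 = run_nightly_backups(16)
total_tb = sum(v["size_tb"] for v in after_16)
after_17 = run_nightly_backups(17)
```

Under these assumptions, 16 nights produce 8 volumes holding 16 copies in roughly 19.2TB, compared with 32TB for 16 undeduplicated 2TB copies. On night 17 the model drops the volume holding copies 1 and 2 and starts a new one.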
The results of this implementation were a completely automated database backup process and a 40%
reduction in disk requirements, from 32TB to 19TB.