Exclusive: Data Domain DD460 Restorer puts the squeeze on data
High-compression disk-to-disk backup system could make tape vendors sweat
With the advent of inexpensive disk-to-disk backup systems that offer faster, easier, and more reliable backups and restores than most tape systems, many administrators would like to abandon tape altogether. However, a standard schedule of one full backup per week plus nightly incremental backups uses up a lot of storage space, the kind of space that only tape traditionally offers at a reasonable per-gigabyte cost.
Data Domain aims to solve this with the DD460 Restorer. The appliance appears on the network as a standard NAS device. When a backup application writes data to the DD460, the appliance compares the incoming data with data already saved to disk. When it finds a duplicate, the DD460 stores a pointer to the original piece of data rather than saving the same data yet again.
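Data Domain doesn't publish its algorithm, but the general pointer-instead-of-copy idea can be sketched in a few lines. This is a minimal illustration only, assuming hypothetical fixed 4KB blocks and SHA-256 fingerprints rather than the appliance's proprietary segmenting:

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size; the DD460's real segmenting is proprietary

def dedup_store(data: bytes, store: dict) -> list:
    """Split data into blocks; save each unique block once, keyed by its
    SHA-256 fingerprint, and return a list of pointers (fingerprints)."""
    pointers = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        if fp not in store:      # new block: write it to disk
            store[fp] = block
        pointers.append(fp)      # duplicate block: record only a pointer
    return pointers

# A "second backup" of nearly unchanged data adds almost no new blocks.
store = {}
first = bytes(1_000_000)             # 1MB of zeros, like the all-zeros test file
dedup_store(first, store)
second = first[:-1] + b"\x01"        # the same data with a one-byte change
dedup_store(second, store)
print(len(store))                    # → 3 unique blocks for two full "backups"
```

Because the second pass stores only the blocks that actually changed, a full backup of mostly unchanged data costs little more disk than an incremental one, which is the behavior behind the very high second-backup ratios reported below.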
I found that this approach yielded a compression ratio of as much as 455-to-1 when performing a backup of data that had changed only slightly since the original backup, which means the 4TB appliance could realistically store 85TB worth of backup data. This high level of compression means that administrators could perform full backups every night without needing much additional storage space.
The DD460 provides two levels of data compression. The first, called global compression, uses proprietary Data Domain technology to find identical strings of data and eliminate the duplicates; it yields by far the higher ratios, even on an initial backup. The second level, local compression, is a conventional pass that generally provides about a 2-to-1 ratio. The amount of compression achieved will depend on the type of data you're backing up.
In my tests at the Data Domain labs, we backed up several types of data, both from a Linux system, using tar, and from a Windows system, using Veritas Backup Exec. The data included a 9.4GB set of Oracle database files, a mix of standard files that would be typical of a file server, and a large file that was all zeros.
The 9.4GB of Oracle files became 4.6GB after local compression, and then 216MB on disk after global compression. When I backed up the same 9.4GB of files a second time, they used up an additional 20MB of disk space. All this compression occurred in real time while files were being backed up at 70MBps.
The set of mixed files was initially 3.2GB, which became 2.5GB after global compression and 1.3GB after local compression. After some of the files in this group were changed, a second backup used an additional 11MB of disk space.
The large file of all zeros produced the most dramatic compression ratio, although users would see a ratio this high only when a file contains large blocks of empty space. The file of all zeros was originally 10GB and occupied about 1MB on disk, for a total compression ratio of 10,334.5-to-1.
The average compression ratio across all data in my tests was 8.7-to-1 on a first backup; on the second backup of the same data, after changes had been made to some files, the average was greater than 400-to-1. The DD460 achieves these levels of compression without undue processing time: Backups from a Linux system to the DD460 ran at 65MBps, and a Windows system backed up at 45MBps.
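A quick back-of-envelope check puts the figures above in perspective, assuming binary units (1GB = 1024MB); the published ratios differ slightly because the on-disk sizes quoted here are rounded:

```python
GB = 1024  # MB per GB, assuming binary units

oracle_first = 9.4 * GB / 216   # first Oracle backup: 9.4GB down to 216MB
oracle_second = 9.4 * GB / 20   # second Oracle backup: only 20MB of new data
zeros = 10 * GB / 1             # 10GB file of zeros stored in about 1MB

print(round(oracle_first, 1), round(oracle_second, 1), round(zeros, 1))
# → 44.6 481.3 10240.0
```

Even the roughly 45-to-1 first-pass ratio on the Oracle files is an order of magnitude beyond what conventional tape-drive hardware compression delivers, and the repeat-backup ratios are higher still.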
The system is very well engineered physically, with clean airflow and a nicely designed set of rack-mount rails that should work on pretty much any manufacturer’s rack.