InfoWorld review: Data deduplication appliances
Data deduplication appliances from FalconStor, NetApp, and Spectra Logic provide excellent data reduction for production storage, disk-based backups, and virtual tape
During my tests, I stored multiple daily backup sets to a NetApp CIFS volume from each server. Regardless of how or when the deduplication engine analyzed the stored backup files, I never got better than about 8 percent data reduction on the volume. Exchange message stores fared better, showing on average a reduction of 12 percent in disk usage.
I asked NetApp for a possible reason for this and was told the deduplication engine works on 4KB blocks. It seems the Backup Exec family of software inserts metadata into the backup files, which shifts the data off the 4KB boundaries and makes it much harder for NetApp to locate duplicate blocks. Symantec has made a change in its Enterprise Vault 8.0 to block-align with the NetApp engine, so not all Symantec products suffer from this misalignment. Backup software from some other vendors, including CommVault and VMware, keeps the 4KB block boundaries intact.
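To see why alignment matters so much to a fixed-block engine, here is a minimal Python sketch. It is not NetApp's implementation: the SHA-256 fingerprints, the 30-byte header, and the function names are all assumptions made for illustration. It stores the same 1MB payload twice, once byte-for-byte identical and once behind a small metadata header, and compares the share of duplicate 4KB blocks.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # 4KB fixed blocks, the granularity described above


def block_hashes(data, block_size=BLOCK_SIZE):
    """Fingerprint each fixed-size block (SHA-256 is an illustrative choice)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def savings(stored_files):
    """Fraction of all blocks that duplicate a block already seen."""
    hashes = [h for f in stored_files for h in block_hashes(f)]
    return 1 - len(set(hashes)) / len(hashes)


# 256 random blocks stand in for the contents of one backup set.
payload = os.urandom(BLOCK_SIZE * 256)

# Second copy, stored unchanged: every block lines up with the first copy.
aligned_copy = payload

# Second copy behind a hypothetical 30-byte metadata header: every block
# boundary now falls 30 bytes later in the payload than before.
shifted_copy = b"HDR" * 10 + payload

aligned_savings = savings([payload, aligned_copy])
shifted_savings = savings([payload, shifted_copy])

print(f"aligned copy: {aligned_savings:.0%} of blocks are duplicates")
print(f"shifted copy: {shifted_savings:.0%} of blocks are duplicates")
```

In this toy case the aligned copy dedupes completely, while the shifted copy shares no blocks at all with the original even though all but 30 of its bytes are identical. Real backup files land somewhere in between, which is consistent with the low single-digit and low double-digit reductions I measured.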
Admins can define a deduplication policy on a per-volume basis. The deduplication policy engine doesn't provide an overwhelming number of options, but it gets the job done. IT can define a policy to run dedupes manually on demand, or automatically when a specific amount of new data lands on the volume, or based on time of day or day of week. I was able to create a daily dedupe policy for a volume that started at 9 a.m. and stopped at 10 p.m. and ran every hour. Apart from the most extreme cases, this is overkill, but it is available if needed and it worked flawlessly.
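The three trigger types are easy to model. The Python sketch below is a hypothetical illustration of the decision logic only; the class, field names, and the inclusive 10 p.m. stop are my assumptions, not NetApp's API or its scheduling syntax.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DedupePolicy:
    """Hypothetical per-volume policy; every field name is an assumption."""
    new_data_threshold_bytes: Optional[int] = None  # run once this much new data lands
    start_hour: Optional[int] = None                # first hourly run (24-hour clock)
    stop_hour: Optional[int] = None                 # last hourly run, inclusive
    days: frozenset = frozenset(range(7))           # Monday=0 ... Sunday=6

    def should_run(self, now, new_bytes, manual=False):
        # Manual, on-demand runs always go ahead.
        if manual:
            return True
        # Threshold trigger: enough new data has been written to the volume.
        if (self.new_data_threshold_bytes is not None
                and new_bytes >= self.new_data_threshold_bytes):
            return True
        # Schedule trigger: top of the hour, inside the window, on an allowed day.
        if self.start_hour is not None and self.stop_hour is not None:
            return (now.weekday() in self.days
                    and now.minute == 0
                    and self.start_hour <= now.hour <= self.stop_hour)
        return False


# The hourly 9 a.m. to 10 p.m. policy described above.
hourly = DedupePolicy(start_hour=9, stop_hour=22)
print(hourly.should_run(datetime(2010, 1, 4, 9, 0), new_bytes=0))   # inside the window
print(hourly.should_run(datetime(2010, 1, 4, 23, 0), new_bytes=0))  # after the window
```

A scheduler would call should_run once a minute for each volume; for most shops, a single nightly window or a data threshold is probably enough.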
IT has two options for managing the FAS2040: a Web browser or the stand-alone management console, NetApp System Manager. While the browser-based management portal was straightforward, I found System Manager much more user-friendly and intuitive, even more so than FalconStor's UI. Both storage controllers were represented in the management utility, with each major function broken into separate grouped tasks, making it very easy to locate specific items.
As with FalconStor and Spectra Logic, there isn't a fancy reporting engine. There are, however, useful graphs and data points, such as volume details and space saved, scattered throughout System Manager. NetApp did a good job of organizing System Manager so that the amount of information presented in it is applicable and useful, without going overboard and inundating you with too much data.
I was really impressed with the FAS2040 from NetApp, both in terms of hardware options and manageability. I found the appliance very easy to integrate into my network and very easy to use. Deduplication was easy to manage, and files and folders that typically reside on a file system deduped with great success. My only complaint is with the poor results when deduping Backup Exec backup sets. Of course, no matter which deduplication solution you choose, you'll want to make sure that it works with your backup software.
On this particular iSCSI volume, I was able to achieve 92 percent disk space savings due to the highly redundant nature of the file data.