Riverbed Whitewater review: Data deduplication for cloud storage

Strong data deduplication and built-in support for key cloud storage providers reduce online storage needs while maximizing data protection

Cloud storage seems like such a no-brainer for backups and disaster recovery, it's a wonder that more businesses aren't taking advantage of it. If you're concerned about cloud outages, cloud storage costs, data loss, data security, or the ability to push your nightly backup sets up the Internet straw, Riverbed Technology's Whitewater appliance may make cloud storage easier to embrace.

The Whitewater cloud storage gateway combines data deduplication, local NAS services (supporting both CIFS and NFS), data encryption, and integration with leading cloud storage services in a single virtual or physical appliance. A disk-based target for your existing backup software, Whitewater receives the backup, dedupes and encrypts the data, and transmits only the new bits to your cloud service.
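The "transmits only the new bits" step can be sketched in a few lines of Python. This is a minimal illustration, not Riverbed's implementation: the fixed 4KB chunk size, the SHA-256 fingerprints, the `DedupeGateway` class name, and the in-memory dictionary standing in for the cloud service are all assumptions, and the encryption step is left out entirely.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only


class DedupeGateway:
    """Toy backup target that stores each unique chunk once (hypothetical)."""

    def __init__(self):
        self.cloud = {}          # fingerprint -> chunk bytes; stands in for the cloud service
        self.bytes_received = 0  # everything the backup software wrote
        self.bytes_uploaded = 0  # only the chunks that were new

    def write(self, data: bytes) -> list:
        """Accept a backup stream and return its recipe (ordered fingerprints)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.bytes_received += len(chunk)
            if fingerprint not in self.cloud:
                # A chunk never seen before is the only thing sent upstream.
                self.cloud[fingerprint] = chunk
                self.bytes_uploaded += len(chunk)
            recipe.append(fingerprint)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild a backup from its recipe."""
        return b"".join(self.cloud[fingerprint] for fingerprint in recipe)


gateway = DedupeGateway()
monday = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"C" * CHUNK_SIZE
tuesday = monday + b"D" * CHUNK_SIZE   # same data plus one new chunk

gateway.write(monday)
tuesday_recipe = gateway.write(tuesday)
```

After the second write, the gateway has received seven chunks but uploaded only four, and `gateway.read(tuesday_recipe)` still returns Tuesday's backup in full.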

The deduplication reduces your bandwidth requirements, speeds up the data transfers, and keeps your cloud storage needs (and cloud storage fees) to a minimum. A local cache in the Whitewater appliance allows you to restore from the most recent backups without making a trip to the cloud. And because Whitewater encrypts the data using 256-bit AES, it's safe from prying eyes.

The Whitewater appliances scale from a virtual appliance capable of ingesting up to 250GB of data per hour to a 3U unit (the 2010 model I tested) capable of handling up to 1TB per hour. The three physical appliances use RAID-6 for local fault tolerance, with local raw disk capacities available from 3.5TB to 11TB. None of the appliances limits the amount of data it can store in the cloud.

Mission: Off-site backup
Over the past few weeks, I put the Whitewater 2010 appliance through its paces with a variety of file types and usage scenarios, and I found it to be an excellent tool for reducing data size and speeding up replication to cloud storage providers. The deduplication engine works very well, and setting up the cloud connections is a snap. Reporting isn't extensive, but it doesn't really need to be. Overall, Whitewater is a very effective tool for saving time and money when storing data or disaster recovery sets off-site with a cloud provider.

The Whitewater appliance allows for both CIFS and NFS shares; my testing focused on CIFS traffic, but NFS storage is done the same way. Admins can carve Whitewater into multiple shares, each with its own folder, share name, and user access rights. While admins won't enjoy the level of access control found in Active Directory, they can define a share as read only, require authentication, and even specify which subnets have access to the share. Currently, iSCSI support is not part of the feature set.

One of my test scenarios was as a simple destination for daily backups. Using Symantec's Backup Exec 2010 R2 installed on a Windows Server 2003 Standard server, I created a nightly backup set composed of line-of-business data, Exchange message stores, SharePoint SQL data, and private user data with a share on the Whitewater 2010 as my backup destination. Backup rates for my data set averaged 500Mbps (over a single gigabit link) even while Whitewater was encrypting the data with AES-256. Deduplication and replication to my cloud storage provider took place immediately, as data was written to local disk.

During my time with Whitewater, I tried to fool the deduplication engine by renaming folders, intentionally copying folders multiple times, and moving folders around inside the server's file system. Regardless of which tricks I tried, the deduplication engine was not fooled. The duplicate data was tracked as file system objects, and no additional storage was used either on the Whitewater appliance or in the cloud. By the end of my testing, I was seeing a 10-fold improvement in storage efficiency -- that is, 90 percent less data to send and store in the cloud. 
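Why renaming, copying, and moving folders adds nothing to the stored total is easiest to see with content-based fingerprints. The sketch below is only an illustration under stated assumptions: it hashes whole files with SHA-256 and ignores paths entirely, whereas Whitewater's actual method is not documented here. The function names and the sample file tree are hypothetical.

```python
import hashlib


def logical_bytes(files: dict) -> int:
    """Size of everything the backup software sees, duplicates included."""
    return sum(len(content) for content in files.values())


def unique_bytes(files: dict) -> int:
    """Size of the distinct contents only; paths and names are ignored."""
    sizes = {}
    for content in files.values():
        sizes[hashlib.sha256(content).digest()] = len(content)
    return sum(sizes.values())


original = {
    "finance/q1.xls": b"a" * 1000,
    "finance/q2.xls": b"b" * 500,
}

# The same data after a folder rename, a move, and two extra copies.
shuffled = {
    "archive/finance-old/q1.xls": original["finance/q1.xls"],
    "archive/finance-old/q2.xls": original["finance/q2.xls"],
    "copy1/q1.xls": original["finance/q1.xls"],
    "copy2/q1.xls": original["finance/q1.xls"],
}
```

Here `logical_bytes(shuffled)` grows to 3,500 bytes while `unique_bytes(shuffled)` stays at the original 1,500. The same arithmetic explains the headline figure: a 10-fold ratio means storing one-tenth of the logical data, which is 90 percent less.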

Making the cloud connection
With the release of the Whitewater 1.2 software, the solution can now interface with Nirvanix Cloud Storage Network in addition to Amazon S3, AT&T Synaptic Storage as a Service, and EMC Atmos. Riverbed has done an excellent job of making it easy to set up the connector to the cloud provider. I used the Cloud Settings wizard to create a profile for my Amazon S3 account and completed the setup in less than five minutes. One note: Whitewater can interface with only one cloud provider at a time. To replicate data to multiple providers, you'll need a Whitewater appliance for each one.

Although not overly granular, the cloud profile does allow admins to manage some aspects of the cloud interface. I like that I can define a replication schedule, such as when to pause and resume replication or when to suspend replication entirely. You can also set an upper bandwidth limit on the traffic to the cloud, as well as separate bandwidth limits for specific periods of time. For my tests, I imposed an upper limit to the cloud only during business hours.
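A time-windowed bandwidth cap of this kind can be modeled as a simple lookup. The sketch below is not Whitewater's configuration format; the `BUSINESS_HOURS` table, the 8:00 to 18:00 window, the 10Mbps figure, and the `limit_at` function are all hypothetical values chosen for illustration.

```python
from datetime import time
from typing import Optional

# Hypothetical schedule: (window start, window end, cap in Mbps).
BUSINESS_HOURS = [(time(8, 0), time(18, 0), 10.0)]


def limit_at(now: time, windows=BUSINESS_HOURS,
             default: Optional[float] = None) -> Optional[float]:
    """Return the cap in Mbps that applies at `now`.

    Outside every window the default applies; None means unlimited.
    """
    for start, end, mbps in windows:
        if start <= now < end:   # the window includes its start, excludes its end
            return mbps
    return default
```

With this table, `limit_at(time(9, 30))` returns the 10Mbps cap, while `limit_at(time(22, 0))` returns `None`, leaving replication free to use the whole link overnight.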

Using the built-in graphs, it's easy to monitor the effects of the bandwidth management during a typical day. The appliance will try to use all available outbound bandwidth to the cloud, and I discovered this greatly impacted my production Internet connection until I enabled the workday limits. If Whitewater is on a shared Internet circuit, you'll definitely want to set an upper limit.

For a single-purpose appliance, Whitewater includes a fairly robust user management system. It offers role-based user management, as well as support for local database, RADIUS, and TACACS+ authentication. Admins can create user accounts with read-only or read/write access to Whitewater's configuration files. Whitewater does not currently support user authentication against Active Directory or LDAP.

Picturing cloud storage 
The reporting system is not flashy, but it's good enough to keep up with Whitewater's network and storage activities at a glance. The storage optimization graph is an easy way to see the total amount of stored data versus the deduplicated data, while the throughput graphs are useful for monitoring the various network interfaces. One report that most admins will want to keep up with is the replication graph, which shows the amount of data transferred to the cloud storage provider and the amount of data waiting to be sent. Whitewater supports SNMPv3, email alerts, alarm thresholds, and Syslog remote logging.

Like the Riverbed Steelhead WAN acceleration appliances I've tested, the Riverbed Whitewater appliance lives up to the hype. Cloud storage provider support is easy to define, and Whitewater does an excellent job of reducing the size of the data set stored in the cloud. The only shortcomings are the lack of support for Active Directory and iSCSI, but neither of these should be roadblocks to implementation. I'll take a 90 percent reduction in storage footprint and data transfer times any day of the week.

Riverbed Whitewater cloud storage gateway

Cost: Starts at $7,995 for the virtual appliance edition

Platforms: Any TCP/IP network with either CIFS or NFS clients. Supports Amazon S3, AT&T Synaptic Storage as a Service, Nirvanix Cloud Storage Network, and EMC Atmos cloud storage services. Supports Symantec NetBackup, Symantec Backup Exec, IBM Tivoli Storage Manager, EMC NetWorker, Quest vRanger, and CA ARCserve backup software.

Bottom Line: Riverbed’s Whitewater appliance is an excellent tool for reducing data size and speeding up replication to cloud storage providers. It interfaces directly with four leading cloud storage services, and setting up the cloud connections is easy. Best of all, the deduplication engine reduced backup sizes by 90 percent in our testing. The two shortcomings -- no iSCSI support and no Active Directory support for user rights assignments -- are hardly showstoppers.