Review: ExaGrid aces disk-to-disk backup

ExaGrid's unique scale-out grid architecture makes for powerful, scalable, and uncomplicated disk-based backup and deduplication

For enterprises seeking to escape the challenges of managing and maintaining tape backup architectures, disk-to-disk backup has been nothing short of a godsend. By replacing tape with disk for nightly backups and relegating tape to a long-term archival role, organizations of all sizes can shrink backup windows and provide near-instantaneous restores. While simple direct-attached storage may fit the bill for smaller organizations, larger enterprises wrestling with the task of protecting terabytes of data find themselves looking for functionality that plain old disk can't provide.

That's where deduplicating backup appliances really shine. While there are a number of well-known vendors with very strong product offerings in this space (EMC Data Domain and Quantum, to name two), ExaGrid's unique scale-out grid architecture and truly refreshing support model set it apart from the pack and place it in a class of its own.

To say that deduplication technology is "hot" is something of an understatement. With rapidly growing mountains of data, leveraging dedupe in backup (if not primary storage) has almost become a necessity. However, as sexy as deduplication tech may be, it's reached a point where the major dedupe vendors are, by and large, getting the same data reduction results from their deduplication engines. Today the differences reside mainly in the impact the deduplication engine has on backup and restore performance and how well the solution scales as backup data inexorably grows. This is where ExaGrid has chosen to invest the bulk of its R&D.

Scale-out vs. scale-up
First, the ExaGrid EX series uses a scale-out grid architecture versus the scale-up architectures adopted by many of its rivals. That architecture allows you to combine multiple EX-series appliances -- each equipped with dedupe and network capacity matched to its storage capacity -- into a linearly scalable grid. This is important because it handily deals with the one true constant of any storage architecture today: rampant growth.

Because scale-up architectures are typically dependent on static controller resources that are sized when the system is initially purchased, an unexpected spate of growth might force you to replace those (often very expensive) controller resources well ahead of when you might have expected. With ExaGrid's scale-out approach, you simply add another appliance to the grid and scale your storage capacity and backup performance at the same time. It's about as close to pay-as-you-go as you'll get this side of the cloud.
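
To put rough numbers on that difference, here is a minimal Python sketch. It is not ExaGrid's sizing model: the per-appliance capacity and ingest figures are invented for illustration, and the `Appliance` class and helper functions are hypothetical names. It assumes only what is described above, namely that each appliance in a grid adds both storage and ingest throughput, while a scale-up system adds disk behind a controller of fixed throughput.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    capacity_tb: float        # hypothetical usable capacity per appliance
    ingest_tb_per_hr: float   # hypothetical backup ingest rate per appliance


def scale_out_totals(appliances):
    """Scale-out grid: capacity and ingest throughput both grow per appliance."""
    capacity = sum(a.capacity_tb for a in appliances)
    throughput = sum(a.ingest_tb_per_hr for a in appliances)
    return capacity, throughput


def scale_up_totals(controller_tb_per_hr, shelf_capacities_tb):
    """Scale-up: disk shelves add capacity, but controller throughput is fixed."""
    return sum(shelf_capacities_tb), controller_tb_per_hr


def backup_window_hours(data_tb, throughput_tb_per_hr):
    """Hours needed to ingest a full backup at the given throughput."""
    return data_tb / throughput_tb_per_hr


# Illustrative figures only, not vendor specifications.
node = Appliance(capacity_tb=10.0, ingest_tb_per_hr=1.0)

for count in (1, 2, 4):
    grid_cap, grid_thr = scale_out_totals([node] * count)
    up_cap, up_thr = scale_up_totals(1.0, [10.0] * count)
    print(
        f"{count} unit(s): "
        f"scale-out window {backup_window_hours(grid_cap, grid_thr):.1f} h, "
        f"scale-up window {backup_window_hours(up_cap, up_thr):.1f} h"
    )
```

Under these assumptions the scale-out backup window stays flat as the data grows, while the scale-up window stretches until the controller is replaced.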

Inline vs. post-process
Second, on the deduplication front, ExaGrid's EX series uses a post-process deduplication model. This means that the backup data is written to the device in its fully "hydrated" form and is deduplicated after the backup process is complete. This is in contrast to the more popular inline deduplication model, which sees incoming data deduplicated as it is written to the device.
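
The distinction is easier to see in a toy model. The Python sketch below is a generic illustration of the two approaches, not ExaGrid's (or anyone else's) engine: the fixed 4-byte chunks, SHA-256 fingerprints, and the names `ChunkStore`, `inline_backup`, and `PostProcessTarget` are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, purely for illustration


def chunks(data, size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Keeps one copy of each unique chunk, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def put(self, chunk):
        digest = hashlib.sha256(chunk).hexdigest()
        self.blocks.setdefault(digest, chunk)  # store only if unseen
        return digest

    def get(self, digest):
        return self.blocks[digest]

    def stored_bytes(self):
        return sum(len(c) for c in self.blocks.values())


def inline_backup(data, store):
    """Inline: deduplicate each chunk as it is written; returns the recipe."""
    return [store.put(c) for c in chunks(data)]


class PostProcessTarget:
    """Post-process: land the backup hydrated, deduplicate afterward."""

    def __init__(self):
        self.landing_zone = b""  # most recent backup in its hydrated form
        self.store = ChunkStore()
        self.recipe = []

    def write_backup(self, data):
        # No fingerprinting or lookups happen during the backup itself.
        self.landing_zone = data

    def deduplicate(self):
        # Runs after the backup job completes.
        self.recipe = [self.store.put(c) for c in chunks(self.landing_zone)]

    def restore_from_store(self):
        return b"".join(self.store.get(d) for d in self.recipe)


backup = b"ABCDABCDABCDEFGH"
target = PostProcessTarget()
target.write_backup(backup)   # backup window ends here
target.deduplicate()          # dedupe work happens outside the window
```

In the post-process case the hydrated copy remains in `landing_zone` alongside the deduplicated chunks, which is where the extra storage goes.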

A few years ago, I would have unapologetically derided ExaGrid for taking the post-process approach. From a storage efficiency standpoint, it's clearly a poor choice: Storing the most recent backup in its native, undeduplicated form, then creating a separate deduplicated copy, would seem to nearly double the amount of storage the appliance would need to do its job. No surprise, it typically does. That's neatly reflected by the fact that ExaGrid's model numbers imply a capacity half that of the device's actual usable capacity (the EX1000 and EX13000E have 2TB and 26TB usable capacity, respectively).

If it's bad for storage capacity, why use post-process dedupe versus the much more miserly inline dedupe? One reason is that deduplication requires a whole lot of compute and disk performance to do effectively. Realistically shooting for 20:1 dedupe (and beyond) means you must devote a large amount of compute resources to finding every bit of duplicate data and removing it. If you're going to provide inline dedupe capability without the dedupe engine becoming a bottleneck to backup performance, you need to throw a lot of expensive, high-end compute hardware at the problem. From ExaGrid's point of view, the cost of the additional storage required for post-process dedupe was easily offset by the savings from not having to implement the specialized compute hardware necessary to do it inline.

InfoWorld Scorecard: ExaGrid EX Series

Criterion             Weight   Score
Interoperability      10%      10.0
Scalability           20%      10.0
Performance           20%      9.0
Value                 10%      9.0
Management            20%      8.0
Data deduplication    20%      9.0
Overall Score         100%     9.1
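
The overall score is consistent with a simple weighted average of the six category scores. The Python sketch below checks that arithmetic using the weights and scores from the scorecard. That the overall figure is computed this way is an assumption, and the `SCORECARD` and `overall_score` names are illustrative.

```python
# (criterion, weight in percent, score) as listed in the scorecard
SCORECARD = [
    ("Interoperability", 10, 10.0),
    ("Scalability", 20, 10.0),
    ("Performance", 20, 9.0),
    ("Value", 10, 9.0),
    ("Management", 20, 8.0),
    ("Data deduplication", 20, 9.0),
]


def overall_score(rows):
    """Weighted average, assuming the weights are percentages summing to 100."""
    total_weight = sum(weight for _, weight, _ in rows)
    weighted_sum = sum(weight * score for _, weight, score in rows)
    return weighted_sum / total_weight


print(round(overall_score(SCORECARD), 1))
```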