Cloud storage: The final nail in tape's coffin

Amazon.com's new long-term storage offering isn't right for disaster recovery, but looks to be a perfect fit for archiving

More than two years ago, I wrote about the persisting value of tape as an archival backup medium. At the time, the LTO-5 tape standard had recently been announced in an effort to stave off ever-increasing pressure from disk-to-disk backup products. Since then, a lot has changed. Although the nearly twice-as-capable LTO-6 tape standard will hit the market soon, disk-to-disk backup has become a much more popular way to back up the enterprise. Effective deduplication tech, high-performance scale-out disk backup appliances, and the challenges of effectively backing up and restoring virtual machines have all helped disk backup push tape to the margins.

But despite its falling costs and increasing performance, disk backup still can't do one thing that tape is especially good at: get shipped to a vault, sit on a shelf for years, and still be ready to restore from when it's needed.


At their core, hard disks are still mechanical devices with components that aren't well suited to being tossed into the mail and sitting dormant for years. That's why you still usually find tape used for monthly and yearly archives, even in organizations that have switched to disks for daily backups and short-term retention. That's especially true in regulated industries like health care and banking where retaining those archives is mandatory.

The conventional wisdom is that tape is therefore here to stay, though unloved and hidden in warehouses far, far away. That conventional wisdom may be wrong. Amazon.com may have what will become the final nail in tape's coffin: Its new cloud-based archival storage, dubbed Glacier.

Much like Amazon.com's six-year-old S3 (Simple Storage Service), Glacier is a cloud-based object storage service that can hold whatever kind of data you feel like uploading into it, regardless of its format or the software used to generate it. Glacier is designed for the same claimed 99.999999999 percent data durability and 99.99 percent data availability as S3. But beyond that, Glacier differs substantially from S3.

First off, it's crazy cheap to upload and store data with Glacier: just 1 cent per gigabyte per month (that's 8 to 20 percent of the cost of S3, depending on the amount of data), and inbound data transfer is free. That comes to about $123 a year per terabyte for offsite cloud storage that's designed to be nigh-immune to data loss. Even the "well, I can buy a 2TB drive at Staples for $130" argument that's often heard during corner-office enterprise storage discussions can't touch that kind of economy.
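For readers who want to check the storage arithmetic, here is a minimal sketch. It assumes the quoted 1 cent per gigabyte is billed monthly and that a terabyte is counted as 1,024 gigabytes; the function and constant names are made up for illustration and are not part of any Amazon.com tool.

```python
# Price quoted in the article, assumed here to be billed per month.
PRICE_PER_GB_MONTH = 0.01
# Assumption: a terabyte is counted as 1,024 gigabytes.
GB_PER_TB = 1024


def annual_storage_cost(terabytes, price_per_gb_month=PRICE_PER_GB_MONTH):
    """Return the yearly storage cost in dollars for the given capacity."""
    return terabytes * GB_PER_TB * price_per_gb_month * 12


print(f"1 TB stored for a year: ${annual_storage_cost(1):.2f}")
```

Swapping in a different capacity or price shows how the bill scales. Storage is only one side of the cost, though: the sketch says nothing about what it costs to get data back out.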

When I saw Glacier's pricing, I immediately wondered how Amazon.com could possibly make money from it. Although Amazon.com has not been forthcoming about how Glacier works behind the scenes, an anonymous poster purporting to be an ex-Amazon.com employee has shed some light on what Amazon.com may be up to. Instead of building a massive online storage infrastructure like the one that back-ends S3, Amazon.com is apparently using the cheapest, slowest disks money can buy. It's built an extremely low-power storage grid in which most disks are typically spun down and activated only during data retrieval operations -- both decreasing operational costs and increasing the lifetime of the disks. Although there's no way to know for sure whether that poster is correct, such an approach makes sense.

So, given the incredibly low cost and high reliability, there must be a catch, right? Yes, there is.

Although you pay very little to store data in Glacier, you could pay quite a lot to pull it back out. The pricing structure is fairly complicated, but basically works out to a charge based on the average amount of data you move out of Glacier per day throughout a month, minus a free allowance equal to 5 percent of the total data you're storing. If data retrievals are relatively rare or small in comparison to the total storage, this might not add up to very much. But pulling all your data out on a single day (such as for a disaster-recovery effort) could end up being extremely expensive.
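To make the free allowance concrete, here is a rough sketch of that one piece of the pricing. It models only the 5 percent allowance described above and assumes the allowance applies per month. It does not compute a dollar figure, because the per-gigabyte retrieval rate isn't given here and Amazon.com's actual formula is more involved. All names are hypothetical.

```python
# From the article: the free allowance equals 5 percent of the data you store.
FREE_ALLOWANCE_PERCENT = 5


def free_allowance_gb(stored_gb):
    """Gigabytes that can be retrieved at no charge (assumed to apply per month)."""
    return stored_gb * FREE_ALLOWANCE_PERCENT / 100


def billable_retrieval_gb(retrieved_gb, stored_gb):
    """Gigabytes retrieved beyond the free allowance; never negative."""
    return max(0.0, retrieved_gb - free_allowance_gb(stored_gb))


stored = 10 * 1024  # hypothetical archive: 10 TB, expressed in gigabytes
for label, retrieved in [("occasional restore", 100), ("full restore", stored)]:
    billable = billable_retrieval_gb(retrieved, stored)
    print(f"{label}: {billable:.0f} GB over the free allowance")
```

With these made-up numbers, a small restore stays inside the allowance, while pulling back the whole archive leaves 95 percent of it subject to retrieval charges.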

And that's not the only catch. Not only are there substantial charges for retrieving large amounts of data, there's also a substantial delay -- usually around four hours -- between when you request your data and when you can actually download it. This delay lends more credence to the idea that Amazon.com is using a primarily dormant and low-speed disk architecture to make Glacier work.

So, given the retrieval time and cost limitations, it's clear that Glacier will probably not be a good fit for replacing your first-tier backups -- on-premises disk-to-disk is still the best technology to back up and restore all kinds of data quickly and easily.

However, Glacier may be a compelling alternative to tape for long-term archival purposes. At the end of the day, it is little different from offsite tape rotations serviced by companies like Iron Mountain, but with the benefit of requiring little to no hardware investment and substantially easier self-management. Either way, you're getting nearly bulletproof offsite archive storage managed by a third party.

That archival market is precisely what Amazon.com is hoping to crack. But I wonder whether it could lead enterprises to stop using tape completely, especially if Amazon.com at some point introduces a faster recovery method for disaster-recovery scenarios (which you'd pay more for, of course).

As much credit as I gave to tape two years ago (and still do today for many of the same reasons), it seems clear that Amazon.com's approach will end up being the last nail in tape's coffin. Today's Glacier places that nail in position. Will tape simply disappear? No -- some organizations will never be able to trust the cloud with their data, and others will need to access their archives too frequently to make Glacier economical. However, I can definitely see a day when tape is the exception rather than the rule for archival storage.

This article, "Cloud storage: The final nail in tape's coffin," originally appeared at InfoWorld.com. Read more of Matt Prigge's Information Overload blog and follow the latest developments in storage at InfoWorld.com.

Copyright © 2012 IDG Communications, Inc.