Where cloud backup fits the bill

Cloud backup might not replace tape, but it can be useful in a wide range of places -- if you know how to use it

A couple of weeks ago, I delved into the challenges of trying to use the cloud to replace tape for traditional on-premises backup. Although cloud backup can replace tape for small businesses or larger IT departments without long-term retention requirements, enterprises will rarely be able to consider cloud backup as a tape replacement.

When you do the math, it's pretty clear why: A typical enterprise that tries to replace its tape backup systems with cloud storage will either fail to find the connectivity it needs or end up paying substantially more money in the long run as cloud-hosted archival backups and backup-related connectivity fees pile up.

However, simply because the cloud isn't quite ready to replace tape for all time doesn't mean it can't still play an important role in protecting an enterprise's data. Seeing what role cloud backup can play involves putting together a short list of what cloud storage does and doesn't do well. Here, I cover what Amazon.com S3 and Glacier services can do, but much also applies to other cloud storage providers. Don't assume Amazon is the right vendor for you, but it helps to have a clear example.

Reliability: A plus for the cloud
It might seem strange to lead with reliability when talking about the cloud. You're more likely to hear horror stories about cloud service reliability around outages and lost data.

However, when you consider the durability and availability of object storage services such as Amazon's S3 (versus less reliable primary storage services like EBS or EC2 instance storage), you'll see a different story. Amazon says that S3 is designed to provide a data durability of 99.999999999 percent per year. This means that, on average, you can expect to lose 0.000000001 percent of the objects you store with it per year. In an example Amazon provides, a company storing 10,000 objects in S3 can expect to lose one object every 10 million years. Or if you're not big on statistics: It's incredibly durable compared to just about any type of storage media you can imagine or could economically construct on your own.
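For readers who want to check the durability arithmetic, here is a minimal Python sketch. It assumes the design target can be read as a simple average annual loss rate per object, and the function names are illustrative, not part of any Amazon tool.

```python
from decimal import Decimal

# Design target quoted in the text. Kept as a string so Decimal can
# represent all eleven nines exactly (a float would round them).
DURABILITY_PERCENT = "99.999999999"


def expected_annual_loss(object_count, durability_percent=DURABILITY_PERCENT):
    """Average number of objects expected to be lost per year."""
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return Decimal(object_count) * loss_fraction


def years_per_lost_object(object_count, durability_percent=DURABILITY_PERCENT):
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_loss(object_count, durability_percent)


if __name__ == "__main__":
    # The 10,000-object example from the text.
    print(years_per_lost_object(10_000))
```

With 10,000 objects, this works out to the 10 million years cited above. Scale the object count up to hundreds of millions and the expected interval between losses shrinks accordingly, though it remains very long.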

However, it's true that the same cannot be said for data availability. S3 is designed for a data availability of 99.99 percent, and Amazon starts issuing service credits below 99.9 percent availability. Using that lower SLA-driven bar, you might statistically expect to be unable to retrieve a given S3 object for a bit less than nine hours per year (0.1 percent of the 8,760 hours in a year). Given that many organizations already experience similar wait times to retrieve backup media from an offsite vault, that's not bad -- but don't forget it exists.
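The "bit less than nine hours" figure falls out of a one-line calculation. The sketch below assumes a 365-day year and treats the availability percentage as a plain fraction of total hours; the function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def max_unavailable_hours(availability_percent):
    """Hours per year a service may be unreachable at a given availability."""
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


if __name__ == "__main__":
    # 99.9 is the service-credit threshold, 99.99 the design target.
    for percent in (99.9, 99.99):
        hours = max_unavailable_hours(percent)
        print(f"{percent}% available: up to {hours:.2f} hours/year unreachable")
```

At the 99.9 percent threshold, that is roughly 8.76 hours per year; at the 99.99 percent design target, it is under an hour.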

Connectivity: A negative for the cloud
As often happens, one of the cloud's greatest strengths is also its greatest weakness. When it comes to backup and disaster recovery, the fact that the systems you're using in the cloud are, in all likelihood, geographically far away from you (and probably dispersed throughout the country or across the globe) is hugely useful. However, when you have terabytes of data sitting on your premises that you want to get to those cloud repositories, the distance between you and the cloud can be its Achilles' heel.

To help combat the performance and reliability challenges posed by using the open Internet to access a cloud storage provider, you can get direct connectivity to the networks of some major cloud storage providers. Amazon does this by allowing you to rent a port on a switch in one of the major carrier hotels that its network touches, using a product called Direct Connect. Pricing varies based on bandwidth, but a gigabit port will set you back about $216 per month. The transfer into AWS is free, and the transfer out of AWS is typically 2 or 3 cents per gigabyte; for a backup application, you'll trip over this in a big way only when you perform large restores.
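To see how those fees combine, here is a rough Python sketch that uses only the figures quoted above: about $216 per month for the gigabit port, free inbound transfer, and 2 to 3 cents per gigabyte outbound. The constant and function names are hypothetical, actual prices change over time, and the cost of the circuit that reaches the port is deliberately left out.

```python
PORT_FEE_PER_MONTH = 216.00   # gigabit port, figure quoted in the text
EGRESS_LOW_PER_GB = 0.02      # low end of the outbound transfer price
EGRESS_HIGH_PER_GB = 0.03     # high end of the outbound transfer price


def monthly_cost_range(restored_gb):
    """Return (low, high) monthly cost in dollars.

    Backups flowing into the cloud are free to transfer, so only data
    restored (transferred out) adds to the fixed port fee.
    """
    low = PORT_FEE_PER_MONTH + restored_gb * EGRESS_LOW_PER_GB
    high = PORT_FEE_PER_MONTH + restored_gb * EGRESS_HIGH_PER_GB
    return low, high


if __name__ == "__main__":
    # A quiet month with no restores versus a month with a 10TB restore.
    for restored_gb in (0, 10_000):
        low, high = monthly_cost_range(restored_gb)
        print(f"{restored_gb:>6} GB restored: ${low:,.2f} to ${high:,.2f}")
```

Under these assumptions, a month with no restores costs only the port fee, while a 10TB restore adds a few hundred dollars on top of it.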

That pricing seems absolutely amazing until you consider that all you've done is rent a port on Amazon's router. You still need to get access to that router. It might not be a huge challenge if you're located in a major metro area with a mature fiber infrastructure, but it could be cripplingly expensive or simply unavailable if you're out in the sticks.

Restore time: It depends
The last big-picture detail to consider when looking at using cloud storage for backups is how much time it will take you to restore the data when you need it. A big piece of this relates back to the connectivity issue -- you'd certainly expect to experience different restore times if you're using a 1Gbps Direct Connect fiber connection versus a commodity 20Mbps SHDSL Internet connection.

However, how much bandwidth you have at your disposal matters less than what you're actually storing in the cloud. For example, imagine you have a virtual machine running in your infrastructure to host a database that's absolutely critical to your business. With the operating system, database engine, related software, and data, the SQL server VM weighs in at about 100GB. However, the database might take only about 20GB of that. Furthermore, a day's worth of periodic transaction log backups takes only a few hundred megabytes.

Clearly, if you're shoveling a backup of the entire VM into the cloud every night and expect to need to restore the whole thing whenever you have a problem with it, you probably won't be happy with the results unless you have a lot of bandwidth available. However, if you use other means (say, a weekly tape backup or a periodic clone of the VM to different storage media) to capture the machine's general configuration and then use the cloud to back up only the variable, mission-critical data, the restore times might be much more acceptable -- even on a relatively slow connection.
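A back-of-the-envelope calculation makes the difference concrete. The Python sketch below assumes decimal units (1GB = 1,000MB), a link running at its full rated speed with no protocol overhead, and 300MB as a stand-in for "a few hundred megabytes" of transaction logs. Real-world transfers will be slower, and the names are illustrative only.

```python
def transfer_seconds(size_gb, link_mbps, efficiency=1.0):
    """Best-case seconds to move size_gb gigabytes over a link_mbps link.

    efficiency is the fraction of rated bandwidth actually achieved;
    1.0 means a perfect link, which real connections never reach.
    """
    bits = size_gb * 8 * 1000**3
    return bits / (link_mbps * 1_000_000 * efficiency)


if __name__ == "__main__":
    payloads_gb = {
        "entire VM": 100,
        "database only": 20,
        "day of transaction logs": 0.3,  # assumed value, see lead-in
    }
    links_mbps = {"1Gbps Direct Connect": 1000, "20Mbps SHDSL": 20}

    for link_name, mbps in links_mbps.items():
        for payload_name, size_gb in payloads_gb.items():
            hours = transfer_seconds(size_gb, mbps) / 3600
            print(f"{payload_name} over {link_name}: {hours:.2f} hours")
```

Under these idealized assumptions, restoring the full 100GB VM over the 20Mbps line takes about 11 hours, the 20GB database a little over two hours, and the day's transaction logs about two minutes. On the gigabit link, even the full VM restores in under 15 minutes.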

Putting it all together
To be sure, there are many other factors to consider when deciding whether the cloud can play a role in your backup regimen and what value it might deliver over your existing backup infrastructure. The most important thing to do is to remember that cloud storage has its own pros and cons and should never be thought of as a simple replacement for any other kind of storage. If you try to use it in the same way that you might use tape or a local disk-to-disk backup system, you'll almost invariably be unhappy with the results.

However, if you focus on using the strengths of cloud backup -- specifically, its high degree of reliability (including geographic redundancy) and comparatively fast restore times for small data sets -- you may find a good way to use the cloud, get tangible benefit from doing so, and save money all at the same time.

This article, "Where cloud backup fits the bill," originally appeared at InfoWorld.com in Matt Prigge's Information Overload blog.

Copyright © 2013 IDG Communications, Inc.