Blame game over Amazon outage misses the point

With Amazon.com's most recent outage, cloud detractors and apologists alike missed its real lesson: All technology can fail


What many -- both proponents and detractors of public cloud offerings -- seem to miss is that being in the cloud does not and will never free you from having your own disaster-recovery and high-availability measures in place to defend against the failures and outages that will inevitably occur.

In an on-premise or private cloud infrastructure, that means deploying redundant core infrastructure hardware and maintaining a testing regimen to ensure it's working. In the cloud, you may not be concerned with the hardware, but you still need to diversify your workloads across multiple availability zones within a cloud provider or even across multiple cloud providers. Conceptually, it's no different from what you do on-premise, although it may bear little resemblance in execution.

Of course, if you're large enough to achieve the right economies of scale, you may find that delivering that kind of high availability, coupled with the elasticity the public cloud offers, is cheaper and easier in an on-premise private cloud -- and I believe that was the thrust of Curtis' blog post.

The real issues: Getting the right tool for the job, learning from experience
That decision, however, is an issue of selecting the right tool for the job. Just as no one screwdriver is appropriate for every screw in existence, any of the public cloud, private cloud, traditional on-premise infrastructure, or hybrids of the three may end up being the right tool for you. The key to making a good choice is truly understanding the pros and cons of each approach and being able to match them to your needs -- areas in which neither the breathless pro-cloud nor staunch anti-cloud narratives can really help.

That's not to say I don't appreciate a vigorous post-outage debate about what went wrong in a given failure and how (or whether) it will be avoided in the future. Though some public cloud providers are less than forthcoming with real details, at least we're aware of the general cause and what was done to fix it.

How many widespread failures of on-premise data center tech (say, bad SAN firmware that leads to catastrophic failures and long downtimes) go unreported simply because nobody has the visibility across the thousands of deployed systems needed to correlate the failures? That's one luxury public cloud operators simply don't have: everyone gets to see their failings and, if they're lucky, learn from them.

This article, "Blame game over Amazon outage misses the point," originally appeared at InfoWorld.com. Read more of Matt Prigge's Information Overload blog and follow the latest developments in storage at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.

Copyright © 2012 IDG Communications, Inc.
