This past week saw another high-profile cloud infrastructure failure as Microsoft's Azure Compute service experienced an extended service outage. As in previous cases, such as last April's Amazon EC2 outage, this event caused another round of hand-wringing about the future of the cloud and the wisdom of using it.
Each time something like this happens, however, I'm always nonplussed that anyone is actually surprised.
Yes, megascale cloud implementations such as those fielded by Amazon, Microsoft, and Rackspace are designed with many layers of overlapping redundancy to prevent this sort of problem, but why is anyone shocked that some combination of unforeseen circumstances could lead to widespread failure? Far be it from me to defend Microsoft here (I mean, leap year -- really?), but you cannot design a perfect, infallible system. It simply doesn't exist.