"How could this happen if we have multiple copies of your data, in multiple data centers?" Google vice president of engineering Ben Treynor asked in a blog post at the time. "In some rare instances, software bugs can affect several copies of the data. That's what happened here."
Google ended up having to turn to actual physical tape backups to restore the data. Ultimately, the company's multilayered data protection did work, but not without leaving thousands of users locked out of their email for days.
Is that a reason to run, arms flailing, away from anything cloud-connected? Probably not. But it is a reason to look carefully at your own data safeguards and think about setting up a backup or offline-access solution now, before an urgent need arises.
"When you look at broad averages, the cloud will have a lot more operational success than you would as an individual," says AlertSite's Ken Godskind. "It's just that when you go to Web scale, the impact of failure is amplified in a much greater way."
Colossal cloud outage No. 4: Hotmail's hot mess
Of course, Microsoft hasn't always provided the greatest advertisement for its big push for the cloud, either. Witness Microsoft's Hotmail service, which experienced database errors of its own at the end of 2010, resulting in tens of thousands of empty inboxes at the turn of the new year.
The error, according to Microsoft, stemmed from a script that was meant to delete dummy accounts created for automated testing. The script mistakenly targeted 17,000 real accounts instead.
It took Microsoft three days to restore service for most of those users. An unlucky 8 percent of affected emailers had to wait an extra three days before their data was back where it belonged.
Even Clippy couldn't smile through a headache like that.
Colossal cloud outage No. 5: The Intuit double-down
Intuit hit a rough patch last year when its cloud-connected services, including popular platforms like TurboTax, Quicken, and QuickBooks, went offline twice within a single month. The worst case was a 36-hour outage in June. A power failure evidently caused things to go haywire, with the company's primary and backup systems getting knocked completely off the grid.
It only added insult to injury, then, when another apparent power failure hit Intuit weeks later. Among other issues, the second outage appeared to cause an abnormally high rate of obscenity-laden shouting.
"Twenty-five hours downtime is hard to swallow," one user tweeted at the time. "Passive, opaque and stiff communication from Intuit didn't help."