The outage in Amazon's Elastic Compute Cloud service last week highlighted the limitations of load balancing and failover systems designed to keep applications running in case of failure. But Amazon isn't the only cloud vendor whose systems can't guarantee 100% uptime.
Building cloud-based applications that can fail over from one data center to another is difficult and may require the customer to have sophisticated technology expertise. Customers may have to work closely with the cloud vendor and purchase third-party load-balancing products to keep applications running in the event of failures like the one that hit Amazon.
GoGrid, which sells infrastructure-as-a-service computing similar to Amazon's, gives customers service credits when uptime falls below 100%, but that doesn't mean the cloud service never goes down.
"For the service elements we deliver, we're saying that we expect them to be up 100% of the time, and if they're not we're going to compensate you," says GoGrid CEO and founder John Keagy. "Things do fail. Customers should not interpret a 100% service-level commitment as a 100% service-level guarantee."
But customers can keep their applications running through downtime if they are willing to put some extra work into it, Keagy says. Amazon customers who didn't have robust disaster recovery and failover plans were more likely to suffer downtime last week than those who planned ahead, he says.
GoGrid's cloud offerings are spread across 11 data centers, mostly run by co-location providers. Customers that want applications to fail over from one data center to another can use global traffic management products made by third parties, Keagy says. Customers can also achieve this extra level of protection entirely through services offered by GoGrid, but this "has to be architected in conjunction with us to get that done," Keagy says.
"That's what infrastructure is all about," Keagy says. "This is not platform as a service or software as a service. This is raw infrastructure that requires the user to have some responsibility for how they implement things."
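To make the customer's share of that responsibility concrete, here is a minimal sketch in Python of the decision a traffic manager makes when it fails over between data centers: try each site in priority order and send traffic to the first one that passes a health check. It is an illustration only, not GoGrid's, Amazon's, or any third-party product's actual mechanism. The host names, the `pick_endpoint` function, and the `demo_health_check` stand-in are all hypothetical; a real deployment would probe each site over the network and would usually steer traffic through DNS or a global load balancer.

```python
from typing import Callable, Optional, Sequence


def pick_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint that passes its health check, or None.

    `endpoints` is listed in priority order (primary data center first).
    `is_healthy` is supplied by the caller; this sketch does not assume
    any particular probing method.
    """
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that raises is treated as a failed site,
            # so one broken probe cannot block failover to the next.
            continue
    return None


# Hypothetical sites in two different data centers, primary first.
SITES = ["app.dc-east.example.com", "app.dc-west.example.com"]


def demo_health_check(endpoint: str) -> bool:
    """Stand-in probe that simulates an outage at the primary site."""
    return endpoint != "app.dc-east.example.com"


chosen = pick_endpoint(SITES, demo_health_check)
print(chosen)
```

With the primary site simulated as down, the sketch selects the second data center. If every site fails its check, the function returns None, which is the case in which the application is simply offline; failover only helps when the backup site sits outside whatever failed.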
Amazon lets customers host applications in multiple "availability zones" for an extra fee, but it's not clear how far apart these zones are. Last week, failures hit multiple availability zones.