How to protect your data when a cloud service vanishes

Cloud services can -- and do -- fail, taking your data with them. Here are some of the dangers of trusting your data to the cloud and how to protect yourself.

More and more, we rely on Web services as a matter of course. The key word is rely: We assume that the data we upload to, say, a photo-hosting account or blog service today will still be there tomorrow. In large part, that's because we assume the services themselves will still be there tomorrow.

But over the past few years, we've seen plenty of examples of sites that are here today and all-too-gone tomorrow -- for example, Friendster (which dumped user data for a redesign in May) and GeoCities (which shut down in 2009).

In other words, nothing lasts forever. The Web services that we entrust with our data can -- and do -- vanish. And when that happens, you need to have a plan. In the following pages, I'll take a look at some cases where user data was lost or endangered, how the companies (and their users) handled the situation, and what you can do to keep your own information safe.

Don't let this happen to you

Unfortunately, there are plenty of examples of services that have shut down, changed hands or simply lost their data.

MySpace. The slow death and muddled rebirth of MySpace -- once a fiercely popular social network, overshadowed by the rise of Facebook -- raised a lot of questions about what would happen to existing users' data and whether or not there would be an easy way to bulk-export any of that information.

MySpace did set up what has been described as a "data-portability initiative" back in 2008. But this seemed aimed less at exporting data from MySpace than at allowing frequently reused contact information to be filled in automatically across sites. Worse, the terms of service for MySpace developers explicitly forbid creating applications designed to export user data to another service. That hasn't stopped people from creating scrape tools for MySpace such as Make Data Make Sense's blog-export utility.

Google Video. After Google's acquisition of YouTube in 2006, Google Video seemed as redundant as a second navel. By 2009, the ability to upload new videos was shut down, although concerted protest by users kept Google from shutting the service off entirely so that any videos still there could be archived manually. Those who had spent money on Google Video's download-to-own/-rent program found access to their purchased content gone, although those with outstanding credit in the system could have that transferred as funds to Google Checkout. (Later, Google announced it would also offer credit card refunds; in April 2011, it announced that it was keeping Google Video content up indefinitely, until all the remaining videos could be moved to YouTube.) As with some other closed Web services, the issue wasn't just the content but the existing user investment in the site, in multiple senses of the word "investment."

Sidekick. Back in October 2009, around 800,000 T-Mobile users who owned the Sidekick phone were in for a rude shock when the servers holding their personal data -- including email and contact information -- went down. It was originally reported that the data was lost for good, although the majority of the data was later restored. Not that it made for any less of a black eye for T-Mobile and Microsoft (which were managing the servers that kept the Sidekick data). Worse, users had no short-term recourse for recovery other than whatever data might have been synced to their computers.

The data service for Sidekick was discontinued for good on May 31, 2011. According to a statement by Microsoft, T-Mobile provided "an enhanced Web tool ... on myT-Mobile.com to easily export their personal data, including contacts, photos, calendar, notes, to-do lists, and bookmarks, from the Danger service to a new device, computer, or a designated e-mail account." If they had provided something so convenient during the earlier data outage, or as a routine way to allow Sidekick users to keep their data intact, the wailing and gnashing of teeth might not have been as loud.

Blogging and Web-hosting services. With blogging and free websites now throwaway commodity offerings, it's not surprising when these services bite the dust. GeoCities, an artifact of the Web's earliest commercial days, was widely lamented when its plug was pulled in 2009. Yahoo did little on its own to preserve the sites, but there were third-party efforts to save the contents of GeoCities. Also, Windows Live Spaces was shut down in March 2011, at which time users were given the option to migrate to WordPress. And as of May 24 this year, Yahoo's MyBlogLog was also canned; there are, however, tutorials on how to migrate your data from it.

Lala.com. For the users of Lala.com, the problem was a little more complex. The short-lived online music service, which allowed users to cheaply purchase streaming access to music, was bought out by Apple in December 2009. Users who had existing credit with the service were allowed to transfer those credits to iTunes, but any purchased streams were gone for good. No provision existed for, say, allowing legal MP3 downloads of the purchased streams. (Blame the thicket of restrictive licensing agreements that automatically spring up around any online media service and the fact that music isn't really "purchased" online but merely licensed.)

The fate of Lala brings up an interesting question. Given how many media services are offering "rental" rather than "purchase" models for their offerings, at what point will people feel an entitlement to that data as theirs? And given the voracity with which companies can gobble each other up, how willing should people be to pay money for access to something that could dry up overnight?

These are not questions that have set answers, since they deal with conceptual changes in the nature of the services people consume, and are heavily affected by the reputation of the company in question. For example, few people expect Amazon to go out of business anytime soon, so there's not the same hesitancy about buying books on the Kindle as there would be about streaming music from a fresh young startup.

Look for these features

If you're currently deciding whether to use a specific Web service, it helps to know how it will handle your data and if it can provide you with ways to rescue your data or move the information offsite. There are several things to look for.

Data is available in open formats for easy download. The best sign that a website or service has the preservation of its users' data in mind is the ability for users to make a backup copy of their data through the service itself. If there's no back-end tool for downloading copies of your content, you may be forced to scrape the data manually, so anything that saves you the trouble of having to do so is worth noting. The wiki-creation site Wikia.com, for instance, lets you save whole wikis or individual pages into plain text files either for archiving or offline editing.

Interestingly, Google has been making major strides in this area. When it recently started beta-testing its Google+ social network, it added extensions to allow personal data (contacts, circles, etc.) to be exported via Google Takeout. The real test of such a feature, though, is how useful it'll be for transporting your data into other services.

Data tools are provided by the service or third parties. If you don't have direct access to your data through the service's own Web interface, the next best thing is an application that can pull that data for you via one of the service's APIs. You might have to do some programming on your own to take advantage of those APIs, but it's a good idea to look around first -- someone else out there might well have done that work for you and made the results freely available.
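If you do end up writing that export tool yourself, the core of it is usually a loop that walks the API page by page and saves everything to a local file in an open format. The following is a minimal sketch in Python, not any particular service's API: the URL scheme, the bearer-token header, and the "items" and "has_more" response fields are all assumptions made for illustration, and a real service will document its own equivalents.

```python
import json
import urllib.request
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item from a paginated API, one page at a time.

    Assumes (hypothetically) that each page is a dict with an "items"
    list and a "has_more" flag; adjust to the real service's format.
    """
    page = 1
    while True:
        data = fetch_page(page)
        yield from data.get("items", [])
        if not data.get("has_more"):
            break
        page += 1


def backup_to_file(fetch_page: Callable[[int], dict], path: str) -> int:
    """Save all items as plain JSON and return how many were written."""
    items = list(iter_items(fetch_page))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f, ensure_ascii=False, indent=2)
    return len(items)


def make_http_fetcher(base_url: str, token: str) -> Callable[[int], dict]:
    """Build a fetch_page function for a hypothetical JSON-over-HTTP API."""
    def fetch_page(page: int) -> dict:
        request = urllib.request.Request(
            f"{base_url}?page={page}",
            headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)
    return fetch_page
```

In use, you would call something like backup_to_file(make_http_fetcher(url, token), "backup.json") on a schedule. Keeping the page-fetching function separate from the backup loop means the same loop can be pointed at a different service, or tested offline, without being rewritten.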

Andrew Reichman, principal analyst at Forrester Research, says any service you use should be considered proprietary, even if the provider of the service advertises its own exit strategy. In other words, take any claims about data portability with a hefty chunk of salt. "Even with standards [for data interchange], you are still at the mercy of the administrators and policies of the company operating the equipment on your behalf."

Terms of service. The ToS for almost any service these days is worded to within an inch of its life, with almost every conceivable aspect of the service's functionality covered. "Paying close attention to the SLAs [service-level agreements], contracts and penalty structure related to non-performance of SLAs is critical," says Reichman. "Having an exit strategy, or at least some discussion about what would happen in the event the customer wants to pull out or the vendor cancels service, is an important preliminary step to take, prior to committing to a given vendor." The fewer details about such things in the ToS, the more wary you should be.

George Hamilton, an analyst at Yankee Group, is even more insistent on this point. "Caveat emptor," he says. "Know how the service provider protects stored data and data in motion, and how it is backed up."

This is where services can afford to compete most aggressively: by allowing customers more freedom of movement with their data, even if it seems counterintuitive at first to let them leave. "Vendors should sell their functionality, not create lock-in with technology," says Hamilton, noting that the general movement in the industry is toward open standards of one kind or another.

Reichman, however, disagrees. "The most likely [scenario] is one vendor's proprietary structure becoming a de-facto standard that other vendors follow," he says.

Watch for warning signs

Is it possible to tell ahead of time if a service's plug is about to be pulled? Sometimes the best places to look for signs of that happening are not on the service itself.

The ArchiveTeam Web site maintains a list of sites that are in danger of being shut down or are already dying. If you use a site listed there (under the heading "Watchlist"), it's probably a good time to think about taking your data elsewhere or, at the very least, backing it up somewhere solid.

Reichman advises looking at the company's numbers. "You can't always discover that a potential vendor has financial problems, but some issues can be uncovered with a bit of due diligence in financial statements, if available, or funding history and any news stories about the vendor," he says. "Rumors of impending acquisitions or divestitures, layoffs or strategy shifts are all signals that there may be trouble looming."

Both Reichman and Hamilton say there may be few outward warning signs, even financial ones. "Companies in fiscal trouble typically don't pre-announce that kind of trouble," notes Hamilton. "You need to be proactive. If they're a public company, you can see their financials. If not, you should still watch to see if they're in the news. If you have questions about their viability, don't use them in the first place."

That said, it's hard to say no to a particular service if job requirements or peer pressure push you to use it, especially without viable alternatives. For a time it was difficult to spurn Facebook, for instance, despite its lack of data portability and questionable privacy practices -- everyone used it. Now that wall of dominance may be crumbling a bit with the appearance of Google+ and the quiet success of LinkedIn.

Other warning signs include:

Declining quality of service. An ongoing, chronic disintegration of the service -- "increasing service disruptions or performance issues," as Hamilton puts it -- is a major red flag. To that he adds "a general lack of responsiveness to calls or emails."

Declining third-party support. Sites with APIs typically develop a culture of third-party apps -- image uploaders for photo-hosting sites, for instance, or applications that integrate directly into the service, such as Facebook's massive roster of games. If development of such applications has fallen off, that could be a sign the service is losing its user base. If the pace slackens not because of market saturation (you can only have so many photo uploaders) but because of genuine programmer alienation -- to the point where word filters out into the general user community -- that's a bad sign.

Changes in terms of service or arbitrary behaviors. Many people leave a Web service behind not because the service itself is endangered, but because of things the service has done. A common reason for this is changes to the terms of service, which can spark a massive user backlash. It doesn't help that terms of service are all too often pools of mud, where the implications of any changes are unclear unless spelled out with total precision. Think of the recent flap over Dropbox's clause indicating it would turn files over to the government if asked -- which forced the company to add wording to the effect that your stuff remains yours and they won't mess with it unless they have no other choice. (In its own words: "These Terms do not grant us any rights to your stuff or intellectual property except for the limited rights that are needed to run the Services.")
