How to protect your data when a cloud service vanishes

Cloud services can -- and do -- fail, taking your data with them. Here are some of the dangers of trusting your data to the cloud and how to protect yourself.

Page 2 of 2

Different folks have different thresholds of tolerance for such things, so what ticks off your neighbor may not seem as egregious to you. But if you hear about such a thing happening with a service you use, pay attention, and give the ToS a fresh read whenever you're asked to reconfirm your acceptance.

Read the terms of service

Speaking of the terms of service, that's the one part of any service you shouldn't ignore, since it spells out what can and can't be done with your data. It doesn't help that most terms of service are terribly arcane, with crucial points buried within multiple clauses of pure lawyer-speak. Here are several major clauses that commonly appear in a site's ToS and that affect the movement of user data.

Rules about third-party programs. Many sites explicitly disallow the use of unapproved applications designed to scrape or harvest site data, on pain of termination. If you're leaving anyway, this threat isn't quite as weighty, but it might cause trouble if you are relying on such a program to back up your data on a regular basis. These rules often cover the service's stance on data portability -- they may not come out and say that data can't be exported from the service, but they may add rules like this to make it massively inconvenient.

Much of that inconvenience is achieved through general vagueness in the wording of these rules. Paragraph 6.j of Yahoo's ToS (which includes Flickr) forbids "disobey[ing] any requirements, procedures, policies or regulations of networks connected to the Yahoo! Services, including using any device, software or routine to bypass our robot exclusion headers," which could conceivably include Web scrapers or other such applications. Most of the time, it would be hard for a service to tell that such apps were in use unless a great many people started using them, a lot of content was being scraped from an individual user's account, or the service actively tried to detect such tools and took steps to block them.
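If you do rely on a scraping-style backup tool, one small courtesy is to check a site's published robot exclusion rules before fetching anything. The following is a minimal sketch in Python, using the standard library's robots.txt parser. The rules in EXAMPLE_RULES and the "my-backup-tool" user agent are made up for illustration and are not taken from any real site. Passing this check says nothing about whether the ToS permits the tool; it only shows what the site asks automated clients not to fetch.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, invented for this example; not any real site's robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /photos/
"""

if __name__ == "__main__":
    for url in ("https://example.com/photos/1.jpg", "https://example.com/about"):
        allowed = may_fetch(EXAMPLE_RULES, "my-backup-tool", url)
        print(url, "allowed" if allowed else "disallowed")
```

In practice you would download the site's own robots.txt first and pass its text to may_fetch, rather than using a hard-coded string.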

Users who ignore ToS provisions about third-party applications do so at their own risk. "Legally, you could be breaking a term of service or violating copyright laws," notes Hamilton. "Or, if a Web scraper is constantly scraping a site, they could impose performance issues or become the equivalent of a denial-of-service attack."

Reuse of your content. Some sites will have a ToS provision that allows whatever you post to your account to be redisplayed in other contexts. If you see this clause, don't panic, but do read it closely. This clause typically exists for the sake of allowing whatever you post to be shown in promotional material, rotated on the site's home page or just manipulated internally.

Google's ToS, for instance, has this in paragraph 11.1: "By submitting, posting or displaying the content you give Google a perpetual, irrevocable, worldwide, royalty-free, and non-exclusive license to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Content which you submit, post or display on or through, the Services. This license is for the sole purpose of enabling Google to display, distribute and promote the Services and may be revoked for certain Services as defined in the Additional Terms of those Services." Many other services retain a similar clause.

As-Is/As-Available. This is another catchall clause that, in effect, means the service has no particular obligation to provide continuous uptime, to protect your data's integrity or even to keep the service active. Note that As-Is clauses may be a bit buried and not broken out into their own section; search on the keywords "As-Is" or "warranty" to find them.
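That keyword search is easy to automate if you have saved the ToS as plain text. Here is a rough sketch in Python; the sample text is invented for illustration, and the pattern covers only the obvious spellings, so treat a miss as "not found by this pattern" rather than "not in the ToS."

```python
import re

# Matches "as is", "as-is", "as available", "warranty" and "warranties",
# in any letter case. Word boundaries keep phrases like "was issued" from matching.
CLAUSE_PATTERN = re.compile(
    r"\bas[- ]is\b|\bas[- ]available\b|\bwarrant(?:y|ies)\b",
    re.IGNORECASE,
)


def find_clauses(tos_text: str):
    """Return (line_number, line) pairs for lines that mention a keyword."""
    return [
        (number, line.strip())
        for number, line in enumerate(tos_text.splitlines(), start=1)
        if CLAUSE_PATTERN.search(line)
    ]


# Invented sample text; not quoted from any real terms of service.
SAMPLE = """\
1. You keep ownership of your content.
2. THE SERVICE IS PROVIDED "AS IS" AND "AS AVAILABLE".
3. We disclaim all warranties, express or implied.
4. This policy was issued in January.
"""

if __name__ == "__main__":
    for number, line in find_clauses(SAMPLE):
        print(number, line)
```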

At-will termination. Finally, some terms of service have a clause that states they can pull the plug on your account, just because. Don't be surprised if you see something like this -- it's usually in there as a catchall way to kick people off if they flout the rules or consume a disproportionate amount of the service's resources. You may not need to worry about this most of the time, but it may be used to justify booting you off if, for instance, you use an unorthodox or unapproved method to retrieve or mirror your data. Google has this clause in paragraph 4.3 of its ToS; Yahoo's ToS has it in section 15. In both cases, it's worded in an open-ended enough fashion to make it possible for an account with either service to be closed for no apparent reason at all.

Create an exit strategy

If you don't have major qualms about a service you're with but you still want to create an exit strategy, a few basic points are worth keeping in mind.

Keep local copies of everything that's crucial. The only storage you can completely trust is the storage you physically own, so always make sure there's a local copy of everything important. If you've already been trusting your only copies to a site, break the habit now. Any Web service should be thought of as a replicator, not a repository.

For instance, don't ever trust a remote service with your only copy of a given photo, since the service's rules about data preservation might not be in your best interest. Flickr, one of the most popular photo hosting services, doesn't allow you access to the original copy of an uploaded photo unless you have a paid account. A utility like Flump or FlickrEdit can help you extract pictures from your stream, although such tools will probably not be able to rescue images that aren't publicly accessible. (Flump, in particular, requires a Pro-level Flickr account to be useful.)

On the other hand, many Gmail users have no qualms about leaving their entire trove of mail on Google's servers -- even though both POP3 and IMAP connectivity exist for Gmail, making it not only possible but easy to keep mail local. It's easy to get into the habit of unthinkingly trusting Gmail to always be there -- at least until the next network outage or Google cloud failure.
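To give a sense of how little is involved, here is a minimal sketch of an IMAP download using Python's standard imaplib module. It is an illustration under stated assumptions, not a complete backup tool: it reads only the INBOX folder, the function names and file-naming scheme are invented for this example, and it assumes IMAP access is enabled on the account and that the credentials you pass are ones the server accepts.

```python
import imaplib
from pathlib import Path


def save_messages(raw_messages, dest_dir):
    """Write each raw message (bytes) to its own .eml file; return the paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, raw in enumerate(raw_messages, start=1):
        path = dest / f"message-{index:06d}.eml"
        path.write_bytes(raw)
        paths.append(path)
    return paths


def fetch_inbox(host, user, password):
    """Yield the raw bytes of every message in INBOX, opened read-only."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)  # read-only: leaves flags untouched
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            # For a successful fetch, parts[0] is a (header, body) tuple.
            yield parts[0][1]
    finally:
        conn.logout()
```

A run might look like save_messages(fetch_inbox("imap.gmail.com", "you@example.com", password), "mail-backup"), where the address and the destination folder are placeholders. Any desktop mail client configured for POP3 or IMAP achieves the same result with no code at all.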

Practice making local copies of the service's data. If a site has a way to allow you to make a local copy of your data, make a practice run. Step through the process of creating a local copy of the data and see how difficult it is -- how many steps are involved, whether third-party tools are required and so on.

Also be aware that the process could change without warning, so you should do a full review of it every so often, or whenever you get word about major changes to the service.

Keep an eye on what third-party apps are being introduced or removed. If you're depending on a third-party app to help you keep copies of your data, keep in mind that apps can be fickle as well. That app you downloaded six months ago might have since been blocked by the service in question -- or there might be a newer (and superior) replacement. In other words, keep up to date and check to make sure your backup mechanism, whatever it is, still works.
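One cheap way to notice that a backup mechanism has quietly stopped working is to check how old the newest file in your local backup folder is. The sketch below, in Python, does only that. The function names and the seven-day threshold are arbitrary choices for illustration, and a fresh file is no proof that its contents are complete, so it supplements the occasional manual review rather than replacing it.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def newest_backup_age_days(backup_dir, now=None):
    """Return the age in days of the newest file under backup_dir, or None if empty."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / SECONDS_PER_DAY


def backup_is_stale(backup_dir, max_age_days=7, now=None):
    """Return True if the folder holds no files or nothing newer than max_age_days."""
    age = newest_backup_age_days(backup_dir, now)
    return age is None or age > max_age_days


if __name__ == "__main__":
    # "mail-backup" is a placeholder folder name.
    if backup_is_stale("mail-backup"):
        print("No recent backup found; check that your export tool still works.")
```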

Bottom line

Most people don't think much about the inherent closed-endedness that goes with using proprietary Web services, simply because they offer so much in return. That closed-endedness -- and the difficulties involved in porting your data back out -- is becoming increasingly problematic now that such services are so common.

The sad truth of the history of Web services is that any site can disappear, given a long enough time span. But even in the face of such a history, most proprietary Web services still skimp on providing tools to make it easier for users to leave. And why wouldn't they make it difficult, when they have a vested interest in keeping their users? Give people a way to easily switch to a competitor and you've chipped away that much more at the advantages you hold over them.

On the other hand, those who do offer such tools have another advantage: a level of trust with their users that their competition might not have. And given that trustworthiness is becoming a Web currency at least as valuable as ad dollars to some people, it's in any Web service's long-term best interest to start offering those tools. Until then, the rest of us will have to make do with the tools available -- and keep our ears to the ground when rumbling starts. Not if, but when.

Serdar Yegulalp has been writing about computers and information technology for over 15 years for a variety of publications.

This story, "How to protect your data when a cloud service vanishes," was originally published by Computerworld.

Copyright © 2011 IDG Communications, Inc.
