That's precisely what's happening with our data. Personal, corporate, governmental -- it doesn't matter. We're keeping and maintaining way more of it than we can possibly ever use. The fact that an 18GB disk available in 1998 is roughly the same size, weight, and cost as a 2,000GB disk you can buy today is only serving to hide this problem and make us lazy about policing our data growth.
If this problem is such a big deal, what exactly are we supposed to do about it? In our movie analogy, you would eventually be forced to cap your DVD collection before it got completely out of hand and rely on renting or a Netflix subscription. In other words, you've effectively moved your data from your datacenter to the cloud. But if you've done that, you've also married yourself to an often ambiguous set of licensing, access, reliability, and ownership problems. You don't own the movies you're watching, and there's no guarantee you'll be able to watch any given movie tomorrow night.
Obviously, that's not the kind of risk enterprises can take with their data. The solution, though, is remarkably simple: Keep less data and do a better job of organizing the information you retain.
True, that quite reasonable approach will require a massive end-user retraining effort culminating in a cultural shift away from data hoarding. Structured collaboration and data management tools will need to be implemented and fully utilized to replace piles of unstructured data. Technologies such as deduplication and automated archiving will also go a long way toward controlling mountains of data, but these measures often serve to mask the underlying problem.
The sad truth is that no technological silver bullet exists today that will solve this problem for you. You'd never completely trust an automated system to decide what data you no longer need. In the end, you need to do it yourself.
That idea is incredibly distasteful. Everyone has better things to do than dig through e-mail or departmental file shares and delete things that are no longer needed. However, my fear is that if IDC's statistics prove to be accurate over the next five or 10 years, we eventually won't have a choice. By then, it will be a far, far larger problem and take significantly more resources to correct.
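For the file-share half of that chore, a small script can at least tell you where to start digging. What follows is a minimal sketch in Python, not a recommended tool: the function name find_stale_files, the one-year threshold, and the choice of last-modified time as the test for staleness are all assumptions made for illustration. It only reports candidates, largest first, and deliberately deletes nothing, because the decision about what goes still has to be yours.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days=365, now=None):
    """Return (path, size_bytes, age_days) tuples for files under `root`
    whose last-modified time is older than `max_age_days`, largest first.

    Hypothetical helper for illustration: it reports candidates for a
    human to review and never deletes anything.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Unreadable file or broken link: skip it, keep scanning.
                continue
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // 86400)
                stale.append((str(path), info.st_size, age_days))
    # Biggest files first, since they free the most space if removed.
    stale.sort(key=lambda item: item[1], reverse=True)
    return stale
```

Calling find_stale_files on the top of a departmental share (any path you supply) and printing the first 20 results gives a short list to take to the people who own the data. Last-modified time is a crude signal, and an old file is not necessarily a useless one, which is exactly why the output is a review list rather than a delete list.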
I truly hope that advanced archiving, data analysis, and storage tools will continue to evolve in such a way that they can allow us to be blasé about what we keep and still be able to find what we need when we need it. But if I were you, I'd start taking a hard look at constricting data growth and enforcing organizational standards. Just because you can buy huge storage resources cheaply doesn't mean you should. It may not be very much fun, but it's better than being buried alive.