Can metadata save us from cloud data overload?

Though many call for metadata to provide better approaches to data management in the cloud, there are no easy answers

It's clear that the growth of data is driving the growth of the cloud; as data centers run out of storage, enterprises spin up cloud storage instances.

Indeed, IDC's latest research on the digital universe suggests that data growth continues to accelerate. Recently, Paul Miller at GigaOm chimed in on this issue: "While the cost of storing and processing data is falling, the sheer scale of the problem suggests that simply adding more storage is not a sustainable strategy."

As Miller points out, metadata is one way to address data growth. The use of metadata allows users to effectively curate the bits and bytes for which they are responsible. But this may be a bit more difficult in practice than most expect.

The idea is simple. Much of the growth in data is due to a less-than-comprehensive understanding of the core data of record and, thus, the true meaning of the data. As a result, we're compelled to store everything and anything, including massive amounts of redundant information, both in the cloud and in the enterprise. Although it seems that the simple use of metadata will reduce the amount of redundant data, the reality is that it's only one of many tools that need to be deployed to solve this issue.
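
To make the idea concrete, here is a minimal sketch of one way metadata might expose redundant data: a small catalog that records a content fingerprint for each stored object and groups the locations that share one. The function names, the catalog layout, and the sample bucket paths are all hypothetical and chosen only for illustration. The sketch catches only byte-for-byte duplicates, which is part of why metadata alone is not the whole answer.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(payload: bytes) -> str:
    """Return a SHA-256 hex digest used as a metadata fingerprint."""
    return hashlib.sha256(payload).hexdigest()


def build_catalog(objects: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each fingerprint to the locations that hold that exact content."""
    catalog = defaultdict(list)
    for location, payload in objects.items():
        catalog[content_fingerprint(payload)].append(location)
    return dict(catalog)


def redundant_copies(catalog: dict[str, list[str]]) -> dict[str, list[str]]:
    """Keep only fingerprints that appear at more than one location."""
    return {fp: locs for fp, locs in catalog.items() if len(locs) > 1}


# Hypothetical sample data: the same report stored in two places.
sample_objects = {
    "cloud://bucket-a/report.csv": b"id,total\n1,100\n",
    "onprem://share/report_copy.csv": b"id,total\n1,100\n",
    "cloud://bucket-b/other.csv": b"id,total\n2,250\n",
}

duplicates = redundant_copies(build_catalog(sample_objects))
for fingerprint, locations in duplicates.items():
    print(fingerprint[:12], locations)
```

Near-duplicates, such as the same customer record held in slightly different formats by two systems, would pass through this check unnoticed.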

The management of data needs to be in the context of an overarching data management strategy. That means actually considering the reengineering of existing systems, as well as understanding the common data elements among the systems. Doing so requires much more than just leveraging metadata; it calls for understanding the information within the portfolio of applications, cloud or not. It eventually leads to the real fix.
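
As a rough illustration of what finding common data elements could look like at its simplest, the sketch below compares field names across systems and reports those that appear in more than one. The system names, field names, and the `common_elements` helper are hypothetical. Matching on normalized names is an assumption made for brevity; it says nothing about whether two fields with the same name actually mean the same thing.

```python
from collections import defaultdict


def common_elements(schemas: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return field name -> sorted systems, for fields used by 2+ systems."""
    seen = defaultdict(set)
    for system, fields in schemas.items():
        for field in fields:
            # Normalize case and whitespace so "Customer_ID" matches "customer_id".
            seen[field.strip().lower()].add(system)
    return {name: sorted(systems) for name, systems in seen.items() if len(systems) > 1}


# Hypothetical schemas for three applications in a portfolio.
sample_schemas = {
    "crm": ["Customer_ID", "email", "region"],
    "billing": ["customer_id", "invoice_total"],
    "support": ["ticket_id", "Email"],
}

shared = common_elements(sample_schemas)
for name, systems in shared.items():
    print(name, systems)
```

Deciding which system holds the data of record for each shared field still requires the application-level understanding described above.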

The problem with this approach is that it's a scary concept to consider. You'll have to alter existing applications, systems, and databases so that they're more effective, including how they use and manage information. That's a systemic change, which is much harder and riskier to do than spinning up a cloud server or adding more storage. But in the end, it addresses the problem the right way, avoiding an endless stream of stopgaps and Band-Aids.

This article, "Can metadata save us from cloud data overload?," originally appeared at InfoWorld.com. Read more of David Linthicum's Cloud Computing blog and track the latest developments in cloud computing at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.
