In a previous post, "Data is the new black," we discussed how data-centricity is fashionable again. This renewed focus on data, triggered in part by architectural style changes, is also caused by a dramatic shift in what “data” represents for the enterprise today.
When enterprise IT was clearly and solely in charge of the organization’s core “information assets,” these could be divided into infrastructure (network, servers, and so on), applications, and databases. The “data assets” contained in databases were therefore managed by IT on behalf of the business units that produced and consumed this data. These data assets, stored in systems ranging from mainframes to relational databases, would include records of customers, contracts, purchases, products, transactions, and more.
The needs and demands of the business have changed. New types and sources of data are now required.
Some of this data comes from outside the enterprise. It can be syndicated data purchased from a provider (stock or commodity quotes, demographics) or data provided by one or more trading partners. While this data may not differ vastly from traditional data assets in terms of structure and typology, the key difference is that the enterprise does not “own” it and therefore has little if any control over it. Data stewardship is therefore very different from what it would be for internally owned data. For example, data cleansing and filtering may take place at the ingress point, but there is no opportunity to resolve quality issues at their source.
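To make the ingress-point idea concrete, here is a minimal sketch of what cleansing a syndicated quote feed might look like. The record layout (`symbol`, `price`), the `Quote` type, and the validation rules are assumptions made up for illustration, not any provider's actual format. Note that bad records can only be set aside, since the source itself is out of reach.

```python
import math
from dataclasses import dataclass


@dataclass
class Quote:
    """A cleaned quote record (hypothetical shape, for illustration only)."""
    symbol: str
    price: float


def filter_at_ingress(raw_records):
    """Split incoming records into accepted quotes and rejected raw records.

    The rules here (non-empty symbol, finite positive price) are assumed
    examples of ingress checks, not a standard.
    """
    accepted, rejected = [], []
    for rec in raw_records:
        # Normalize the symbol; a missing or None value becomes "".
        symbol = str(rec.get("symbol") or "").strip().upper()
        try:
            price = float(rec.get("price"))
        except (TypeError, ValueError):
            # Price is missing or not numeric: nothing to repair locally.
            rejected.append(rec)
            continue
        if not symbol or not math.isfinite(price) or price <= 0:
            rejected.append(rec)
            continue
        accepted.append(Quote(symbol, price))
    return accepted, rejected


feed = [
    {"symbol": " acme ", "price": "12.5"},
    {"symbol": "XYZ", "price": "n/a"},
    {"symbol": "", "price": 3},
    {"symbol": "NEG", "price": -1},
]
accepted, rejected = filter_at_ingress(feed)
print(accepted)
print(len(rejected), "records rejected")
```

The rejected records are kept rather than silently dropped, so they can be reported back to the provider, which is about as far as stewardship of data you do not own can go.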
Previously unused data
Often referred to as “dark data,” this data is the byproduct of an existing process. For example, Web surfing and e-commerce produce clickstream and log files, which need to be mined for navigation paths, shopping cart abandonment patterns, and other valuable insight into user behavior. Any mobile device leaves a trail of GPS coordinates. Modern industrial equipment contains an array of sensors that indicate the performance and overall condition of the apparatus.
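As a rough illustration of mining a clickstream for cart abandonment, the sketch below flags sessions that added an item to a cart but never completed a purchase. The event format (pairs of session ID and action) and the action names `add_to_cart` and `purchase` are assumptions for the example; real log formats vary widely.

```python
def abandoned_cart_sessions(events):
    """Return the IDs of sessions that added to a cart but never purchased.

    `events` is an iterable of (session_id, action) pairs. The action
    names are hypothetical placeholders for whatever the logs record.
    """
    added = set()
    purchased = set()
    for session_id, action in events:
        if action == "add_to_cart":
            added.add(session_id)
        elif action == "purchase":
            purchased.add(session_id)
    # Sessions with a cart but no purchase count as abandoned.
    return added - purchased


clickstream = [
    ("s1", "view"),
    ("s1", "add_to_cart"),
    ("s1", "purchase"),
    ("s2", "view"),
    ("s2", "add_to_cart"),
    ("s3", "view"),
]
print(abandoned_cart_sessions(clickstream))
```

A production pipeline would also need to handle session timeouts and very large log volumes, but the underlying question asked of the data is the same.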
Shifting application paradigms
With the shift toward the cloud and the increasing prevalence of “everything as a service,” control and operation of applications shift from IT to external providers, and so do the methods for accessing the associated data. Modern SaaS vendors provide REST APIs to access data owned by their clients, but the way the data is managed and governed inside the system is up to the vendor.
These are only a few examples of the shifts in today’s data landscape: data is no longer contained inside the boundaries or the firewall of the enterprise, and its ownership is distributed. This highly decentralized, distributed, and even virtualized model introduces new challenges for both the business units (owners of this data, or at least of the processes that produce and use it) and the IT organization that, despite being bypassed by the business more and more often, remains the ultimate guardian of information assets.
Even though these assets are no longer just your daddy’s data.
This article is published as part of the IDG Contributor Network.