Addressing the data problem

SOA applications may span the enterprise, so they need consistent, application-independent data definitions. That's one big cleanup job.

One of the most unloved parts of IT is ensuring the integrity of data. Data comes from multiple sources -- disparate applications, outside partners -- typically with different assumptions about meaning and usage. That has led to difficult, ongoing efforts to rationalize data through transformation, scrubbing, and master-record systems. In the past, the effects of inconsistent data models and metadata could be confined to the interfaces between applications, usually through transformation efforts.

In SOA, however, individual services that make up a composite application may use data from a variety of sources, so the data integrity problem can no longer be neatly contained. SOA requires an underlying data architecture so that, no matter where the data originates, the metadata describing it is consistent enough to be understood the same way by all services using it. “At runtime, decisions are made by a set of rules that rely on master data to be correct, so core data management and master data management is very important,” says Tata’s Mohanty.

“SOA magnifies your data issues,” says Ed Vazquez, a vice president at the MomentumSI consultancy. “Before, you could paper those over, but not with SOA. A reusable service means reusable data.”

This means different business units must finally agree on standards for things such as customer information, even if no one group uses all of that information, and that IT must focus its data-cleansing efforts on the sources of data rather than just on what makes it into the data warehouse. “That’s why at the same time you are defining the enterprise process model and architecture, you need to identify the entire data and semantic model,” says Tata’s Iyers.
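As a rough illustration of what an agreed-upon customer definition might look like in practice, here is a minimal Python sketch. Everything in it is hypothetical: the `Customer` fields, the two source layouts (a CRM system and a billing system), and the mapping functions `from_crm` and `from_billing` are assumptions made for the example, not anything described by the people quoted here. The idea is simply that each source maps its own field names and conventions into one shared record, and that fields a given group does not use stay optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Hypothetical canonical customer record shared by all services."""
    customer_id: str
    full_name: str
    email: Optional[str] = None            # e.g. used by marketing, not billing
    billing_country: Optional[str] = None  # e.g. used by billing, not marketing


def from_crm(record: dict) -> Customer:
    """Map a record from an assumed CRM layout to the canonical form."""
    email = (record.get("email_addr") or "").strip().lower()
    return Customer(
        customer_id=str(record["id"]),
        full_name=f'{record["first"]} {record["last"]}'.strip(),
        email=email or None,
    )


def from_billing(record: dict) -> Customer:
    """Map a record from an assumed billing layout to the canonical form."""
    country = (record.get("country") or "").strip().upper()
    return Customer(
        customer_id=str(record["cust_no"]),
        full_name=record["name"].strip(),
        billing_country=country or None,
    )
```

A service that consumes `Customer` never needs to know whether the record started life as a CRM row keyed by `id` or a billing row keyed by `cust_no`. The cleanup work happens once, at the source, rather than at every interface.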

If you have a data mess on your hands, however, remember that cleaning it up will be time-consuming and will probably require long, boring meetings with the business side as the details of consistent data representation are ironed out.
