The growing acceptance of the SOA approach to enterprise applications has reopened an old IT wound: the sorry state of data in most enterprises. In the 1990s, the data warehouse and the enterprise repository were trumpeted as the solution for getting the entire enterprise on the same page, but these systems quickly became unwieldy dumping grounds, much like the cavernous Indiana Jones warehouse in which the Ark of the Covenant was stored to keep it safely out of reach.
Today, a new approach — often labeled master data management — is emerging, one that takes a modular, orchestration-based approach to rationalizing data strewn across the enterprise in various formats and repositories.
Similar to a complete SOA deployment, however, a complete master data management effort is a huge undertaking, one that takes years and consumes a lot of resources with marginal interim benefit. “You just can’t shut down the enterprise and do this major business re-engineering,” says Don DePalma, chief researcher at IT consultancy Common Sense Advisory. So what is IT to do?
Increasingly, companies are revisiting a mid-1990s approach — the data mart, now often called a system of record — that fell by the wayside during the data warehouse and repository crazes. Creating a system of record is a good way to start down the path of an enterprisewide master data management system. It helps IT get a handle on key data, making it more available to enterprise users and providing a demonstrable ROI in the process.
Data warehouses and enterprise repositories shunted aside the data mart of yore. In most cases, the data mart couldn’t do the job of being a timely data container, DePalma says. “It was a snapshot, so it was outdated,” he notes. But data marts can now be near-real-time repositories, thanks to a variety of advances in the intervening decade. These include standardization of data exchange around ODBC, JDBC, and SQL 99; increased use of Web interfaces for transmittal of real-time information; and better data management tools from vendors such as Business Objects and Informatica, he says.
The result is the improved data mart, or system of record, built using software from companies such as Business Objects, IBM, Informatica, or Oracle. “This is the next step to getting cleansed, standard versions of the key information that is important,” DePalma says.
From point solution to architecture
Nationwide Insurance started a data-rationalization effort two years ago, after the CFO decided that having multiple ledger systems — inherited through multiple acquisitions — was interfering with the company’s ability to see the complete financial picture.
Although the insurer could have considered a broad data architecture effort, it had a culture of departmental independence, so it made more sense to solve a specific need than to convince the organization at large to collaborate for an unclear benefit, says Vikas Gopal, director of enterprise financial applications at Nationwide. “On the financial side, there was a realization of the pain that led to the recognition of the need for consistent data translation, which in turn led to a need for data governance,” he says.
With the CFO’s mandate in hand, IT and business analysts dived into all 240 ledger-related systems and the data stored in each, then worked with executive management to decide what the enterprisewide ledger system needed.
After securing that agreement — and defining each piece of data in the ledger — the IT staff could go back to the individual product lines’ ledger systems and create the reports that the enterprisewide ledger required, Gopal says. That enterprise ledger is delivered through an Oracle PeopleSoft ERP system and Kalido data repository.
Because definitions and details about financials were inconsistent, Nationwide’s IT staff had to create data translation and cleansing rules, so the output matched the enterprisewide ledger’s format and definitions. Nationwide uses a commercial ETL (extract, transform, and load) tool from Informatica to perform the data conversion and import into the enterprise ledger.
That tool identifies any mismatches, kicking out the data and alerting IT. That process allows IT to see whether the problem is a technical one, such as a bad report file, or an unauthorized change to the master data format. If a frontline financial system changes how it defines data, IT adjusts the ETL rules to ensure the output matches the enterprisewide ledger’s standards. The standardized reports are generated from a variety of tools, including Hyperion Essbase and offerings from IBM and Kalido.
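The translate-and-reject pattern Nationwide describes can be sketched in a few lines. This is an illustrative sketch, not Nationwide’s actual Informatica rules: the account codes, field names, and mapping table are all hypothetical, standing in for a product line’s local chart of accounts and the enterprise ledger’s standard codes.

```python
# Hypothetical mapping from one product line's local account codes to the
# enterprise ledger's standard codes (illustrative values only).
ACCOUNT_MAP = {
    "REV-01": "4000",   # premium revenue
    "EXP-07": "5100",   # claims expense
}

REQUIRED_FIELDS = {"account", "amount", "period"}

def translate(record):
    """Apply the translation rule; raise if the record can't be standardized."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    code = ACCOUNT_MAP.get(record["account"])
    if code is None:
        raise ValueError(f"unmapped account: {record['account']}")
    return {"account": code, "amount": record["amount"], "period": record["period"]}

def load(records):
    """Split a source feed into loadable rows and rejects flagged for IT review."""
    loaded, rejected = [], []
    for rec in records:
        try:
            loaded.append(translate(rec))
        except ValueError as err:
            # "Kick out" the record and record why, rather than loading bad data.
            rejected.append((rec, str(err)))
    return loaded, rejected

feed = [
    {"account": "REV-01", "amount": 1200.0, "period": "2006-10"},
    {"account": "XYZ-99", "amount": 50.0, "period": "2006-10"},  # unauthorized code
]
good, bad = load(feed)
```

The rejected list is the hook for the alerting step: IT inspects the reason string to decide whether the fix is a new mapping entry (an authorized definition change) or a repair to the source feed.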
Beyond the consistent definition of data and the creation of ETL rules to produce data conforming to that standard definition, Gopal says a key change in Nationwide’s data management was the adoption of governance around its data. A committee of IT and business managers decides what the enterprisewide ledger should have, so the data definition and underlying data architecture do not drift in the future. “Before this effort, there was no assurance that the data we received was repeatable,” he says.
Pharmaceutical manufacturer Merck went through a similar process when it decided two years ago to standardize its data around product information. The catalyst was the deployment of an ERP system, which touched multiple departments and exposed data inconsistencies, recalls Joe Solfaro, executive director of information management at Merck. The company’s effort is now focused on understanding the data and developing a data architecture, which involves IT, business analysts, and outside consultants poring through the various databases, software applications, and business processes.
Merck is focused on understanding its data governance needs — what Solfaro calls “stewardship” — so each data element has a clear owner accountable for its consistency and definition. Although he expects to use some sort of data broker to manage data flow and translation to the standard definitions used by its SAP ERP system, Solfaro’s team isn’t focused on specific technologies right now. Only after the architectural and data definition work is completed in mid-2007 does he expect his team to determine the right technology implementation for ensuring data consistency. “It’s a forensics activity at this point,” Solfaro says.
Don’t lose sight of the vision
While both Nationwide’s and Merck’s efforts are focused on specific projects, they also are keeping an eye on the larger master data management goal.
For example, while Merck’s Solfaro works on the product information effort, another group will work to standardize customer information. Rather than work independently, creating data-architecture fiefdoms, the two groups are coordinating, using common principles in their efforts. That collaboration will make it easier to unify the data architectures and make data exchange much easier, Solfaro says. He expects the project to take a year.
“We will go from the local level to the enterprise level,” Solfaro says, using coordination of individual projects to lay the groundwork. Nationwide’s Gopal also coordinates with other departments with the goal of building toward a standard data architecture. “If you don’t tackle this, you can only work on patch-up projects,” he says.
Both Nationwide and Merck have focused on “one-way” data rationalization efforts, in which data moves along a specified path to a system of record. In an SOA environment, data paths won’t be so linear because services from different systems could be combined to create new functionalities that access data in a nonlinear way. But consultant DePalma encourages IT not to get hung up over this possibility. In many cases, even in the SOA context, there will be clear stages to data’s use and definition.
Customer information related to shipping will predictably be needed for shipping and call-center purposes, for example, so worrying about whether a customer-acquisition system knows what to do with that data isn’t worthwhile. The key is to ensure that such a system knows to ignore aspects of the data irrelevant to it, or never sees that data in the first place.
Rather than worry about designing a data architecture and data model that accounts for every possible use, DePalma says IT should focus on ensuring that the data model doesn’t become static. Definitions and needs will change, so IT should continually assess its data model and adjust its master data management systems accordingly. “You need structure and an application development model for future development so that new things plug in to your system,” he says.
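DePalma’s “new things plug in to your system” idea can be made concrete with a registry of data definitions rather than hard-coded checks: when a definition changes or a new one is needed, it is registered without touching existing code. This is a minimal sketch under assumed conventions; the field names and validation rules are hypothetical.

```python
# Registry mapping master-data field names to their validation rules.
VALIDATORS = {}

def definition(field):
    """Decorator that plugs a validation rule into the registry."""
    def wrap(fn):
        VALIDATORS[field] = fn
        return fn
    return wrap

@definition("period")
def valid_period(value):
    # Assumed enterprise standard: accounting periods formatted YYYY-MM.
    year, sep, month = value.partition("-")
    return sep == "-" and year.isdigit() and month.isdigit() and 1 <= int(month) <= 12

def conforms(record):
    """A record conforms if every registered definition it carries passes."""
    return all(fn(record[f]) for f, fn in VALIDATORS.items() if f in record)

# Later, as needs change, a new definition plugs in without modifying conforms():
@definition("currency")
def valid_currency(value):
    return value in {"USD", "EUR"}
```

Continual reassessment then becomes a matter of revising entries in the registry, which keeps the data model from hardening into the static structure DePalma warns against.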
That’s Merck’s philosophy as well: “From a grander viewpoint, our enterprise architects have already tried to unify the architecture and the strategy,” Solfaro says. “Even if there are tactical differences on projects, we’ll still be moving in the same strategic direction.”