Master your terms before you master your data

As master data management becomes a business imperative, IT needs to make sure every department is on the same page

When it comes to aggregating data from dozens of systems to create master data files, nothing is more important than ensuring everyone in your organization has what I would call a "master definition file."

There is no shortcut to master data management without it.

To be clear, I'm not talking about the kind of file cleanup you do when creating a data warehouse. Those kinds of definition files already exist, and they are more about making sure customer names are the same across the multiple systems that dump data into the warehouse. What I'm talking about is creating a common business glossary and common definitions across business and IT.


This essential of master data management comes to mind because IBM, one of the premier middleware software companies, pitched me recently on what it calls "information middleware."

I spoke with Tom Inman, vice president of information-on-demand acceleration, and Michael Curry, director of product management and strategy for IBM InfoSphere, about IBM's approach to managing information.

As IBM defines it, "information middleware" is geared toward creating, managing, and delivering trusted information throughout the enterprise. It goes beyond establishing a business glossary, IBM says. But before any organization can even approach master data management, business users must have a set of agreed-on definitions for terms used throughout the organization.

The importance of agreement on terms

Suppose you work for a banking institution. Your organization will define risk according to a certain set of criteria. But what if your company merges with a life insurance company where age is a key metric to measure risk? The newly formed organization needs to agree on a new definition for risk.

Or consider planning and forecasting. What is a lead? Who is a high-value customer? Definitions often vary from department to department.

Say the manager of a retail clothing store decides that a customer who buys $10,000 worth of clothes a year is a high-value customer. But what if the returns department knows that this particular customer returns $7,000 worth of clothes per year and considers that same customer a fraud risk?
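To make the mismatch concrete, here is a minimal sketch of the two departmental views applied to the same customer. Only the $10,000 and $7,000 figures come from the example above; the class, the function names, and the 50 percent return-ratio cutoff are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CustomerYear:
    purchases: float  # gross purchases for the year, in dollars
    returns: float    # value of goods returned in the same year, in dollars


# Hypothetical cutoffs. The 10,000 figure echoes the example; the ratio is invented.
HIGH_VALUE_GROSS = 10_000
FRAUD_RETURN_RATIO = 0.5


def store_manager_view(c: CustomerYear) -> str:
    """The store manager looks only at gross purchases."""
    return "high-value" if c.purchases >= HIGH_VALUE_GROSS else "standard"


def returns_dept_view(c: CustomerYear) -> str:
    """The returns department looks only at the share of goods sent back."""
    ratio = c.returns / c.purchases if c.purchases else 0.0
    return "fraud risk" if ratio >= FRAUD_RETURN_RATIO else "normal"


def net_spend(c: CustomerYear) -> float:
    """One candidate for a shared metric: purchases minus returns."""
    return c.purchases - c.returns


customer = CustomerYear(purchases=10_000, returns=7_000)
print(store_manager_view(customer))  # high-value
print(returns_dept_view(customer))   # fraud risk
print(net_spend(customer))           # 3000
```

Same customer, same year, two contradictory labels. Neither department is wrong by its own definition, which is exactly why the definition has to be agreed on before the data is merged.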

Having accurate information is critical.

What does "discharge date" for a patient mean? Is it when they walk out the door, or when the doctor signs off?

The key issue here is understanding information in context, says Judith Hurwitz, principal at Hurwitz & Associates. The tie department has its own data sources tied to its world view, including colors, sizes, and customers who buy 10 ties at a time. Then there's another department that sells another product. All are involved with customers at some level, but each one has its own business context.

Metadata: The metrics of context

In health care, suppose you're measuring operational efficiency on the assumption that patient discharge means the day the patient walks out the door, when in fact half of your hospitals define it as the time the doctor signs off. There may be incremental costs between these two events that you, as a hospital administrator, fail to factor into your budget.
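A small sketch shows how the two readings of "discharge" change a basic metric such as length of stay. The record, its field names, and every timestamp below are hypothetical and chosen only to illustrate the gap.

```python
from datetime import datetime

# Hypothetical patient record; all field names and timestamps are invented.
stay = {
    "admitted": datetime(2024, 3, 1, 9, 0),
    "doctor_signoff": datetime(2024, 3, 4, 10, 0),
    "left_building": datetime(2024, 3, 4, 16, 0),
}


def length_of_stay_hours(record: dict, discharge_field: str) -> float:
    """Hours from admission to whichever event counts as 'discharge'."""
    delta = record[discharge_field] - record["admitted"]
    return delta.total_seconds() / 3600


signoff_hours = length_of_stay_hours(stay, "doctor_signoff")
walkout_hours = length_of_stay_hours(stay, "left_building")

print(signoff_hours)                  # 73.0
print(walkout_hours)                  # 79.0
print(walkout_hours - signoff_hours)  # 6.0 hours that one definition never counts
```

If half the hospitals report the first number and half report the second, an average across them describes neither.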

Achieving alignment gives you higher-quality metrics and more control over your business, IBM's Curry says. Once you have that, you can make more informed decisions.

What Inman and Curry are really talking about is metadata. There's the data itself -- how many suits and ties Customer A buys -- and then there's the metadata, where you wrap a business definition around the definition of "valuable customer" using all of the metrics to rate that customer.
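One way to picture the split between data and metadata is a glossary entry that carries both the agreed wording and a rule that applies it. This is a sketch under stated assumptions, not IBM's product or schema: the GlossaryEntry structure, the $5,000 net-spend threshold, and Customer A's figures are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    definition: str                           # wording business and IT agree on
    rule: Callable[[Dict[str, float]], bool]  # the same definition as a test


# The data: what Customer A actually did. All figures are invented.
customer_a = {
    "suits_bought": 4,
    "ties_bought": 10,
    "annual_spend": 10_000,
    "annual_returns": 7_000,
}

# The metadata: what "valuable customer" means, stated once for everyone.
valuable_customer = GlossaryEntry(
    term="valuable customer",
    definition="Net annual spend (purchases minus returns) of at least $5,000",
    rule=lambda c: c["annual_spend"] - c["annual_returns"] >= 5_000,
)

print(valuable_customer.definition)
print(valuable_customer.rule(customer_a))  # False: net spend is 3,000
```

Keeping the rule next to the wording means a change to the definition is made in one place, rather than rediscovered report by report.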

In conversations with CIOs across the country, Henry Morris, senior analyst at IDC, notes that the idea of application and data consolidation is top of mind. In order to do the kinds of things IBM's Inman and Curry are suggesting, however, IT will need to make sure everyone is on the same page. Without attending to that first, consolidation can turn your once smooth-running but heterogeneous environment into a homogeneous mess.

There's no magic bullet to solve this issue, Hurwitz says. Companies need to establish a strategy that allows their executives to look at data across all the business units, channels, and partners so they can understand what it all means.