Working with data across an enterprise -- especially in an SOA environment -- requires understanding its context and semantics, not just its format and field attributes. And that means metadata. A repository would help both developers and services track that metadata. In theory, such repositories would provide the intermediary services, but with today’s technology, “this is just too ... hard to do,” says Paul Patrick, chief architect at BEA Systems. “No one has assembled the pieces yet.”
The metadata repositories in use today tend to be part of ETL (extract, transform, load) and business intelligence systems, says William McKnight, senior vice president of data warehousing at consultancy Conversion Services International. “Standalone repositories are complex, mainframe-oriented, very expensive, and not integrated with modern tools,” he says.
“Previous efforts at a metadata repository were a debacle,” says Don DePalma, president of the Common Sense Advisory consultancy. In addition to the high licensing costs, “the work to create an encyclopedia of all applications, with its indeterminate benefit, was too high,” he says. Not only were the tools “too academic,” they asked developers to adhere to very formal processes and methods at a time when “all of this formalization went out the window with the move to HTML” and the quick-and-dirty development of the early Web period, DePalma says.
But vendors are now revisiting the metadata repository concept. Some are incorporating the technology in their information management tools. For example, Xcalia uses an XML table-based metadatabase in its Intermediation Platform, which lets IT create metadata-based transformation rules so that services can interact with data sources consistently, with regard for the data’s context and semantics. The company hopes to develop a standalone metadata repository that allows these rules to be used by multiple applications, says Eric Samson, Xcalia’s CTO. And Informatica uses a metadata repository in its PowerCenter Data Federation data-integration platform, notes Ashutosh Kulkarni, Informatica’s principal product manager.
Other vendors, including BEA Systems and IBM, are also working on metadata repositories that are less expensive and easier to implement. “The master data management has to be in a repository, whether the architecture is federated, distributed, or centralized,” says Dan Drucker, director of enterprise master data solutions at IBM.