If you could do one thing to improve integration and automate processes with customers and business partners, it would be to implement XML, which has become the standard for exchanging information between disparate systems because it is easily transformed into any format. With very little effort, the same file can be sent to several different customers with their own specific needs. XML eases the development effort for the transmitting company and gives recipients a safety net for altering the way they use the data without having to alter how they receive it.
Being able to merge, query, and transform transmitted data with relational data is becoming as essential to businesses as data warehouses themselves. The good news is that the four leading relational databases, namely Oracle Database, IBM DB2, Sybase ASE (Adaptive Server Enterprise), and Microsoft SQL Server, not only can store XML data, but they hide much of the complexity of working with XML. Depending on which of these relational databases you use, however, the XML features you will have to work with may be extremely rich or limited in important ways.
What does a fashionable XML database provide? Four basic functions: the ability to consume, store, search, and generate XML. The extent to which the database supports these functions and the methods it uses to accomplish them are what make for a successful implementation of XML in a database.
I examined these four areas in Oracle Database 10g, IBM DB2 Universal Database V8.1, Sybase ASE 12.5.1, and Microsoft SQL Server 2000. I tested how they imported and read XML files, their options for saving the data, their indexing and query capabilities, and their options for creating XML and graded them based on the ease, flexibility, and speed with which they handled the most common XML operations.
Of course, these products have many other capabilities beyond handling XML. My grades should not be interpreted as complete evaluations.
Documents and Tables
Relational databases and XML documents are both powerful ways to represent relationships among data, but they're powerful in different ways. For example, querying on a patient ID number in a relational database may allow you to quickly find the dates a certain patient visited the hospital, the conditions he was diagnosed with, and the treatments he was given. But it likely won't help you determine which treatments were provided for which conditions or what times the treatments took place, nor will it give you other useful information that XML versions of these records could provide.
But whether or not you can combine the benefits of relational and XML data depends on how you store the XML. There are three methods for physically storing XML data in a relational database: shredded, unstructured, and structured. Shredded and unstructured are useful methods but limited. The structured method allows you to leverage the power of both relational data and XML hierarchies.
Shredding puts XML data into relational columns but strips it of its XMLness, meaning the hierarchical relationships among the data in the original XML document are lost. Shredding is useful when you're not concerned about keeping the data in XML format. For example, let's say you have a Web site that allows customers to place orders, and the order needs to go to a number of different database systems. Producing an XML file and having the different systems pick it up -- that is, shred it -- from a network share may be the most efficient and error-free way to get the data where you want it to go.