InfoWorld
Databases flex their XML

IBM, Microsoft, Oracle, and Sybase compete in our data management gymnastics

By Sean McCown
April 23, 2004
 

If you could do one thing to improve integration and automate processes with customers and business partners, it would be to implement XML, which has become the standard for exchanging information between disparate systems because it is easily transformed into any format. With very little effort, the same file can be sent to several different customers with their own specific needs. XML eases the development effort for the transmitting company and gives recipients a safety net for altering the way they use the data without having to alter how they receive it.
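As a rough illustration of sending the same file to recipients with different needs, the sketch below reads one XML document and renders it as both CSV and JSON. The catalog layout, element names, and function names are all invented for this example; they do not come from any of the products reviewed here.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical catalog document; the element and attribute names are assumptions.
CATALOG_XML = """
<catalog>
  <product sku="A-1"><name>Widget</name><price>9.50</price></product>
  <product sku="B-7"><name>Gadget</name><price>20.00</price></product>
</catalog>
"""

def to_records(xml_text):
    """Flatten each <product> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {"sku": p.get("sku"), "name": p.findtext("name"), "price": p.findtext("price")}
        for p in root.findall("product")
    ]

def to_csv(records):
    """Render the records for a recipient that expects CSV."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["sku", "name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

def to_json(records):
    """Render the same records for a recipient that expects JSON."""
    return json.dumps(records)

records = to_records(CATALOG_XML)
csv_text = to_csv(records)
json_text = to_json(records)
print(csv_text)
print(json_text)
```

The sender maintains one source document; each recipient's format is just another rendering function.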

Being able to merge, query, and transform transmitted data with relational data is becoming as essential to businesses as data warehouses themselves. The good news is that the four leading relational databases, namely Oracle Database, IBM DB2, Sybase ASE (Adaptive Server Enterprise), and Microsoft SQL Server, not only can store XML data, but they hide much of the complexity of working with XML. Depending on which of these relational databases you use, however, the XML features you will have to work with may be extremely rich or limited in important ways.

What does a fashionable XML database provide? Four basic functions: the ability to consume, store, search, and generate XML. The extent to which the database supports these functions and the methods it uses to accomplish them are what make for a successful implementation of XML in a database.

I examined these four areas in Oracle Database 10g, IBM DB2 Universal Database V8.1, Sybase ASE 12.5.1, and Microsoft SQL Server 2000. I tested how they imported and read XML files, what options they offered for saving the data, how well they indexed and queried it, and what options they offered for creating XML. I then graded them on the ease, flexibility, and speed with which they handled the most common XML operations.

Of course, these products have many other capabilities beyond handling XML. My grades should not be interpreted as complete evaluations.

Documents and Tables

Relational databases and XML documents are both powerful ways to represent relationships among data, but they're powerful in different ways. For example, querying on a patient ID number in a relational database may allow you to quickly find the dates a certain patient visited the hospital, the conditions he was diagnosed with, and the treatments he was given. But it likely won't help you determine which treatments were provided for which conditions or what times the treatments took place, nor will it give you other useful information that XML versions of these records could provide.
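To make the contrast concrete, here is a minimal sketch of a patient record in which the XML nesting itself records which treatments belong to which condition. The record layout and every element and attribute name are hypothetical, chosen only to mirror the hospital example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical patient record; the structure and names are invented for illustration.
RECORD = """
<patient id="1001">
  <visit date="2004-03-02">
    <condition name="fracture">
      <treatment time="09:15">cast</treatment>
      <treatment time="09:40">analgesic</treatment>
    </condition>
    <condition name="laceration">
      <treatment time="10:05">sutures</treatment>
    </condition>
  </visit>
</patient>
"""

def treatments_by_condition(xml_text):
    """Map each condition to the (time, treatment) pairs nested beneath it."""
    root = ET.fromstring(xml_text)
    result = {}
    for condition in root.iter("condition"):
        result[condition.get("name")] = [
            (t.get("time"), t.text) for t in condition.findall("treatment")
        ]
    return result

pairs = treatments_by_condition(RECORD)
print(pairs)
```

The condition-to-treatment link comes for free from the document's hierarchy; no join key has to be designed in advance.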

But whether or not you can combine the benefits of relational and XML data depends on how you store the XML. There are three methods for physically storing XML data in a relational database: shredded, unstructured, and structured. Shredded and unstructured are useful methods but limited. The structured method allows you to leverage the power of both relational data and XML hierarchies.

Shredding puts XML data into relational columns but strips it of its XMLness, meaning the hierarchical relationships among the data in the original XML document are lost. Shredding is useful when you're not concerned about keeping the data in XML format. For example, let's say you have a Web site that allows customers to place orders, and the order needs to go to a number of different database systems. Producing an XML file and having the different systems pick it up -- that is, shred it -- from a network share may be the most efficient and error-free way to get the data where you want it to go.
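A minimal sketch of shredding, assuming a made-up order document and using Python's built-in SQLite module as a stand-in for the receiving database (it is not one of the four products reviewed). The table and column names are assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order file as it might be picked up from a network share.
ORDER_XML = """
<order id="42" customer="ACME">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="1"/>
</order>
"""

def shred_order(conn, xml_text):
    """Copy the order's values into relational rows; the XML itself is discarded."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order_id, root.get("customer")),
    )
    rows = [
        (order_id, item.get("sku"), int(item.get("qty")))
        for item in root.findall("item")
    ]
    conn.executemany(
        "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)", rows
    )
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")

shredded = shred_order(conn, ORDER_XML)
items = conn.execute(
    "SELECT sku, qty FROM order_items WHERE order_id = 42 ORDER BY sku"
).fetchall()
print(shredded, items)
```

After the load, only ordinary rows remain. The parent-child nesting survives solely as the `order_id` foreign key that the loader chose to write.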

The unstructured method uses a data type called a CLOB (Character Large Object) to store an entire XML document as a single unit. Databases have been doing this for years with different types of documents, so this is nothing new. The unstructured method provides only limited search capabilities, and you can't base queries on the document's internal structure, but the original document is preserved intact, which still makes the method quite useful. A good use for unstructured XML storage would be keeping original documents to comply with government regulations. For example, if a financial institution were to receive original loan documents in XML, it could keep a relational record of each loan application and store the original application with that record.
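The loan example might look roughly like the sketch below. SQLite's `TEXT` column stands in for a CLOB here, and the table, columns, and document contents are hypothetical. Note that the only search available on the stored document is plain substring matching.

```python
import sqlite3

# Hypothetical loan application, stored exactly as received.
LOAN_XML = (
    '<loanApplication id="L-17">'
    "<applicant>J. Doe</applicant>"
    '<amount currency="USD">250000</amount>'
    "</loanApplication>"
)

conn = sqlite3.connect(":memory:")
# Relational columns for day-to-day queries, plus the untouched original document.
conn.execute(
    "CREATE TABLE loans (loan_id TEXT PRIMARY KEY, applicant TEXT, original_doc TEXT)"
)
conn.execute("INSERT INTO loans VALUES (?, ?, ?)", ("L-17", "J. Doe", LOAN_XML))

# Retrieval returns the document as one opaque string.
stored = conn.execute(
    "SELECT original_doc FROM loans WHERE loan_id = ?", ("L-17",)
).fetchone()[0]

# Searching is limited to text matching; the database knows nothing about elements.
matches = conn.execute(
    "SELECT loan_id FROM loans WHERE original_doc LIKE ?",
    ("%<applicant>J. Doe</applicant>%",),
).fetchall()
print(stored == LOAN_XML, matches)
```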

The structured method allows you to store XML data inside the database and preserve the hierarchy of the data. Structured storage, also known as "native XML" storage, is what every vendor is trying to achieve. The most obvious benefit of preserving the hierarchical relationships of XML data is being able to receive an XML document, combine it or manipulate it with relational data, and produce XML as a result. It isn't possible to produce such result sets with a relational query language alone.
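The receive, combine, and produce cycle can be approximated in application code, as in the sketch below. This is only an analogy: a database with native XML storage would do the equivalent work inside the engine, and nothing here reflects the syntax of any product reviewed. SQLite, the `products` table, and the order layout are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical incoming order document.
ORDER_XML = """
<order id="42">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="1"/>
</order>
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO products VALUES (?, ?)", [("A-1", 9.5), ("B-7", 20.0)])

def enrich_order(conn, xml_text):
    """Merge relational prices into the XML tree and return XML, hierarchy intact."""
    root = ET.fromstring(xml_text)
    total = 0.0
    for item in root.findall("item"):
        row = conn.execute(
            "SELECT price FROM products WHERE sku = ?", (item.get("sku"),)
        ).fetchone()
        if row is None:
            continue  # unknown SKU: leave the item unpriced
        price = row[0]
        item.set("price", f"{price:.2f}")
        total += price * int(item.get("qty"))
    root.set("total", f"{total:.2f}")
    return ET.tostring(root, encoding="unicode")

result_xml = enrich_order(conn, ORDER_XML)
print(result_xml)
```

The output is still an order document with its items nested in place, now carrying data that originated in a relational table.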


Only Microsoft SQL Server 2000, among the four databases tested, does not support structured XML storage. And while all four, including Microsoft, allow you to reap the benefits of the shredded and unstructured methods, the approaches they take and flexibility they provide can differ considerably.

Oracle Sets the Curve

Oracle Database 10g breaks new ground in support for XML technology, offering very rich features for importing, storing, querying, and generating XML data. Providing native, structured XML storage as well as support for unstructured document storage and shredding, Oracle Database 10g allows you to pull XML data from files and merge it with relational data in views. But before jumping into an upgrade for the enhanced XML capabilities, Oracle shops should note that most of the functionality is available in Database 9i.


Sean McCown is senior corporate DBA at SourceCorp.
 
