Inside IBM DB2 Viper
A technological marvel, IBM's new XML-powered server aims to change the face of database storage
It’s certainly possible to imagine applications that make good use of a hybrid XML/relational data store. A clinical database, for example, might contain a relational patient table with all of the relevant information about a patient, plus a list of allergies stored as XML. This kind of record could be modeled relationally, but using XML is a good way to reduce the number of joins and ease development effort, because you no longer have to maintain relationships between patients and allergies. You could do something similar with orders and order details, where each order stores its line items as XML instead of in the classic line-item table.
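To make the patient example concrete, here is a minimal sketch of the hybrid layout. It is not DB2 code: it uses Python's built-in SQLite with a plain TEXT column standing in for a native XML column, and the table, column, and element names (`patient`, `allergies`, `allergy`) are assumptions made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory SQLite database as a stand-in; a TEXT column plays the role
# that a native XML column would play in a hybrid store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, allergies TEXT)"
)

# Hypothetical allergy list kept as one XML document per patient,
# instead of rows in a separate allergy table.
allergies_xml = (
    "<allergies>"
    '<allergy severity="high">penicillin</allergy>'
    '<allergy severity="low">latex</allergy>'
    "</allergies>"
)
conn.execute(
    "INSERT INTO patient (id, name, allergies) VALUES (?, ?, ?)",
    (1, "Jane Doe", allergies_xml),
)


def allergies_for(conn, patient_id):
    """Return the allergy names for one patient, with no join required."""
    row = conn.execute(
        "SELECT allergies FROM patient WHERE id = ?", (patient_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return []
    root = ET.fromstring(row[0])
    return [allergy.text for allergy in root.findall("allergy")]


print(allergies_for(conn, 1))
```

The single-row lookup is the point: the allergy list travels with the patient record, so there is no patient-to-allergy relationship to maintain. The trade-off is that the XML has to be parsed, here in application code, before anything inside it can be filtered on.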
As with any performance increase, you have to ask what IBM’s optimizations for XML data will mean to you and your shop. For tasks such as loading millions of rows into a database, a 7x improvement is a big deal, but for the casual insert statement it just isn’t significant. Customers will most likely see improvements in two scenarios: when the database is being pounded by thousands upon thousands of XML inserts, and when the database is loading enormous XML files.
One very interesting feature of the pureXML engine is that it preserves the digital signatures of signed XML files. If you receive a digitally signed XML file, you can load it into the database, retrieve it at any time in the future, and the digital signature will still be intact. Microsoft and Oracle can’t do that, but then again, it isn’t a widespread requirement.
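A rough sketch of why this matters: a signature is computed over the document's exact content, so a store that breaks the XML into values and rebuilds it later can hand back something that no longer verifies. The example below is an illustration under stated assumptions, not a description of any vendor's internals. A SHA-256 digest stands in for a real XML signature, and `shred_and_rebuild` is a hypothetical helper that mimics storing only the extracted values.

```python
import hashlib
import xml.etree.ElementTree as ET

# The document as it arrived, including the whitespace between elements.
signed_doc = b'<order id="42">\n  <item qty="2">widget</item>\n</order>'


def digest(data: bytes) -> str:
    """SHA-256 digest, used here as a stand-in for a real XML signature."""
    return hashlib.sha256(data).hexdigest()


def shred_and_rebuild(doc: bytes) -> bytes:
    """Hypothetical store that keeps only the values, then regenerates XML."""
    root = ET.fromstring(doc)
    item = root.find("item")
    order_id, qty, name = root.get("id"), item.get("qty"), item.text

    rebuilt = ET.Element("order", {"id": order_id})
    child = ET.SubElement(rebuilt, "item", {"qty": qty})
    child.text = name
    return ET.tostring(rebuilt)


verbatim = signed_doc                    # document stored and returned intact
rebuilt = shred_and_rebuild(signed_doc)  # same data, different document

print(digest(verbatim) == digest(signed_doc))  # digest still matches
print(digest(rebuilt) == digest(signed_doc))   # digest no longer matches
```

Real XML signatures canonicalize the document before hashing, which forgives some cosmetic differences, but whitespace between elements, as lost here, is generally significant.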
Thus, as cool as it may be, I can’t see pureXML significantly reducing TCO (total cost of ownership). So far, its coolness seems to be mostly technology for technology’s sake. Just because DB2 offers a piece of functionality doesn’t necessarily make using it the best strategy.
Scaling new heights
Fortunately, DB2’s XML capabilities aren’t the only improvements in the new release. Far from it. Scalability is another area to which IBM has given special attention.
For starters, by using a larger record identifier, DB2 9.1 allows admins to create temporary work tables for system and user queries that are much larger than was previously possible. The maximum size of a single table has also been increased to a whopping 1.1 trillion rows or 16TB, whichever comes first. Of course, both of these are quite dangerous. Should you actually create objects this large, you’re going to have severe performance problems. Still, if it’s a choice between doing it slowly and not doing it at all, you’re better off with what DB2 gives you.
It’s like DB2’s query limit. DB2 allows queries up to 2MB long, so I decided to do an experiment. I pasted a query into Word until it reached 2MB, and the result was somewhere in the neighborhood of 64 pages. While I can’t imagine a single query that long, I suppose it’s useful to somebody. Likewise, if you foresee having more than a trillion rows in your tables, you’re in luck with DB2.