Attention in the wide universe of databases and content management has been drawn lately to XML and, specifically, XML databases. You’ll get a good indication of the state of XML-based content management technology by examining developments at the ground floor: the XML database libraries that form a base for larger content management applications.
Two such libraries are the targets of this review: the Apache Software Foundation’s Xindice and Sleepycat’s Berkeley DB XML. Both are open source, both are free (although the nature of “free” differs between them), and both provide standards-compliant XML document manipulation. In addition, both are powerful developer tools that place eye-opening XML document storage, query, and retrieval capabilities into the hands of eager programmers.
Apache Xindice 1.0
Apache Xindice began as the dbXML Core project, but the fruit of that labor was transferred to the Xindice group sometime after 2001. Xindice’s documentation makes no bones about its intended audience: It will be of interest only to developers in need of a solution for storing and manipulating XML data.
Likewise, the Xindice Web site is clear about the package’s limitations; unlike Berkeley DB XML, Xindice does not deal well with large XML documents. Small-to-moderate documents are best for Xindice, although there’s no precise definition of a “small-to-moderate” XML document -- a megabyte or smaller is probably in the ballpark.
Installation is simple and deposits on your system the Xindice server executable, a command-line tool, documentation, source, and a number of examples. Xindice is written entirely in Java, so you’ll need JDK 1.3 or later installed to run the Xindice JAR (Java Archive) file.
The programming interface -- the XML:DB API -- is Java as well, but Xindice does not limit itself to the Java language. It is built on a client-server architecture and supports the XML-RPC API, so remote Java clients can access the server, as can clients written in other programming languages.
Xindice arranges its storage in the form of “collections,” and all collections exist within a root instance, “/db”. Think of collections as subfolders in a file system; collections can contain “subcollections” to an arbitrary depth. The “files” in this analogy are the actual XML documents. Querying and updating are typically applied collectionwide, although you can adjust the granularity to manipulate individual documents.
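To make the folder analogy concrete, here is a minimal in-memory sketch of a collection tree in Java. It is a toy model only: the class `CollectionSketch` and its methods are hypothetical names invented for illustration and are not part of Xindice or its programming API. It shows nothing more than how a root named `db` can hold subcollections to any depth, with documents stored under keys at each level.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the collection hierarchy; NOT the Xindice API.
class CollectionSketch {
    final String name;
    final Map<String, CollectionSketch> children = new LinkedHashMap<>();
    final Map<String, String> documents = new LinkedHashMap<>();

    CollectionSketch(String name) {
        this.name = name;
    }

    /** Returns the subcollection with the given name, creating it if needed. */
    CollectionSketch child(String childName) {
        return children.computeIfAbsent(childName, CollectionSketch::new);
    }

    /** Stores an XML document under a key in this collection. */
    void addDocument(String key, String xml) {
        documents.put(key, xml);
    }

    /** Resolves a path such as "/db/addressbook", starting at this root collection. */
    CollectionSketch resolve(String path) {
        String[] parts = path.split("/");
        // parts[0] is empty because of the leading slash; parts[1] must name the root.
        if (parts.length < 2 || !parts[0].isEmpty() || !parts[1].equals(name)) {
            throw new IllegalArgumentException("Path must start with /" + name + ": " + path);
        }
        CollectionSketch current = this;
        for (int i = 2; i < parts.length; i++) {
            current = current.children.get(parts[i]);
            if (current == null) {
                throw new IllegalArgumentException("No such collection: " + path);
            }
        }
        return current;
    }

    /** Lists the full path of every document in this collection and its subcollections. */
    List<String> listPaths(String prefix) {
        String here = prefix + "/" + name;
        List<String> paths = new ArrayList<>();
        for (String key : documents.keySet()) {
            paths.add(here + "/" + key);
        }
        for (CollectionSketch sub : children.values()) {
            paths.addAll(sub.listPaths(here));
        }
        return paths;
    }

    public static void main(String[] args) {
        CollectionSketch root = new CollectionSketch("db");
        root.child("addressbook").addDocument("ann", "<person/>");
        root.child("addressbook").child("archive").addDocument("old", "<person/>");
        System.out.println(root.listPaths(""));
    }
}
```

The nested maps mirror the file-system picture: a path such as `/db/addressbook/archive` names a collection, and one more segment names a document inside it.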
Command-line control
Xindice’s command-line tool is a godsend for new users. Experimenting with it provides an excellent introduction to Xindice’s capabilities and will give you a good feel for the programming API when it’s time to turn your attention to development. The command-line tool is also useful for jump-starting your database. The tool creates new collections, feeds XML documents into the collections, and even feeds whole subdirectory hierarchies into Xindice (in which case the subfolders appear in the database as subcollections).
Xindice uses XPath for querying collections and XUpdate for updating them. It would be nice if XQuery were supported, as it provides for much richer querying, but for now XQuery support is an entry on the Xindice team’s to-do list. The command-line tool is a great way to test out XPath and XUpdate expressions, but as of this writing the documentation for it is incomplete and leads one to erroneously conclude that XUpdate is not supported.
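For readers who have not written XPath before, the sketch below shows the kind of expression involved. It deliberately does not call Xindice: it uses the XPath engine bundled with current JDKs (`javax.xml.xpath`, which requires a much newer Java than the JDK 1.3 minimum mentioned earlier), and the address-book document and the class name `XPathSketch` are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: evaluates XPath with the JDK's built-in engine, not Xindice.
class XPathSketch {
    // Hypothetical sample document; not taken from the Xindice distribution.
    static final String ADDRESS_BOOK =
        "<addressbook>"
        + "<person><name>Ann</name><city>Boston</city></person>"
        + "<person><name>Raj</name><city>Austin</city></person>"
        + "<person><name>Lee</name><city>Boston</city></person>"
        + "</addressbook>";

    /** Evaluates an XPath expression against an XML string and returns the text of each matching node. */
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> results = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                results.add(nodes.item(i).getTextContent());
            }
            return results;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Select the names of everyone whose city is Boston.
        System.out.println(select(ADDRESS_BOOK, "/addressbook/person[city='Boston']/name"));
    }
}
```

The predicate in square brackets filters `person` elements by the value of a child element. The same expression syntax is what you would hand to an XPath-based query facility; how Xindice's own tool or API accepts it is not shown here.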
| Test Center Scorecard | 20% | 20% | 20% | 20% | 10% | 10% | Overall score |
|---|---|---|---|---|---|---|---|
| Apache Xindice 1.0 | 9 | 8 | 9 | 9 | 7 | 9 | 8.6 (Very Good) |
| Sleepycat Berkeley DB XML 2.0 | 10 | 9 | 9 | 9 | 9 | 10 | 9.3 (Excellent) |
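The overall figures in the scorecard are consistent with a simple weighted average of the six criterion scores, using the percentages shown as weights. The short Java sketch below checks that arithmetic. The class and method names are hypothetical, and treating the overall score as a weighted average rounded to one decimal place is an assumption that happens to reproduce the printed numbers, not a documented scoring method.

```java
// Sketch: reproduces the scorecard totals as a weighted average (an assumption).
class ScorecardSketch {
    // Weights in percent, in the column order printed in the scorecard.
    static final int[] WEIGHTS_PERCENT = {20, 20, 20, 20, 10, 10};

    /** Returns the weighted average of the scores, rounded to one decimal place. */
    static double overall(int[] scores) {
        if (scores.length != WEIGHTS_PERCENT.length) {
            throw new IllegalArgumentException("Expected " + WEIGHTS_PERCENT.length + " scores");
        }
        int weightedSum = 0; // sum of score * percent, kept in integers to avoid drift
        for (int i = 0; i < scores.length; i++) {
            weightedSum += scores[i] * WEIGHTS_PERCENT[i];
        }
        // weightedSum / 100 is the average; scale to tenths, round, then scale back.
        return Math.round(weightedSum / 10.0) / 10.0;
    }

    public static void main(String[] args) {
        System.out.println(overall(new int[] {9, 8, 9, 9, 7, 9}));    // Apache Xindice 1.0
        System.out.println(overall(new int[] {10, 9, 9, 9, 9, 10}));  // Sleepycat Berkeley DB XML 2.0
    }
}
```

For Xindice the sum is 20 × (9 + 8 + 9 + 9) + 10 × (7 + 9) = 860, or 8.6 out of 10; for Berkeley DB XML it is 20 × (10 + 9 + 9 + 9) + 10 × (9 + 10) = 930, or 9.3.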
