June 13, 2003

IBM sprinkles Cinnamon on Content Manager

Big Blue to bolster XML in forthcoming version

IBM is working on ways to make XML documents and data easier to pull into its content management software, and to index and search the data once it is there.

The initiative, code-named Cinnamon, is currently under development within IBM's research arm, according to Jim Reimer, chief architect of content management at Somers, N.Y.-based IBM.

"The technology here is against the backdrop of our DB2 Content Manager products. The technology relates to handling XML documents and doing tasks such as automatic ingesting of the documents," Reimer said.

Until now, XML handling in IBM's content management software and DB2 database has focused on receiving XML documents written to different DTDs and schemas and, in effect, mapping them into rows in a database. That is a parsing, extraction, and flattening process: XML documents from different sources are taken in, and values from them are populated into particular columns, he said.
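The flattening Reimer describes can be illustrated with a minimal sketch. It is not IBM's implementation: the element names, the column names, and the `MAPPINGS` table are hypothetical, and the sketch assumes each source schema can be recognized by its root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings, one per source schema, keyed by root element name.
# Each maps a target column to an element path within that schema.
MAPPINGS = {
    "article": {"title": "head/title", "author": "head/byline"},
    "memo": {"title": "subject", "author": "from"},
}

def flatten(xml_text):
    """Parse one document and return a flat row (column name -> value)."""
    root = ET.fromstring(xml_text)
    mapping = MAPPINGS[root.tag]  # pick the mapping for this schema
    return {col: root.findtext(path) for col, path in mapping.items()}

# Two documents from different sources, written to different schemas.
docs = [
    "<article><head><title>Q2 report</title><byline>Lee</byline></head></article>",
    "<memo><subject>Budget</subject><from>Kim</from></memo>",
]

# Both end up as rows with the same columns.
rows = [flatten(d) for d in docs]
```

Both documents land in the same two columns, which is the point of flattening, but any structure beyond those columns is lost.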

"In that context, when content systems are done, it's necessary to use much more complete or complex ways of expressing what's in the collection. One of the ways of gauging the completeness of a content management system is how rich a model you are able to manage for the way in which you are describing the content objects that are in the collection," he said. "Content systems frequently have much more extensive description methods, like hierarchy and structure, like folders or folders in folders."

In its latest release, Content Manager Version 8, IBM extended what can be represented in a data collection, adding primitives and data modeling services for constructs that can be expressed in an XML document, including multi-valued attribute sets, arbitrary hierarchy, links, and relationships.

"The challenge if you have such documents is how to get them into CM and, secondly, how to deal with the landscape where you have evolving DTDs and schemas over time and different authors, writing in different DTDs and schemas, that are giving you content," Reimer explained.

The underlying technology aimed at this problem of mapping, administration, and adaptation to evolving schemas comes from another project within IBM's research arm, dubbed Clio, which is part of the overall eXperanto effort.

Cinnamon, then, is IBM's effort to extend that technology base in two ways. First, it lets users take complex XML documents, whatever they express and whatever DTDs and schemas they follow, and manage the mapping that defines how to project them into the full data modeling services of CM. Second, at runtime, the goal is to handle the automatic ingesting of those documents, including all the parsing, extraction, and projection into the new data model, Reimer explained.
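By contrast with flattening into rows, projecting a document into a richer model keeps its hierarchy and repeated values. The following is only an illustrative sketch under stated assumptions, not Cinnamon's or Content Manager's actual mechanism: the `project` function, the item layout (`type`, `attributes`, `children`), and the sample folder document are all hypothetical.

```python
import xml.etree.ElementTree as ET

def project(elem):
    """Turn an element into a nested item instead of a flat row."""
    item = {"type": elem.tag, "attributes": {}, "children": []}
    for child in elem:
        if len(child) == 0:
            # Leaf element: store as an attribute; repeated tags
            # accumulate in a list (a multi-valued attribute).
            item["attributes"].setdefault(child.tag, []).append(child.text)
        else:
            # Element with its own children: keep it as a nested item,
            # which preserves "folders in folders".
            item["children"].append(project(child))
    return item

doc = """<folder>
  <name>Contracts</name>
  <keyword>legal</keyword>
  <keyword>2003</keyword>
  <folder><name>Signed</name></folder>
</folder>"""

item = project(ET.fromstring(doc))
```

Here the two `keyword` elements survive as one multi-valued attribute and the inner folder survives as a child item, neither of which fits naturally into a single database row.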

"It's a key step for being able to improve the productivity of ingesting such documents in that complex of an environment. It's very important also to be able to live with the evolution of those schemas," Reimer said.

Additionally, the Cinnamon effort is a step toward administrative controls in the product that eliminate the need for programming, by having content automatically ingested, projected, and modeled in the same system, he said.

Stephen O'Grady, an analyst at RedMonk, in Bath, Maine, said that IBM, with Cinnamon, is potentially addressing the future problem that companies will have as they collect more and more XML documents.

"There's no question that having documents in XML will be advantageous to companies for a host of reasons, such as indexing and personalization," O'Grady said. "It's going to be a problem because companies will have to really know what they are doing for indexing and retrieval."

O'Grady estimated that major companies will face these issues in approximately a year and a half.

Cinnamon is in what IBM calls the technology preview stage and will come to market as one of the administrative tools included with a future version of DB2 Content Manager, due sometime next year, Reimer said.

InfoWorld Editor at Large Tom Sullivan covers a variety of topics for news and features and produces the InfoWorld Daily podcast.

©1994-2009 Infoworld, Inc.