Content is the lifeblood of any organization that relies on information. If documents are lost in file cabinets or hidden
away on hard drives, the knowledge they carry is buried. But when content is organized and searchable, that information lives
on. It does useful work over and over again as it is referenced, consulted, and combined with other information.
The two CMSes (content management systems) in this review create organized and searchable repositories of digital documents.
At first glance, the two products appear similar, and, fundamentally, they are. Both, for example, make extensive use of XML.
Closer inspection, however, reveals that each is designed for somewhat different uses of content.
Daisy is an open source CMS whose strength is its flexible organization and navigation capabilities. Ixiasoft’s TeXtML is
a commercial CMS that takes a more straightforward approach to content organization, but excels at text search.
Daisy 1.3
Daisy is an exceptionally modular system; its designers purposely decoupled its internal organs for greater flexibility. Among
those pieces is the back-end database, MySQL. There’s the repository server, which manages the storage and retrieval of documents.
The OpenJMS Java messaging service informs applications of updates to the repository. Finally, the Daisy wiki front end provides
a dynamic, Web-based view of the repository and access to its contents.
The database back end and the repository server are Daisy’s core components. The OpenJMS service is more ancillary: It passes
status events to apps that request notification of changes in the repository’s content or structure.
Strictly speaking, Daisy’s wiki component is merely an example of a front end for the repository. Daisy’s creators describe
Daisy as a “content management framework,” precisely because it could be used to support other front ends.
Mind you, Daisywiki is not simply a sample application; it’s a fully functioning wiki, complete with a built-in editor, versioning,
search pages, PDF publishing, and more.
I installed Daisy on my test system and, with the exception of a problem with Internet Explorer 5.0, I had it running within
a half-hour. The installation constructs a small wiki-based Web site populated with an initial Welcome page. The installation
includes all the tools for adding new documents, editing existing ones, adding and managing users, and so on. Because all
the site’s pages are built from documents in a Daisy repository, Daisywiki is an excellent mechanism for exploring how Daisy
works.
Inside Daisy
The internal structure of Daisy’s repository is unusual in that there is none. There are no folders or sub-folders, no collections
— just a container in which documents float about like the meat and potatoes of a digital stew. All is not anarchy, though.
First, documents themselves are structured, being composed of parts and fields. A part carries binary data of a specific MIME
type (RTF information or image data, for example), and a field carries simple data (such as a numeric value, a date, or a
string). The structure and allowed content of a document’s parts and fields are defined by the document’s type (which is specified
in yet another document). Every document within a repository must therefore adhere to one of the defined document types. You can
define as many document types as your imagination permits.
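To make the parts-and-fields idea concrete, here is a minimal Python sketch of a document type acting as a template that documents are checked against. Every name in it (Part, DocumentType, Document, validate, the "Picture" type and its "ImageData" part) is hypothetical and chosen for illustration; it is not Daisy's actual schema or API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Union

# Hypothetical names throughout; this is not Daisy's actual schema or API.
FieldValue = Union[int, float, str, date]


@dataclass
class Part:
    mime_type: str  # e.g. "image/png" or "application/rtf"
    data: bytes     # the binary payload


@dataclass
class DocumentType:
    name: str
    part_mime_types: Dict[str, str]  # part name -> required MIME type
    field_types: Dict[str, type]     # field name -> required Python type


@dataclass
class Document:
    doc_type: DocumentType
    name: str
    parts: Dict[str, Part] = field(default_factory=dict)
    fields: Dict[str, FieldValue] = field(default_factory=dict)


def validate(doc: Document) -> List[str]:
    """Return a list of problems; an empty list means the document conforms."""
    problems = []
    for part_name, part in doc.parts.items():
        expected = doc.doc_type.part_mime_types.get(part_name)
        if expected is None:
            problems.append(f"unknown part: {part_name}")
        elif part.mime_type != expected:
            problems.append(f"part {part_name}: expected {expected}, got {part.mime_type}")
    for field_name, value in doc.fields.items():
        expected_type = doc.doc_type.field_types.get(field_name)
        if expected_type is None:
            problems.append(f"unknown field: {field_name}")
        elif not isinstance(value, expected_type):
            problems.append(f"field {field_name}: expected {expected_type.__name__}")
    return problems


picture_type = DocumentType(
    name="Picture",
    part_mime_types={"ImageData": "image/png"},
    field_types={"PictureContent": str},
)

good = Document(picture_type, "Harbor at dawn",
                parts={"ImageData": Part("image/png", b"\x89PNG")},
                fields={"PictureContent": "boat"})
bad = Document(picture_type, "Mislabeled",
               fields={"PictureContent": 42, "Rating": 5})
```

Calling validate(good) yields an empty list, while validate(bad) reports both the wrongly typed field and the field the type never declared. The point is only that the type, not a folder hierarchy, is what imposes order on each document.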
Second, a repository includes one or more “navigation documents,” each an XML-based specification that defines how users navigate
through the repository. Defining more than one navigation document effectively gives you
multiple views of the same repository. Behind the scenes, navigation documents work their magic by performing a query on the repository.
So, for example, one navigation document might arrange the contents by modification date; another, by title.
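The idea of several views over one flat pool of documents can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a rendering of Daisy's navigation format: the repository list, its sample documents and dates, and the build_view helper are all hypothetical.

```python
from datetime import date

# Hypothetical in-memory stand-in for a flat repository: no folders, just documents.
repository = [
    {"id": 1, "name": "Welcome", "modified": date(2005, 3, 1)},
    {"id": 2, "name": "Admin guide", "modified": date(2005, 6, 15)},
    {"id": 3, "name": "Release notes", "modified": date(2005, 4, 20)},
]


def build_view(documents, order_by, descending=False):
    """Return document names ordered by one attribute: one 'view' of the repository."""
    ordered = sorted(documents, key=lambda doc: doc[order_by], reverse=descending)
    return [doc["name"] for doc in ordered]


# Two views of the same documents, analogous to two navigation documents.
by_title = build_view(repository, "name")
by_date = build_view(repository, "modified", descending=True)
```

Neither view moves or copies a document; each simply asks the same collection a different question, which is why adding another view costs so little.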
The Daisy API is a combination of HTTP and XML. In other words, you send commands to the Daisy repository via
HTTP, and those commands are in the form of XML embedded in the HTTP request. Hence, you can control Daisy through just about
any scripting language that can “talk” HTTP; you can even handcraft commands by typing in the proper URL. If, however, you’d
rather work with the repository through a more robust API, Daisy provides a Java wrapper around the HTTP/XML interface.
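As a rough illustration of what “talking” HTTP and XML looks like from a script, the Python sketch below URL-encodes a query into a request URL and parses an XML reply. The endpoint address, the "q" parameter name, and the shape of the sample response are all assumptions made for this example; consult the Daisy documentation for the real URLs and schemas. No request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
import xml.etree.ElementTree as ET

# Placeholder endpoint and parameter name: assumptions for illustration only.
BASE_URL = "http://localhost:8080/repository/query"


def build_query_url(base_url, query):
    """URL-encode a query string so it can travel in an HTTP GET request."""
    return base_url + "?" + urlencode({"q": query})


query = "select id, name where #PictureContent = 'boat'"
url = build_query_url(BASE_URL, query)

# Decoding the URL again recovers the original query text unchanged.
decoded = parse_qs(urlsplit(url).query)["q"][0]

# A made-up XML reply; the real response schema is not described in this review.
sample_response = """
<results>
  <row id="12" name="Harbor at dawn"/>
  <row id="31" name="Fishing fleet"/>
</results>
"""


def parse_rows(xml_text):
    """Extract (id, name) pairs from the hypothetical response format above."""
    root = ET.fromstring(xml_text.strip())
    return [(int(row.get("id")), row.get("name")) for row in root.findall("row")]


rows = parse_rows(sample_response)
```

The encoding step matters more than it looks: characters such as "#" and the space have special meaning in a URL, so a handcrafted query typed into a browser needs the same escaping that urlencode applies here.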
The DQL (Daisy Query Language) is obviously derived from SQL. A query is a “select” clause, adorned with modifiers for filtering
and ordering the results. Whereas in SQL those filters amount to comparisons on column values, in DQL the comparisons are
performed on document fields. So, for example, to search for documents in the repository with a PictureContent field equal
to “boat,” you would enter the following Daisy query: “select id, name where #PictureContent = ‘boat’”. This returns the ID
number and name of each matching document.
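For readers who think more readily in code than in query syntax, the following Python sketch mimics what that query asks for, using a plain list in place of the repository. The sample documents and the select_id_name helper are hypothetical; only the PictureContent field name and the value “boat” come from the query above.

```python
# Hypothetical documents; only the PictureContent field name comes from the query above.
documents = [
    {"id": 1, "name": "Harbor at dawn", "fields": {"PictureContent": "boat"}},
    {"id": 2, "name": "Mountain pass", "fields": {"PictureContent": "landscape"}},
    {"id": 3, "name": "Fishing fleet", "fields": {"PictureContent": "boat"}},
    {"id": 4, "name": "Meeting notes", "fields": {}},
]


def select_id_name(docs, field_name, value):
    """Mimic: select id, name where #<field_name> = '<value>'."""
    return [(d["id"], d["name"]) for d in docs if d["fields"].get(field_name) == value]


matches = select_id_name(documents, "PictureContent", "boat")
```

Documents that lack the field altogether, such as the meeting notes here, simply fail the comparison and drop out of the result, much as a row with a non-matching column value would in SQL.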