InfoWorld
Managing your content with XML

Daisy and TeXtML CMSes take differing, yet successful, tacks

By Rick Grehan
July 11, 2005
 

Content is the lifeblood of any organization that relies on information. If documents are lost in file cabinets or hidden away on hard drives, the knowledge they carry is buried. But when content is organized and searchable, that information lives on. It does useful work over and over again as it is referenced, consulted, and combined with other information.

The two CMSes (content management systems) in this review create organized and searchable repositories of digital documents. At first glance, the two products appear similar, and, fundamentally, they are. Both, for example, make extensive use of XML. Closer inspection, however, reveals that each is designed for somewhat different uses of content.

Daisy is an open source CMS whose strength is its flexible organization and navigation capabilities. Ixiasoft’s TeXtML is a commercial CMS that takes a more straightforward approach to content organization, but excels at text search.

Daisy 1.3
Daisy is an exceptionally modular system; its designers purposely decoupled its internal organs for greater flexibility. Among those pieces is the back-end database, MySQL. There’s the repository server, which manages the storage and retrieval of documents. The OpenJMS Java messaging service informs applications of updates to the repository. Finally, the Daisy wiki front end provides dynamic, Web-based repository view and access.

The database back end and the repository server are Daisy’s core components. The OpenJMS service is more ancillary: It passes status events to apps that request notification of changes in the repository’s content or structure.

Strictly speaking, Daisy’s wiki component is merely an example of a front end for the repository. Daisy’s creators describe Daisy as a “content management framework,” precisely because it could be used to support other front ends.

Mind you, Daisywiki is not simply a sample application; it’s a fully functioning wiki, complete with a built-in editor, versioning, search pages, PDF publishing, and more.

I installed Daisy on my test system and, with the exception of a problem with Internet Explorer 5.0, I had it running within a half-hour. The installation constructs a small wiki-based Web site populated with an initial Welcome page. The installation includes all the tools for adding new documents, editing existing ones, adding and managing users, and so on. Because all the site’s pages are built from documents in a Daisy repository, Daisywiki is an excellent mechanism for exploring how Daisy works.

Inside Daisy
The internal structure of Daisy’s repository is unusual in that there is none. There are no folders or sub-folders, no collections — just a container in which documents float about like the meat and potatoes of a digital stew. All is not anarchy, though.

First, documents themselves are structured, being composed of parts and fields. A part carries binary data of a specific MIME type (RTF information or image data, for example), and a field carries simple data (such as a numeric value, a date, or a string). The structure and allowed content of a document’s parts and fields are defined by the document’s type (which is specified in yet another document). So, all documents within a repository must adhere to one of the defined document types. You can define as many document types as your imagination permits.
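To make the parts-and-fields model concrete, here is a minimal sketch in Python. It is not Daisy's actual schema or API: the class names, the `validate` method, and the sample "Picture" type are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Union

# Hypothetical model of the ideas described above; not Daisy's real classes.
FieldValue = Union[str, int, float, date]


@dataclass(frozen=True)
class Part:
    """Binary payload tagged with a MIME type."""
    mime_type: str
    data: bytes


@dataclass(frozen=True)
class DocumentType:
    """Declares which parts and fields a document may carry."""
    name: str
    allowed_parts: Dict[str, str]     # part name -> required MIME type
    allowed_fields: Dict[str, type]   # field name -> required Python type


@dataclass
class Document:
    doc_id: int
    name: str
    doc_type: DocumentType
    parts: Dict[str, Part] = field(default_factory=dict)
    fields: Dict[str, FieldValue] = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError if the document strays from its type."""
        for part_name, part in self.parts.items():
            expected_mime = self.doc_type.allowed_parts.get(part_name)
            if expected_mime is None:
                raise ValueError(f"part {part_name!r} not allowed")
            if part.mime_type != expected_mime:
                raise ValueError(f"part {part_name!r} must be {expected_mime}")
        for field_name, value in self.fields.items():
            expected_type = self.doc_type.allowed_fields.get(field_name)
            if expected_type is None:
                raise ValueError(f"field {field_name!r} not allowed")
            if not isinstance(value, expected_type):
                raise ValueError(f"field {field_name!r} has the wrong type")


# Illustrative document type and document (names are invented).
picture_type = DocumentType(
    name="Picture",
    allowed_parts={"ImageData": "image/jpeg"},
    allowed_fields={"PictureContent": str},
)
boat = Document(
    doc_id=1,
    name="Harbor at dawn",
    doc_type=picture_type,
    parts={"ImageData": Part("image/jpeg", b"\xff\xd8")},
    fields={"PictureContent": "boat"},
)
boat.validate()  # passes silently because the document matches its type
```

The point of the sketch is only that the type, not the document, decides what is legal; a document carrying an undeclared field would be rejected.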

Second, a repository includes one or more “navigation documents,” each an XML-based specification that defines how users navigate through the repository. Having more than one navigation document effectively lets you define multiple repository views. Behind the scenes, navigation documents work their magic by performing a query on the repository. So, for example, one navigation document might arrange the contents by modification date; another, by title.
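The idea that a view is just a query over one flat pool of documents can be sketched in a few lines of Python. This is an illustration only: real navigation documents are XML, and the `navigation_view` function, the dictionary keys, and the sample documents are assumptions invented for this example.

```python
from datetime import date
from operator import itemgetter

# A flat "repository": no folders, just documents with metadata.
# All sample data is invented for illustration.
documents = [
    {"id": 1, "name": "Welcome", "modified": date(2005, 6, 1)},
    {"id": 2, "name": "Admin guide", "modified": date(2005, 7, 4)},
    {"id": 3, "name": "Release notes", "modified": date(2005, 5, 20)},
]


def navigation_view(docs, order_by, descending=False):
    """Return document names ordered by one metadata key (hypothetical helper)."""
    ordered = sorted(docs, key=itemgetter(order_by), reverse=descending)
    return [doc["name"] for doc in ordered]


# Two views over the same documents, one per "navigation document".
by_title = navigation_view(documents, "name")
by_date = navigation_view(documents, "modified", descending=True)
```

Neither view moves or copies anything; each is recomputed from the same documents, which is why adding a third view costs nothing but another query.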

The Daisy API is a combination of HTTP and XML. In other words, you send commands to the Daisy repository via HTTP, and those commands take the form of XML embedded in the HTTP request. Hence, you can control Daisy through just about any scripting language that can “talk” HTTP; you can even handcraft commands by typing in the proper URL. If, however, you’d rather work with the repository through a more robust API, Daisy provides a Java wrapper around the HTTP/XML interface.
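As a rough sketch of what "XML embedded in an HTTP request" looks like from a scripting language, the Python below builds (but does not send) such a request using only the standard library. Everything Daisy-specific here is an assumption: the host name, the `/hypothetical/document` path, the `document` element, and its `name` attribute are placeholders, not Daisy's documented endpoints or XML vocabulary.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_request(base_url, path, root_tag, attrs):
    """Wrap a small XML payload in an HTTP POST request object.

    Hypothetical helper: the URL layout and XML element names are
    placeholders, not Daisy's real interface.
    """
    root = ET.Element(root_tag, attrs)
    body = ET.tostring(root, encoding="utf-8")  # bytes suitable for a request body
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Build the request only; nothing is sent over the network here.
req = build_request(
    "http://repository.example",   # placeholder host
    "/hypothetical/document",      # placeholder path
    "document",                    # placeholder XML element
    {"name": "Welcome"},
)
```

Passing `req` to `urllib.request.urlopen` would send it; any language with an HTTP client and an XML library could do the equivalent, which is the portability the review describes.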

The DQL (Daisy Query Language) is obviously derived from SQL. A query is a “select” clause, adorned with modifiers for filtering and ordering the results. Whereas in SQL those filters amount to comparisons on column values, in DQL the comparisons are performed on document fields. So, for example, to search for documents in the repository with a PictureContent field equal to “boat,” you would enter the following Daisy query: “select id, name where #PictureContent = ‘boat’”. This returns the ID number and name of each matching document.
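To show what that query means, the Python sketch below evaluates the same select-and-filter over a handful of in-memory documents. It is not Daisy's query engine: the `run_query` function and its regular expression are assumptions that understand only this one query shape (a column list plus a single `#Field = 'value'` test), and the sample documents are invented.

```python
import re

# Matches only the one query shape used in the example above.
QUERY_RE = re.compile(
    r"^select\s+(?P<cols>[\w\s,]+?)\s+where\s+#(?P<field>\w+)\s*=\s*'(?P<value>[^']*)'$"
)


def run_query(query, docs):
    """Return the selected columns of every document whose field matches."""
    match = QUERY_RE.match(query.strip())
    if match is None:
        raise ValueError("unsupported query shape")
    columns = [col.strip() for col in match.group("cols").split(",")]
    wanted_field = match.group("field")
    wanted_value = match.group("value")
    return [
        {col: doc[col] for col in columns}
        for doc in docs
        if doc["fields"].get(wanted_field) == wanted_value
    ]


# Invented sample documents; "fields" plays the role of Daisy document fields.
docs = [
    {"id": 1, "name": "Harbor at dawn", "fields": {"PictureContent": "boat"}},
    {"id": 2, "name": "Mountain pass", "fields": {"PictureContent": "landscape"}},
    {"id": 3, "name": "Regatta", "fields": {"PictureContent": "boat"}},
]

results = run_query("select id, name where #PictureContent = 'boat'", docs)
```

Here `results` holds the ID and name of documents 1 and 3. The comparison runs against a document field rather than a table column, which is the difference from SQL that the review points out.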





Ixiasoft TeXtML Server

Ixiasoft, ixiasoft.com

Very Good  8.1
Criteria      Score  Weight
Ease-of-use   8      20%
Flexibility   8      20%
Integration   8      20%
Management    8      20%
Scalability   9      10%
Value         8      10%

Cost:
Starts at around $10,000

Platforms:
Requires Windows 2000/2003 or Windows XP Professional

Bottom Line:
TeXtML provides a wide array of APIs. Setup is easy, and it supports fault tolerance with multiserver fail-over repositories (an optional component). TeXtML strikes the right balance between turning everything into XML and using XML to enable powerful queries. The CMS excels at text searching. Although the price is steep, TeXtML may well be worth considering for companies that want quick search access to documents in a secure repository.




Daisy 1.3

Outerthought, org/daisy/index.html

Very Good  8.3
Criteria      Score  Weight
Ease-of-use   8      20%
Flexibility   9      20%
Integration   8      20%
Management    8      20%
Scalability   8      10%
Value         9      10%

Cost:
Free

Platforms:
Requires only a JVM 1.4.2 or higher, and MySQL version 4.0.20 or version 4.1.7 (or higher).

Bottom Line:
Daisy's novel approach of stuffing all documents into one bag and leaving it to metadata and navigation documents to sort them out may sound like anarchy, but this scheme provides more flexibility than the collections approach. Daisy allows multiuser editing of repository content, as demonstrated by its wiki front end. The installation takes a bit of work, and the documentation is still in progress. Daisy is proving itself in live on-the-Web use, so the extra effort is worth it.
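The overall scores in the two scorecards are consistent with a simple weighted average of the criteria scores. The Python sketch below checks that arithmetic; the assumption that the total is a weighted sum rounded to one decimal place is inferred from the published numbers, not stated by the review, and the variable names are invented.

```python
# Weights in whole percent, taken from the scorecards.
WEIGHTS_PCT = {
    "Ease-of-use": 20,
    "Flexibility": 20,
    "Integration": 20,
    "Management": 20,
    "Scalability": 10,
    "Value": 10,
}

# Criteria scores as published in the two scorecards.
TEXTML_SCORES = {
    "Ease-of-use": 8, "Flexibility": 8, "Integration": 8,
    "Management": 8, "Scalability": 9, "Value": 8,
}
DAISY_SCORES = {
    "Ease-of-use": 8, "Flexibility": 9, "Integration": 8,
    "Management": 8, "Scalability": 8, "Value": 9,
}


def weighted_score(scores):
    """Weighted average of criteria scores, rounded to one decimal place."""
    total = sum(scores[name] * pct for name, pct in WEIGHTS_PCT.items())
    return round(total / 100, 1)


textml_total = weighted_score(TEXTML_SCORES)  # matches the published 8.1
daisy_total = weighted_score(DAISY_SCORES)    # matches the published 8.3
```

Daisy's edge comes entirely from the Flexibility and Value rows, which outweigh TeXtML's one-point lead on Scalability.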




 


 
Rick Grehan is a contributing editor at InfoWorld. Contact him at rick_grehan@infoworld.com.
 
