Product review: Denodo brings old-school polish to new-fangled mashups
Denodo Platform marries sophisticated tools for working with relational databases and smart tools for importing data from Web, e-mail, and other unstructured sources
At the same time, the product does feel like it's grown a bit shaggy with all of these clever additions, leaving us with a nomenclature that seems a bit complex. Most data from traditional sources comes in through the VDP, but many Web-based sources are scraped by ITPilot, a collection of tools for reaching out to Web sites and pulling their data into the service. The data ends up in the Aracne indexing and search engine. All of the names for these tendrils get a bit confusing, and it might make sense to produce a unified naming convention if it could be done in a way that wouldn't annoy existing users looking to upgrade.
The ITPilot layer is elaborate and powerful, offering a pool of browsers that will suck in information on a schedule. The data comes in as HTML and leaves as entries in a local database. Much of the work is specified using a visual programming language filled with icons for tasks like looping through a set of <table> tags or extracting data and filling up a record. Even though this is a visual programming language, filling out the tasks for some of the icons is complicated enough that it requires a wizard.
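For readers who prefer text to icons, the basic shape of that job (HTML in, database rows out) can be sketched in a few lines of plain Python. This is not ITPilot's language or API, just a minimal illustration built on the standard library; the sample page, the `products` table, and the function names are all made up for the example.

```python
import sqlite3
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, one tuple per <tr>
        self._row = None    # cells of the row being read, if any
        self._cell = None   # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made only of <th> cells are skipped
                self.rows.append(tuple(self._row))
            self._row = None


# Hypothetical page content standing in for a scraped Web site.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


def scrape_to_db(html, conn):
    """Loop through the table rows in `html` and store them as records."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [(name, float(price)) for name, price in parser.rows],
    )
    conn.commit()
    return len(parser.rows)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(scrape_to_db(SAMPLE_HTML, db), "rows stored")
```

A real scraper also has to fetch the pages, run on a schedule, and cope with markup that changes without warning, which is where the pool of browsers and the wizards earn their keep.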
Forms and function
I found the visual programming experience to be only a small step beyond using an old-fashioned text language. In fact, the visual programming language for scraping Web sites is just the first layer. Each icon is configured with a wizard.
This may be a personal feeling, but I find text easier to understand than the sea of icons with lines going between them. The wizards for configuring the icons are a big help, but they can only organize the fundamental complexity of the problem, not make it disappear. We still need to be able to tweak features such as the maximum number of times a browser will retry a connection, so we either fill out a form or just type in text.
These forms offer more labels, which are usually helpful, but there's an incredible amount of clicking and paging that enters the mix. Steve Jobs is proud of the little remote control that comes with his machines because it only has six buttons, but the price of the simplicity is layers and layers of menus. Denodo's layers and layers of icons and menus make the same trade-off, with the same mix of success and failure. After a few hours, I wished there were some way to keep the handholding of the wizards while moving to a simple, text-based language that would produce more concentrated descriptions.
The best feature of the system is that many of the forms include "test" buttons. There are hundreds of them, and they let you test your entries in a form immediately without rebuilding the project and getting it running. This can be a big timesaver because many parts of the system don't need to be up and running before one part can be tested. I like being able to test my database URLs without waiting.
Simple and simpler
Denodo also includes a much wider collection of data filters than some of the other systems. You can work with semi-structured data from Web sites and other places, or even unstructured data such as pure text. The system includes the Lucene search engine, which offers a wide range of the standard text searching operations.
Your opinion of all of these layers and windows will probably depend upon how many mashups you've done. In theory, the idea of mixing together databases sounds simple. Instead of JOINing two tables in the same database, you just need to JOIN two other tables. It is rarely that simple, though, because the data often doesn't line up correctly. First names may be replaced with an initial. Dates may be in a different format.
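To make the mismatch concrete, here is a small sketch in plain Python of the cleanup that has to happen before two such sources will join. It is not Denodo code; the field names, the two date formats, and the decision to match on first initial plus last name are all assumptions chosen for the illustration.

```python
from datetime import datetime

# Date layouts this sketch knows about (an assumption; real feeds vary more).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(text):
    """Return the date as an ISO 8601 string, whichever known format it used."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def join_key(row):
    """Build a key that survives 'Peter' vs. 'P.' and mixed date formats."""
    initial = row["first"].strip()[:1].upper()
    last = row["last"].strip().upper()
    return (initial, last, normalize_date(row["date"]))


def join_records(left, right):
    """Pair up rows from two sources whose normalized keys agree."""
    index = {}
    for row in right:
        index.setdefault(join_key(row), []).append(row)
    return [(row, other) for row in left for other in index.get(join_key(row), [])]


if __name__ == "__main__":
    # Hypothetical rows from two systems describing the same order.
    crm = [{"first": "Peter", "last": "Smith", "date": "2009-03-07"}]
    billing = [
        {"first": "P.", "last": "smith", "date": "03/07/2009"},
        {"first": "Q.", "last": "Smith", "date": "03/07/2009"},
    ]
    for a, b in join_records(crm, billing):
        print(a["first"], a["last"], "matches", b["first"], b["last"])
```

Matching on an initial is lossy: a Paul Smith with the same date would pair up just as happily with "P. smith". That kind of judgment call is exactly what no wizard can make for you.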
I ended up thinking of Denodo as more of a collection of useful tools for a mashup engineer than a magic set of tools. Once you learn your way around the acronyms, you can link up the data sets fairly quickly, as the labels and wizards offer some shortcuts. But they're just a relatively thin veneer that makes it a bit easier to work with the complexity. The rows and columns filled with data are still out there, and they still need to be aligned correctly. These wizards and test buttons simplify the process, but they can't make it as simple as anyone would like.