Test Center review: Open source data aces
Jitterbit 2.0 impresses with easy GUI for mastering migration projects, while Talend Open Studio 3.0 scales gracefully to meet enterprise integration demands
On the plus side, Jitterbit projects can be encapsulated and exported into Jitterpaks to simplify migration among dev and production systems. Jitterbit even operates a trading post for its community to buy and sell prepackaged Jitterpak solutions.
Jitterbit can't be considered a full-blown integration platform -- yet. However, despite its shortcomings, I found Jitterbit to be very good at what it does best -- namely, application data migration. Its transformation tools, though basic, are good, and its repository encourages best practices and reuse. If you’re looking to push batch data around, you should consider Jitterbit to alleviate the headaches that frequently complicate -- and delay -- even seemingly simple migration projects.
Talend Open Studio 3.0.1: The real deal
Talend has developed a holistic integration platform from the ground up in a very short time. If the company continues on its current trajectory, it could do for data integration what open source has already accomplished for servers and databases.
New features in Version 3 go a long way toward bolstering enterprise viability. In addition to a native SAP connector (extract and sync), developers will appreciate component search, an ecosystem overview of projects, change impact analysis, and drag-and-drop metadata.
Perhaps most important, Talend has added change data capture (specifically, via slowly changing dimensions). Change data capture enables real-time updates that significantly reduce the size of data transfers -- an increasingly important efficiency measure for data sets that have grown so large that batch runs can no longer finish in the overnight hours.
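To make the efficiency argument concrete, here is a minimal sketch of the idea behind change data capture: ship only the rows that differ rather than the whole table. It is not Talend's implementation. It assumes a simple snapshot comparison keyed on a primary key, whereas production tools often read database logs or triggers instead. The function name, the change labels, and the sample rows are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
Change = Tuple[str, int, Row]


def capture_changes(previous: Dict[int, Row], current: Dict[int, Row]) -> List[Change]:
    """Compare two snapshots keyed by primary key and return only the differences.

    Illustrative snapshot-differencing sketch; the "insert"/"update"/"delete"
    labels are assumptions, not any product's change format.
    """
    changes: List[Change] = []
    # Rows that are new or whose contents differ from the previous snapshot.
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    # Rows that existed before but are gone now.
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


# Hypothetical sample data: yesterday's and today's customer tables.
previous = {
    1: {"name": "Acme", "city": "Boston"},
    2: {"name": "Globex", "city": "Austin"},
    3: {"name": "Initech", "city": "Dallas"},
}
current = {
    1: {"name": "Acme", "city": "Boston"},
    2: {"name": "Globex", "city": "Denver"},
    4: {"name": "Umbrella", "city": "Seattle"},
}

changes = capture_changes(previous, current)
for operation, key, row in changes:
    print(operation, key, row)
```

In this toy case the unchanged row never leaves the source system; only the three changed rows would be transferred. On a table with millions of mostly static rows, that difference is what lets a job fit inside a shrinking batch window.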
What I really like about Talend is its code-generating approach -- a practice that fell by the wayside in favor of higher-level, user-friendly tools built around a centralized, proprietary engine. Although the proprietary "black boxes" often help streamline development, they can also lead to processing bottlenecks and scalability issues.
By contrast, Talend jobs can be packaged up and deployed anywhere a Java Virtual Machine or Perl interpreter can reside. Jobs can also be embedded directly into your Java apps or even encapsulated as REST/SOAP Web services via easy export.
Not that Talend is suitable for every enterprise project. It’s light on the connectors to mainframes and minis that you'll find in commercial products such as ETI Solution V6, a comparable code-generating solution that can output native code in Java as well as Cobol, C/C++, and SAP.
Open source competitor Pentaho Data Integration (Kettle), despite taking a black-box approach, does offer good control over distributed processing, as well as integration into a more elaborate set of tools for BI and EAI. Nevertheless, I prefer Talend; it’s better developed and more extensible than Kettle, and it offers superb data governance.
Deploying the pieces of Talend Open Studio -- namely Job Designer, Business Modeler, and the repository manager -- is straightforward. I installed it on a Windows Server 2003 platform with Sun JVM and ActiveState Perl, and was quickly off and running. (ActiveState, incidentally, has a great new rev of Komodo IDE and Perl dev tools that are worth a look.)