Test Center review: Open source data aces
Jitterbit 2.0 impresses with easy GUI for mastering migration projects, while Talend Open Studio 3.0 scales gracefully to meet enterprise integration demands
The Business Modeler component -- a nice touch for Talend -- is a piece of the puzzle often omitted even at the commercial level. The Business Modeler provides a palette of components that allow nontechnical analysts to build a view of the system and its workflows, without ever touching a drop of Java. The result gets turned over to developers, who flesh out the details using the Job Modeler, an Eclipse-based IDE and debugger.
The Job Modeler will put any Eclipse-seasoned developer at ease with its own palette of drag-and-drop components. It also provides access to the central repository, which holds all of your organization's business models, job designs, metadata, documentation, and connection-specific information.
The latest version of Job Modeler adds collapsible subroutines for easier navigation. Other niceties include quick tabbing between graphical layout and code, a job scheduling interface (that puts a GUI on the Unix crontab command), and a thumbnail overview for easy navigation of large document layouts.
I liked the tMap component for defining my transforms and data routings. Although it was reminiscent of an old switchboard with wires strewn about, it was ultimately fast and effective. An Automap option saves time setting up initial connections.
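For readers who have not used a mapping component, the idea behind defining transforms and data routings can be sketched in a few lines of Python. This is only an illustration of the general technique, not Talend's API or the code it generates; the function `route_rows`, the route names, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
# (output name, filter, transform): hypothetical names for illustration only
Route = Tuple[str, Callable[[Row], bool], Callable[[Row], Row]]


def route_rows(rows: List[Row], routes: List[Route],
               reject_name: str = "rejects") -> Dict[str, List[Row]]:
    """Send each row to the first output whose filter accepts it,
    applying that output's transform; unmatched rows go to the rejects."""
    outputs: Dict[str, List[Row]] = {name: [] for name, _, _ in routes}
    outputs[reject_name] = []
    for row in rows:
        for name, accepts, transform in routes:
            if accepts(row):
                outputs[name].append(transform(row))
                break
        else:
            # No filter matched, so keep the untouched row for inspection.
            outputs[reject_name].append(row)
    return outputs


def tidy(row: Row) -> Row:
    """Transform: trim and upper-case the name, drop the routing column."""
    return {"name": str(row["name"]).strip().upper(), "spend": row["spend"]}


customers = [
    {"name": " Ada ", "country": "US", "spend": 1200},
    {"name": "Bo", "country": "FR", "spend": 80},
    {"name": "Cy", "country": None, "spend": 5},
]

routes = [
    ("domestic", lambda r: r["country"] == "US", tidy),
    ("international", lambda r: r["country"] is not None, tidy),
]

outputs = route_rows(customers, routes)
```

Each "wire" on the switchboard corresponds to one route here: a filter that decides where a row goes and a transform that shapes what arrives. Rows that match no filter land in a reject output rather than disappearing silently.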
The Job Modeler IDE’s graphical SQL editor and test facility, called SQLBuilder, helps with SQL chores. Talend generates native SQL code for every supported database, no additional effort required. XSLT and XPath are in tow for XML processing. And a good set of orchestration components makes long-running and staged processing a possibility.
Onboard debugging offers step-by-step trace and variable inspection, with real-time stats and trace data viewable directly from the layout. Other niceties, like auto-generation of HTML documentation, sweeten the offering.
You need to be able to trust the accuracy of your data, not just push it around. Talend has data governance covered with good provisions for data quality and profiling. Data conformity and consistency, beyond de-duplication, are achieved using filters such as search-and-replace, interval and fuzzy matching, and schema-based transformation. The profiler adds metrics on data quality -- tracked and assessed over time -- and graphically depicts stats and performance summaries for quick isolation of data in need of scrubbing.
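The three filter types named above are easy to picture with a small Python sketch. It uses only the standard library and is not how Talend implements its data quality components; the helper names (`normalize_phone`, `in_interval`, `is_fuzzy_match`) and the 0.85 threshold are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize_phone(raw: str) -> str:
    """Search-and-replace filter: strip every non-digit character."""
    return re.sub(r"\D", "", raw)


def in_interval(value: float, low: float, high: float) -> bool:
    """Interval matching: accept only values inside an inclusive range."""
    return low <= value <= high


def similarity(a: str, b: str) -> float:
    """Similarity score in [0, 1] after trimming and lower-casing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy matching: treat near-identical strings as the same value.
    The threshold is an arbitrary illustrative choice."""
    return similarity(a, b) >= threshold


phone = normalize_phone("(555) 010-2000")
age_ok = in_interval(130, 0, 120)
same_person = is_fuzzy_match("Jon Smith", "John Smith")
different_person = is_fuzzy_match("Jon Smith", "Mary Jones")
```

In practice the threshold is the interesting knob: set it too low and distinct customers merge, set it too high and obvious typos survive as duplicates.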
I was impressed by Talend's rich set of components for third-party products, too. Support ranges from the higher end of OLAP cubes and Microsoft AX Server, down to QuickBooks and Google Apps. Even open BI solutions, including Jaspersoft and SpagoBI, as well as CRM apps, including Salesforce.com, Sugar, and Centric CRM, are supported.
Talend needs to work on automating management and partitioning of distributed jobs. I’d like to see Talend (and the Talend community) generate more industry-specific components -- say, to address HIPAA (Health Insurance Portability and Accountability Act) and SWIFT (Society for Worldwide Interbank Financial Telecommunication) directly. And although Talend offers ELT support in addition to ETL, currently ELT mode is limited to Oracle, MySQL, and Teradata databases.
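The ETL versus ELT distinction comes down to where the transform runs. The sketch below shows both orderings in Python, with an in-memory SQLite database standing in for the target; SQLite is used purely for illustration and is not one of the databases listed above. The table names and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical raw feed: untrimmed amounts and inconsistent country codes.
RAW_ORDERS = [("a-1", " 19.99 ", "us"), ("a-2", "5.00", "FR"), ("a-3", "12.50", "Us")]


def run_etl(conn: sqlite3.Connection) -> None:
    """ETL: transform rows in the integration engine (here, Python), then load."""
    conn.execute("CREATE TABLE orders_etl (id TEXT, amount REAL, country TEXT)")
    cleaned = [
        (order_id, float(amount.strip()), country.upper())
        for order_id, amount, country in RAW_ORDERS
    ]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn: sqlite3.Connection) -> None:
    """ELT: load the raw rows untouched, then let the database transform them in SQL."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, amount TEXT, country TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT id, CAST(TRIM(amount) AS REAL) AS amount, UPPER(country) AS country "
        "FROM orders_raw"
    )


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_rows = conn.execute("SELECT id, amount, country FROM orders_etl ORDER BY id").fetchall()
elt_rows = conn.execute("SELECT id, amount, country FROM orders_elt ORDER BY id").fetchall()
```

Both paths end with the same cleaned rows. The difference is that the ELT path pushes the work into the database engine, which is why support depends on the tool being able to generate the right SQL for each target.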
Support is always a key concern for open source. Although Talend is still a young company, its worldwide presence enables it to deliver service, support, and training 24/7. Support is included in its team-oriented Integration Suite, along with added provisions for distributed monitoring and load balancing down to the CPU core. Talend even offers a free SaaS edition, Talend on Demand, with subscription-based support.
Clearly Talend has much to offer. Before you break the bank for a six-figure proprietary alternative or ponder the ongoing maintenance nightmare of a hand-coded solution, you’d be foolish not to explore Talend for your next data integration project.