Data management vendor Syncsort will announce on Wednesday its entry into the Apache Hadoop community, with plans to enable high-performance data sorts to be used with the open source distributed computing platform.
Instead of using Hadoop's default sort, users could swap in another sorting system via an external sort plug-in capability Syncsort is contributing to the Hadoop open source community. Syncsort also will offer a Hadoop edition of its DMExpress data acceleration software, providing an alternative to the default Hadoop sort.
DMExpress Hadoop Edition features Hadoop Distributed File System connectivity. Users can create jobs via the DMExpress graphical user interface and run them in MapReduce, the Hadoop programming model and software framework for writing applications that process large amounts of data in parallel across clusters.
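To illustrate where a sort fits in the MapReduce model, here is a minimal single-process sketch in Python. It is not Hadoop's API and not Syncsort's plug-in interface; the names `run_job`, `map_words`, `sum_counts`, and `custom_sorter` are hypothetical and exist only for this example. The sketch assumes the sort sits between the map and reduce steps and shows how that one step could be swapped without touching the rest of the job.

```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def sum_counts(word, counts):
    """Reduce step: add up all counts collected for one word."""
    return word, sum(counts)


def run_job(lines, mapper, reducer, sorter=sorted):
    """Toy MapReduce run in one process; `sorter` is the swappable sort."""
    pairs = [pair for line in lines for pair in mapper(line)]
    ordered = sorter(pairs)  # sort mapped pairs so equal keys are adjacent
    return [
        reducer(key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def custom_sorter(pairs):
    """Hypothetical stand-in for an alternative sort implementation."""
    return sorted(pairs, key=itemgetter(0))


lines = ["big data big clusters", "data sort"]
default_result = run_job(lines, map_words, sum_counts)
custom_result = run_job(lines, map_words, sum_counts, sorter=custom_sorter)
print(default_result)
```

Both runs produce the same word counts; only the sort implementation differs. That is the general idea behind replacing a default sort with an alternative one, although a real Hadoop job distributes these steps across many machines.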
Hadoop is generally associated with "big data," in which users need to analyze terabytes of data. "The interest in Hadoop is growing dramatically and not just in Web-based companies," said Keith Kohl, Syncsort director of product management for data integration. Companies in the financial services and telecommunications sectors also are using it, he said.
An early user of DMExpress Hadoop Edition said it offered a performance boost. "[Syncsort is] very adept at providing highly efficient and scalable sort," said Mike Brown, CTO at comScore, an Internet ratings service. "It is much faster than what you would get out of the box with Hadoop."
The plug-in has been tested with the Cloudera Hadoop distribution. DMExpress Hadoop Edition will be available in a beta release this June, with general availability planned for later this year.
This article, "Syncsort offers Hadoop data sort alternative," was originally published at InfoWorld.com.