LexisNexis has worked for more than a decade to develop a large-scale system for big data manipulation, and it believes that it has produced something that's better and more mature than the better-known Hadoop technology.
The company just needs developers to agree.
LexisNexis developed the parallel processing data platform to handle the demands of its own data-intensive research business. It wants to extend use of the technology, dubbed HPCC Systems, to broader markets, but is clearly aware that open source Hadoop has already established itself as a strong presence.
The company says there are now about 1,000 HPCC Systems developers worldwide, most of whom have been trained since the platform was open sourced in June.
By contrast, a Hadoop developer conference last summer drew a crowd of some 1,700.
To help demonstrate the platform's capabilities, LexisNexis ran a Terasort benchmark on HPCC and compared the result against a similar benchmark and workload run by SGI on a Hadoop cluster, announced in October.
LexisNexis says its benchmark was 25 percent faster, and ran on far less hardware: a 4-node cluster versus a 20-node cluster on the SGI system. The LexisNexis test was done on Dell PowerEdge two-socket servers with six-core Intel Xeon processors.
Flavio Villanustre, vice president of infrastructure and products at LexisNexis Risk Solutions, attributed the test results, in part, to the number of lines of code needed for the sorting compared with Hadoop.
It took three lines of ECL code to do the sorting, compared to 100-plus lines in Java, which is what is used in Hadoop, said Villanustre.
Asked to respond to the HPCC benchmark, Bill Mannel, vice president of product marketing at SGI, said in a statement that "there are many variations of distributed processing which can run Terasort. HPCC Systems is running Terasort on ECL code, which is different than SGI running on a MapReduce-based Hadoop. SGI remains committed to pushing the bar on performance and beating and improving our own record." MapReduce is a software framework for processing large data sets in parallel across a cluster of machines.
Villanustre believes HPCC could do well in the marketplace against Hadoop, but he doesn't take anything for granted. He said that he wants to avoid ending up like Betamax, which lost the video format wars to VHS, or IBM's OS/2 operating system, which was crushed by Microsoft Windows.
"We want to ensure adoption and that's why we are pushing so much," said Villanustre.
The company has also made its HPCC system available in the cloud via Amazon Web Services.
The platform is available through a dual licensing strategy that provides both a community edition and a commercial enterprise platform.
Matt Aslett, an analyst at The 451 Group, believes LexisNexis can be a lot more aggressive "given the large and growing ecosystem of developers and vendors that has formed around Apache Hadoop."