There are several nice touches in this installation script, by the way. As I plowed along looking for a good distribution, the software was careful to remember all of my inputs, so it wouldn't need to be reconfigured each time. This should be useful in a cloud where people may try to spin up a cluster, then tear it down. The software also includes a number of little features, like the ability to remember a different root password for each node; these can be quite helpful.
The center of the IBM tool is a console that helps you set up some jobs and kick them off. It's completely browser-based -- like the install script -- and you can simply upload your JAR files directly through the Web browser. You can even drill down into the HDFS file system layer and read the results without leaving the browser.
The Web GUI is a big advance over using the command line, but I easily found a number of ways that the console in the basic edition could be improved. As far as I can tell, there's no way to delete the old jobs. The information for each job includes basic details about the start and stop time, but almost everything else is just dumped as raw text. It wouldn't be too hard to parse some of this and do a nicer job displaying the log information.
The monitoring is also rudimentary. You can see that the nodes in your cluster are running and the components have started, but you don't get any cool dials or widgets that show the load or the progress. If you ask for the "details" about a component, you get a popup with some Log4J lines related to that component. A Java programmer won't bat an eye, but others might find it spare and uninviting.
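To give a sense of how little work a friendlier display would take, here is a minimal Python sketch that turns raw Log4J-style lines into structured records and tallies them by severity. It is not how the IBM console works. The line layout (timestamp, level, logger, message) is an assumption based on a common Log4J pattern, and the sample lines, logger name, and function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time>,<millis> <LEVEL> <logger>: <message>".
# Real output depends on the Log4J pattern configured for the cluster.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>[\w.$]+):\s*"
    r"(?P<message>.*)$"
)


def parse_log(text):
    """Split raw log text into a list of dicts, one per log entry."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            # Lines without a timestamp (stack traces, for example)
            # are treated as a continuation of the previous entry.
            records[-1]["message"] += "\n" + line
    return records


def count_levels(records):
    """Tally entries by severity, e.g. for a summary row in a UI."""
    return Counter(record["level"] for record in records)


# Made-up sample lines, not taken from any real cluster.
SAMPLE = (
    "2012-01-01 10:00:00,000 INFO example.JobRunner: job started\n"
    "2012-01-01 10:00:05,250 WARN example.JobRunner: slow task\n"
    "2012-01-01 10:00:09,500 ERROR example.JobRunner: task failed\n"
    "java.lang.RuntimeException: boom\n"
)

if __name__ == "__main__":
    records = parse_log(SAMPLE)
    for level, count in count_levels(records).items():
        print(level, count)
```

With records like these in hand, a console could sort, filter, or color entries by level instead of dumping everything as raw text.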
There are a number of better tools in the enterprise edition. The aforementioned BigSheets, a so-called spreadsheet running on Hadoop, will let you play around with the data in the Hadoop cluster just as you would experiment with the data in Excel. There are also a number of tools for connecting your cluster with other databases and data sources throughout the enterprise. The basic edition is good for trying out a pretty standard version of Hadoop, while the enterprise edition adds a slew of features that go far beyond the open source core.
MapR M3 and M5
Whereas Cloudera is run by folks who come from Hadoop strongholds such as Yahoo, MapR's corporate team is filled with people who hail from Google, EMC, Microsoft, and Cisco, companies with plenty of experience with big data sets, even if they're not steeped in Hadoop's traditional way of working with them.
The new talent is also bringing more sophistication to the stack. The MapR distribution of Hadoop includes a better version of the file system with snapshots, mirroring, and direct NFS access if you need it. MapR also offers a more resilient architecture that won't go down if the central controller locks up. MapR calls all of this "high availability" and charges for it.
MapR comes in two flavors: M3 and M5. Is there an M4? Apparently not, but that's marketing for you. The real distinction is between the free community edition (M3) and the proprietary version with all the extra, high-availability features (M5). While some of the other companies are effectively selling tools for monitoring and reporting, MapR is selling a more sophisticated layer under the hood. In other words, whereas the others are wrapping more features around the open source Hadoop, MapR is rebuilding it.