Innovation, such as that required to create and deploy BA (business analytics) solutions, is generally an easier process for smaller, focused development groups. So I’m seriously impressed by what SPSS has been able to accomplish in the BA tool area with the newest version of its data mining workbench, Clementine 8.1.
Given SPSS’ role in the market, I expected a more pro-forma approach — the two behemoths of statistical analysis, SPSS and the SAS Institute, dominate the user base for sophisticated BI and data mining applications. I was pleasantly surprised by the attention SPSS paid to both usability and breadth of features, aspects that big companies with large installed bases tend to cut corners on.
Clementine 8.1 has a sensible design and eminently practical user interface. The BA features neither degrade what’s already there nor disappear into the massive capabilities that anchor the Clementine data mining product family.
The underlying workbench design uses a graphical representation of the analyst’s own process workflow. The data mining workflow requires formulating the right cluster of questions to ask, identifying a subset of data from the warehouse or mart that addresses the questions, cleaning and restructuring the data, loading it, running it iteratively until you have a predictive model, and then saving the work for reuse.
Clementine supports all of this work except the purely human-expertise task of creating the right set of questions. That makes the goal of the data mining client — attacking large stores of collected data and pulling out meaningful relationships that hint at or even sometimes scream out actions to take — easier to achieve. For shops already committed to SPSS infrastructure, choosing Clementine is a no-brainer; for those with mixed platforms, Clementine’s virtues make it a very strong choice.
Going graphical
Clementine’s tabbed tool palettes collect related steps of the workflow in sequence, grouping them into “nodes.” An analyst drags these nodes to the work window and connects them in a structured, graphical sequence to create workflows that SPSS calls “streams”; multiple, related streams form a project. Clementine maintains a logical structure to manage these work products, with tabbed areas that store and display them. Users may also draw from previously created work modules.
In its tersest expression, a stream need consist only of a data source node, a process node, and some deliverable, either a model or a graphical output. In reality, analysts will export the models and procedures to one of the many output formats Clementine supports, including SPSS, SAS, and SQL. And they’ll use the tools to put a significant slab of the data preparation back into the database so the work needn’t be re-executed in future data mining.
This workflow diagram model is eminently practical because it follows the standard professional analyst’s structure, and because the analysts trained for these positions tend to have mastered this form of structured thinking. This makes Clementine’s face to the user a gloriously productive one. The tabbed palettes of nodes are organized in a way that dedicated analytical pros will “get” instantly, and those who do a range of work, including analysis, will pick it up quickly.
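To make the stream idea concrete, here is a minimal Python sketch of the tersest stream described above: one data source node, one process node, and one deliverable. This is not Clementine's API or scripting language. The `Stream` class, the node functions, and the sample records are all hypothetical, invented only to illustrate how connected nodes pass data along.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

Record = dict  # one row of data; a hypothetical stand-in for a database record


@dataclass
class Stream:
    """A linear chain of nodes: one source, any number of process steps, one output."""
    source: Callable[[], Iterable[Record]]
    output: Callable[[List[Record]], object]
    steps: List[Callable[[Iterable[Record]], Iterable[Record]]] = field(default_factory=list)

    def run(self):
        # Pull records from the source node, pass them through each
        # process node in order, then hand them to the output node.
        data = self.source()
        for step in self.steps:
            data = step(data)
        return self.output(list(data))


def load_customers():
    # Source node: made-up sample data standing in for a warehouse extract.
    return [
        {"id": 1, "spend": 120.0},
        {"id": 2, "spend": None},
        {"id": 3, "spend": 80.0},
    ]


def drop_missing(records):
    # Process node: a simple cleaning step that discards incomplete records.
    return (r for r in records if r["spend"] is not None)


def mean_spend(records):
    # Output node: the deliverable, here a single summary figure.
    return sum(r["spend"] for r in records) / len(records)


stream = Stream(source=load_customers, steps=[drop_missing], output=mean_spend)
result = stream.run()
print(result)
```

Because each node is an ordinary function, a saved cleaning step such as `drop_missing` can be dropped into another stream unchanged, which loosely mirrors the reuse of work modules the review describes.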
| Test Center Scorecard | | | | | | | |
|---|---|---|---|---|---|---|---|
| Weight | 20% | 20% | 20% | 20% | 10% | 10% | Overall |
| Clementine 8.1 | 9 | 9 | 6 | 10 | 8 | 7 | 8.3 (Very Good) |
