Innovation, such as that required to create and deploy BA (business analytics) solutions, is generally an easier process for
smaller, focused development groups. So I’m seriously impressed by what SPSS has been able to accomplish in the BA tool area
with the newest version of its data mining workbench, Clementine 8.1.

Clementine 8.1
SPSS, spss.com
Very Good 8.3

| Criteria | Score | Weight |
|---|---|---|
| Ease-of-use | 9 | 20% |
| Interoperability | 9 | 20% |
| Reporting | 6 | 20% |
| Suitability | 10 | 20% |
| Scalability | 8 | 10% |
| Value | 7 | 10% |

Cost: Solutions start at $75,000
Platforms: Client: Windows (Me, XP Home, Professional, 2000, 2003, or NT 4.0 with Service Pack 6); Server: Windows 2000 (Professional, Advanced Server, NT 4.0 with Service Pack 6 or later), Sun Solaris, HP-UX 11i, IBM AIX 4.3.3 or 5.2, OS/400 (iSeries) V5R2 with OS/400
Bottom Line: Clementine's client, server, and add-on products offer a deep set of BI, BA, and classic data-mining capabilities with an elegant, productive interface for a user team with an analytical background. Excellent import and export power makes not only data but routines available to other applications.
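
The overall 8.3 rating is consistent with a weighted average of the six criteria in the scorecard. The short Python check below assumes the overall score is simply each criterion score multiplied by its weight and summed; the review does not spell out the formula, so treat this as an illustration rather than the publication's documented method. The names `scorecard` and `weighted_score` are made up for this sketch.

```python
# Scorecard values copied from the review: criterion -> (score, weight in percent).
scorecard = {
    "Ease-of-use": (9, 20),
    "Interoperability": (9, 20),
    "Reporting": (6, 20),
    "Suitability": (10, 20),
    "Scalability": (8, 10),
    "Value": (7, 10),
}


def weighted_score(card):
    """Return the weight-averaged score (assumed formula, see lead-in)."""
    total_weight = sum(weight for _, weight in card.values())
    weighted_sum = sum(score * weight for score, weight in card.values())
    return weighted_sum / total_weight


overall = weighted_score(scorecard)
print(f"Overall score: {overall:.1f}")
```

Under that assumption, the low Reporting score (6 at a 20% weight) is the main drag on the total, while the perfect Suitability score pulls it back up.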
Given SPSS’ role in the market, I expected a more pro-forma approach — the two behemoths of statistical analysis, SPSS and
the SAS Institute, dominate the user base for sophisticated BI and data mining applications. I was pleasantly surprised by
the attention SPSS paid to both usability and breadth of features, aspects that big companies with large installed bases tend
to cut corners on.
Clementine 8.1 has a sensible design and eminently practical user interface. The BA features neither degrade what’s already
there nor disappear into the massive capabilities that anchor the Clementine data mining product family.
The underlying workbench design uses a graphical representation of the analyst’s own process workflow. The data mining workflow requires formulating the right cluster of questions to ask, identifying a subset of data from the warehouse or mart that addresses
the questions, cleaning and restructuring the data, loading it, running it iteratively until you have a predictive model,
and then saving the work for reuse.
Clementine supports all of this work except the purely human-expertise task of creating the right set of questions. That makes
the goal of the data mining client — attacking large stores of collected data and pulling out meaningful relationships that
hint at or even sometimes scream out actions to take — easier to achieve. For shops already committed to SPSS infrastructure,
choosing Clementine is a no-brainer; for those with mixed platforms, Clementine’s virtues make it a very strong choice.
Going graphical
Clementine’s tabbed tools palettes sequentially collect related steps in the workflow process, grouping them into “nodes.”
An analyst-user drags these nodes to the work window, connecting them in a structured, graphical sequence to create workflows
that SPSS calls “streams”; multiple, related streams form a project. Clementine maintains a logical structure to manage these
work products, with tabbed storage areas to store and display them. Users may also draw from previously created work modules.
In its tersest expression, a stream need consist only of a data source node, a process node, and some deliverable, either
a model or a graphical output. In reality, analysts will export the models and procedures to one of the many output formats
Clementine supports, including SPSS, SAS, and SQL. And they’ll use the tools to put a significant slab of the data preparation
back into the database so the work needn’t be re-executed in future data mining.
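
To make the stream idea concrete, here is a minimal Python sketch of the tersest case described above: a data source node, a process node, and an output node run in sequence. It illustrates the concept only. The function names (`source_node`, `process_node`, `output_node`, `run_stream`) and the sample records are hypothetical and do not reflect Clementine’s actual interfaces or file formats.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


def source_node() -> List[Record]:
    # Hypothetical in-memory rows standing in for a warehouse or mart extract.
    return [
        {"age": 34, "spend": 120.0},
        {"age": 51, "spend": 80.0},
        {"age": 29, "spend": 200.0},
    ]


def process_node(records: List[Record]) -> List[Record]:
    # Example data-preparation step: keep only customers who spent 100 or more.
    return [r for r in records if r["spend"] >= 100.0]


def output_node(records: List[Record]) -> Dict[str, float]:
    # Example deliverable: a small summary of the prepared data.
    count = len(records)
    mean_spend = sum(r["spend"] for r in records) / count if count else 0.0
    return {"count": count, "mean_spend": mean_spend}


def run_stream(source: Callable[[], object], *steps: Callable) -> object:
    # Run the source, then feed each node's result into the next node.
    data = source()
    for step in steps:
        data = step(data)
    return data


summary = run_stream(source_node, process_node, output_node)
print(summary)
```

Because each node is a separate function, a prepared step such as `process_node` can be reused in another stream, which loosely mirrors how the review describes drawing on previously created work modules.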
This workflow diagram model is eminently practical because it follows the standard professional analyst’s structure, and because
the analysts trained for these positions tend to have mastered this form of structured thinking. This makes Clementine’s face
to the user a gloriously productive one. The tabbed palettes of nodes are organized in a way that dedicated analytical pros
will “get” instantly, and those who do a range of work, including analysis, will pick it up quickly.
The tabbed organization of streams, outputs, and trained models also makes it very convenient to reuse them in other projects
or export them to C code or to PMML (Predictive Model Markup Language), an XML-based language for defining and sharing predictive
models between compliant vendors’ applications.
Clementine’s work structure is supported, albeit unevenly, by real-time error messages. When you lay down nodes on the work
area, the client won’t let you connect nodes that can’t logically follow one another in a sequence, and it displays an error message
to alert you.
On the other hand, some of the run-time error messages in the otherwise thorough event log tell you that a failure occurred
without being specific enough to show what you did wrong. For that, you have to turn to Clementine’s
documentation: a beautifully executed manual and deep, linked online help with a search function and indexing.
For all its elegance, I’m relieved SPSS hasn’t claimed in its marketing or positioning documents that this software can be
an equally powerful tool for non-dedicated staff. It won’t be: The documentation is comprehensive and factual, but it doesn’t
presume to teach more than the minimum about the craft or about the statistical tests and models the platform provides. The ideal user for
this software is still the staffer whose job is dedicated to analytics and statistics.
No small commitment
On the BA side, SPSS made it easier to trigger iterative efforts by providing more visual muscle to models with graphical
cross-tabs and better visualization of cluster graphics. A data audit node and reclassification capabilities support quicker
data retuning, which in turn supports more exhaustive, iterative engagement with the analysis. A new utility, Cleo, also deploys
models to the Web for viewing and interaction.
The breadth of the Clementine platform offering makes it a big commitment. The product’s solid integration with external data
sources and its $75,000 entry price make it most appropriate for dedicated analysis groups that will make use of and master
the full platform.
Clementine is a mature platform, but it is still expanding its capabilities and moving more surely into newer techniques such as BA.
Its user base is drawing third-party products, such as Kxen’s Analytic Framework, that add even more tools to the kit. Clementine’s connections to enterprise data sources and development tools make it a
leading platform for supporting smart decisions in an economy that offers no additional margin for hiring or slack.