Clementine 10 reinforces defensive analytics
Upgrades to SPSS's BA app power anomaly detection, faster answers in graphical modeling system
When I last reviewed SPSS’s Clementine BA (business analytics) application in 2004, I found great virtue in its effective graphical interface and its capability to systemize the building and storage of analytical routines for later reuse or reassembly.
Since then, the demand for BA muscle has shifted radically from CRM toward the two U.S. economic sectors that are in steep growth: health care and national security. Clementine 10 aims squarely at meeting those sectors’ needs, but in doing so, it also opens up plenty of applications for shops in other markets.
Health care and national security need BA for targeting exceptions. Health care is looking to root out fraud or doctors who spend higher-than-average time with their patients, as well as to get ahead of the spread of epidemics or biological attacks. National security customers are looking to target threats from malicious groups accurately, among other tasks.
Clementine 10 sports new anomaly-detection analytics to address these types of issues. SPSS brought together routines so the analyst has intrinsics with which to work, greatly simplifying what would have previously required multinode construction. That functionality saves the analyst time and frees up design effort for analysis.
Adding anomaly-detection analytics amplifies productivity, as well. Analysts can now focus on anomalies when necessary, such as for fraud-detection efforts. The process also filters out the anomalous cases, which makes it useful for work such as market segmentation, where the anomalies tend to muddy and slow the identification of segments.
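To make the two uses concrete (focusing on the anomalies, or filtering them out), here is a minimal Python sketch. It is not Clementine's algorithm, which this review does not describe; it stands in a simple median-based score (the modified z-score) for whatever the product does internally. The function name, the threshold, and the sample claim amounts are all assumptions made for illustration.

```python
from statistics import median

def split_anomalies(values, threshold=3.5):
    """Split values into (normal, anomalous) using the modified z-score.

    Hypothetical helper for illustration only; not part of Clementine.
    """
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all, so nothing can be called anomalous.
        return list(values), []
    normal, anomalous = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (anomalous if score > threshold else normal).append(v)
    return normal, anomalous

# Made-up claim amounts with one obvious outlier.
claims = [120, 135, 128, 131, 126, 980, 133, 129]
normal, anomalous = split_anomalies(claims)
```

A fraud-detection effort would carry on with `anomalous`, while a segmentation job would carry on with `normal`.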
The new version also improves analyst productivity with another modeling node, Feature Selection. This node quickly delivers a list of the most useful fields to include in predictive models and the ones to filter out as chaff. It acts as an analyst on a small scale, ranking and rating each field you’ve identified as input. (You can save a Feature Selection for future reuse.)
Because of these preratings, an analyst gets to the subset of critical fields more quickly, and SPSS says that by working with fewer fields Clementine will squeeze out faster hardware processing times when answering questions. I found that Feature Selection got me to answers a lot faster, but this was mostly through saving user time rather than faster system processing.
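The idea of ranking and rating input fields can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it scores each field by the absolute Pearson correlation with the target, which may or may not resemble the criteria Feature Selection actually applies. The function names, the 0.5 cutoff, and the sample fields are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_fields(fields, target, keep_above=0.5):
    """Return (name, score, keep) tuples, strongest field first.

    Hypothetical helper; the cutoff is an assumption, not a Clementine setting.
    """
    scored = sorted(
        ((abs(pearson(vals, target)), name) for name, vals in fields.items()),
        reverse=True,
    )
    return [(name, round(score, 3), score >= keep_above) for score, name in scored]

# Made-up input fields and a made-up target.
target = [1, 2, 3, 4, 5]
fields = {
    "age": [5, 1, 4, 2, 3],
    "visits": [2, 4, 6, 8, 10],
    "region_code": [3, 3, 3, 3, 3],
}
ranking = rank_fields(fields, target)
```

Fields flagged `False` in the third position are the chaff an analyst would drop before building the model.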
I especially appreciate Clementine 10’s upgrades to time-series analysis, something that required external tools to run smoothly in prior versions. The new features will require a healthy training investment, but analysis of data over time increasingly will become the central focus of predictive analytics, which justifies the investment.
Some of 10’s additions are merely “nice to haves.” Although previous versions imported a wide range of data sources, many of them had to come piped through ODBC connections, a mechanism that requires IT to set up extra security and permissions and to support the end-users. Clementine 10, however, supports direct connection to Excel files (ranges and worksheets), saving IT as well as user time.
There was a small glitch in two of the Excel files I attached this way: Empty columns at the end of worksheets that had been cleared of their data left behind extraneous column delimiters. Not “knowing” where the worksheet ends is a common Excel brain cramp, and it’s not a giant issue, but I’d like to see SPSS clean up this irregularity within the input routine.
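Until then, the cleanup is easy to do on exported rows before they reach the modeling tool. The Python sketch below assumes the worksheet has already been read into a list of rows (each a list of cell values); it drops columns at the right edge that are blank in every row. The function name and sample rows are hypothetical, and this says nothing about how Clementine's own input routine works.

```python
def trim_trailing_empty_columns(rows):
    """Drop right-hand columns that are blank in every row.

    Hypothetical helper; blank means None or whitespace-only text.
    """
    def is_blank(cell):
        return cell is None or str(cell).strip() == ""

    width = max((len(r) for r in rows), default=0)
    # Walk in from the right edge until some row has real data in the column.
    while width > 0 and all(len(r) < width or is_blank(r[width - 1]) for r in rows):
        width -= 1
    return [list(r[:width]) for r in rows]

# Made-up rows: two cleared columns left behind at the end of the sheet.
rows = [
    ["id", "amount", "", ""],
    ["1", "20", "", None],
    ["2", "", "", ""],
]
cleaned = trim_trailing_empty_columns(rows)
```

Blank cells inside the real data (the empty amount in the last row) are kept; only the trailing all-blank columns go.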