This aligns with a view I've held a long time that I characterize as "the migration of margin." Great software companies have been built on proprietary infrastructure (namely, Oracle), but the day has passed for this kind of company. Open source is going to rule software infrastructure going forward. Where will high-margin opportunity reside? Further up the stack, particularly in verticals, where domain expertise is required and open source is poorly suited to address market requirements.
Algorithms changing business intelligence
The most interesting part of the event for me, however, was the glimpse of the future of analytics -- and it's not business intelligence, or at least BI as we've traditionally known it. Both the opening keynote, by Kaggle CEO Anthony Goldbloom, and the closing keynote, by Mike Gualtieri of Forrester, focused on predictive analytics.
You may remember the great Netflix contest, in which the company offered a big prize to anyone who could improve its recommendation engine by 10 percent or more.
The core of that effort was predictive analytics, in which an algorithm -- probably dozens or hundreds of algorithms -- is unleashed on a subset of a data collection to see if it can discern a pattern of data elements that's associated with some other interesting outcome. When a predictive algorithm is identified, it's set against another subset of the data collection to see if it can predict how that outcome turned out for the records in the second subset.
Gualtieri's example was mobile churn rates. A wireless company could examine marital status, payment pattern (early, on time, or late), usage amounts, and so on to assess whether an analysis across those elements could predict whether a subscriber was likely to terminate his or her contract. (Cynics will state the obvious: Wouldn't it be easier to offer better service as an inducement to reduce churn?)
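For readers who want to see the mechanics, here is a minimal Python sketch of that two-step process applied to the churn example. Everything in it is an assumption made for illustration: the subscriber records are synthetic, the field names are invented, and the three candidate "algorithms" are deliberately trivial one-line rules rather than anything Gualtieri or a real carrier described. The point is only the shape of the workflow: pick the best candidate on one subset of the data, then check it against a second subset it has never seen.

```python
import random


def make_records(n=200):
    """Build synthetic subscriber records (made up; not data from the article)."""
    records = []
    for i in range(n):
        late = i % 3 == 2
        noisy = i % 10 == 0  # one record in ten behaves against the pattern
        records.append({
            "marital_status": "single" if i % 2 == 0 else "married",
            "payment": ("early", "on_time", "late")[i % 3],
            "usage_minutes": (100, 300, 500, 700)[i % 4],
            "churned": late != noisy,
        })
    return records


# Hypothetical candidate "algorithms": each is a simple rule that guesses churn.
CANDIDATES = {
    "late_payer": lambda r: r["payment"] == "late",
    "single": lambda r: r["marital_status"] == "single",
    "low_usage": lambda r: r["usage_minutes"] < 200,
}


def accuracy(rule, records):
    """Fraction of records where the rule's guess matches what happened."""
    hits = sum(rule(r) == r["churned"] for r in records)
    return hits / len(records)


def fit_and_validate(records, n_train=140, seed=0):
    """Choose the best rule on a training subset, then score it on the rest."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    train, holdout = shuffled[:n_train], shuffled[n_train:]
    best = max(CANDIDATES, key=lambda name: accuracy(CANDIDATES[name], train))
    rule = CANDIDATES[best]
    return best, accuracy(rule, train), accuracy(rule, holdout)


best_rule, train_acc, holdout_acc = fit_and_validate(make_records())
print(f"best rule: {best_rule}")
print(f"accuracy on training subset: {train_acc:.2f}")
print(f"accuracy on held-out subset: {holdout_acc:.2f}")
```

The number that matters is the held-out accuracy. A rule that scores well on the training subset but falls apart on the second subset has memorized a quirk of the data rather than found a pattern that predicts anything. Real predictive analytics work swaps these toy rules for statistical models and uses far larger data sets, but the train-then-verify discipline is the same.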
An obvious extension of this process is evolving the algorithms to further improve predictive power in a process dubbed "machine learning." Kaggle, by the way, focuses on organizing and running predictive analytics competitions, and Goldbloom offered a fascinating example: Could a machine learning system evaluate student essays better than human teachers? The answer was "Yes," especially since the software had far less variance in evaluation than a pool of teachers would.