Eventually the project collapsed under the weight of its own ambitions. So don't swing for the fences, especially your first time at bat. "Set small, realistic goals, succeed with those, and begin to build from there," Deal advises.
8. Ignore the subject matter experts when building your model
It's a common misconception that to create a great predictive model, you simply insert your data into a black box and turn the crank -- and accurate predictive models just pop out. But data mining experts who take the data, go away, and come back with a model usually end up with flawed results.
That's what happened at a computer repair business that worked with Abbott Analytics. The business wanted to predict which parts a technician should bring for each service call based on the text description of the problem from the customer call record.
"It's hard to pull out key concepts from text in a way that's useful for predictive modeling because language is so ambiguous," Abbott says. The business needed a 90 percent accuracy rate in predicting a parts requirement, and the first models attempted to make predictions based on certain keywords that appeared in the text. "We created a variable for each keyword and populated it with a '1' or '0' indicating the existence of that keyword in the particular problem ticket," he says. Each ticket included the text of the customer call.
"We failed miserably," Abbott says.
So he went looking for more data -- from the technicians themselves. "The secret sauce is taking the data you have and augmenting it so that the attributes have more information in them," he says. After speaking with the domain experts, his team came up with an approach that was successful.
"Instead of having hundreds of sparsely populated variables, we condensed this into dozens more information-rich variables, each tied to the historic relationships to parts being needed," Abbott explains. Essentially, they matched up the occurrence of certain keywords in repair histories to discover what percent of the time a part had been needed.
"What we were doing was reworking the data to be more aligned with what an expert would be thinking, instead of relying just on the algorithms to pull things together. This is a trick we use a lot because the algorithms are only so good at pulling together those patterns," he says.
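The difference between the two approaches Abbott describes can be sketched in a few lines of Python. This is only an illustration, not Abbott Analytics' actual method: the tickets, keywords, and part names are invented, the tokenizer is a naive whitespace split, and the choice to summarize each part by the highest historical rate among the keywords present is an assumption made for the example.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase a ticket and split it into a set of words (naive whitespace split)."""
    return set(text.lower().split())

def keyword_indicators(text, keywords):
    """First approach: one sparse 0/1 variable per keyword."""
    words = tokenize(text)
    return {kw: int(kw in words) for kw in keywords}

def part_rates(history, keywords):
    """For each (keyword, part), the fraction of past tickets containing
    the keyword in which that part was needed."""
    seen = defaultdict(int)    # keyword -> number of tickets containing it
    needed = defaultdict(int)  # (keyword, part) -> tickets with keyword where part was used
    for text, parts_used in history:
        words = tokenize(text)
        for kw in keywords:
            if kw in words:
                seen[kw] += 1
                for part in parts_used:
                    needed[(kw, part)] += 1
    return {(kw, part): count / seen[kw] for (kw, part), count in needed.items()}

def part_features(text, keywords, parts, rates):
    """Second approach: one dense variable per part. Assumption for this sketch:
    use the highest historical rate among the keywords present in the ticket."""
    words = tokenize(text)
    present = [kw for kw in keywords if kw in words]
    return {
        part: max((rates.get((kw, part), 0.0) for kw in present), default=0.0)
        for part in parts
    }

# Hypothetical repair history: (ticket text, set of parts used on that call)
history = [
    ("screen flickers and fan is loud", {"display", "fan"}),
    ("fan is loud", {"fan"}),
    ("screen is cracked", {"display"}),
    ("fan stopped and laptop overheats", {"fan", "heatsink"}),
]
keywords = ["screen", "fan", "overheats"]
parts = ["display", "fan", "heatsink"]

rates = part_rates(history, keywords)
new_ticket = "loud fan and screen is dim"
indicators = keyword_indicators(new_ticket, keywords)
features = part_features(new_ticket, keywords, parts, rates)
print(indicators)
print(features)
```

With a realistic vocabulary the indicator table grows to hundreds of mostly zero columns, while the rate-based table stays at one column per part, each carrying what the repair history says about how often that part was needed.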
9. Just assume that the keepers of the data will be fully on board and cooperative
Many big predictive analytics projects fail because the initiators didn't cover all of the political bases before proceeding. One of the biggest obstacles can be the people who own the data, who control the data, or who control how business stakeholders can use the data. One Abbott client -- a payday lending firm, which offers short-term loans to tide people over until their next paycheck -- never got past the project kickoff meeting due to internal dissent.
"All along the way we were challenged by the IT person, who was insulted that he had not been asked to do the work," Deal says. All of the key people who were integral to the project should have been on board before the first meeting started, he says.
Then there was the case of a debt collection firm that had big plans for figuring out how to improve its success rate. Abbott attended the initial launch meeting. "The IT people had control of the data and they were loath to relinquish any control to the business intelligence and data mining groups," he says.