The 12 ways to screw up your predictive analytics project

From politics to wrong assumptions, these mistakes mean you won't need an algorithm to predict the outcome of your analytics project


"All along the way we were challenged by the IT person, who was insulted that he had not been asked to do the work," Deal says. All of the key people who were integral to the project should have been on board before the first meeting started, he says.

Then there was the case of a debt collection firm that had big plans for improving its success rate. Abbott attended the initial launch meeting. "The IT people had control of the data and they were loath to relinquish any control to the business intelligence and data mining groups," he says.

The firm spent hundreds of thousands of dollars developing the models, only to have management put the project into a holding pattern "for evaluation" -- for three years. Since by then the information would have been useless, "holding pattern" was effectively a euphemism for killing the project. "They ran the model and collected statistics on its predictions, but it never was used to change decisions in the organization, so it was a complete waste of time."

"The models were developed but never used because the political hoops weren't connected," Abbott says. So if you want to succeed, build a consensus -- and have C-suite support.

10. If you build it, they will come: Don't worry about how to serve it up

OK, you've finally got a predictive model that actually works. Now what?

Organizations often talk extensively about the types of models they want built and the return on investment they expect, but then fail to deploy those models successfully to the business.

When consultants at Elder Research ask how the business will deploy the models in the work environment, the response often is "What do you mean by deployment? Don't I just have models that are suddenly working for me?" The answer is no, says Deal.

Deployment strategies, or how the models will be used in the business environment once they are built, can range from very simple -- a spreadsheet or results list given to one person -- to very complex systems where data from multiple sources must be fed into the model.

Most organizations fall into the latter category, Deal says: They have complex processes and huge data sets that require more than just a spreadsheet or results list to make use of the output. Not only do companies have to invest in appropriate analytics software, which could cost $50,000 to $300,000 or more, but they may also need software engineering work to connect their data sources to the software that runs the models.

Finally, they may need to integrate the outputs into a visualization or business intelligence tool that people can use to read and interpret the results. "The deployment of a successful model is sometimes more work than building the model itself," he says.
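To make that concrete, here is a minimal batch-scoring sketch in Python. It assumes a scikit-learn classifier saved earlier with joblib; the file names and column names are hypothetical, and a production deployment would typically read from live data sources and push scores into a BI tool rather than pass flat files around.

```python
# Minimal batch-scoring sketch. Assumes a scikit-learn classifier was
# trained and saved earlier with joblib; all file and column names are
# hypothetical placeholders.
import joblib
import pandas as pd

MODEL_PATH = "churn_model.joblib"        # hypothetical saved model
INPUT_PATH = "new_customers.csv"         # hypothetical extract from source systems
OUTPUT_PATH = "scores_for_bi_tool.csv"   # file a BI or visualization tool can read


def score_batch(model_path: str, input_path: str, output_path: str) -> None:
    """Load a trained model, score a batch of records, and write the results."""
    model = joblib.load(model_path)
    data = pd.read_csv(input_path)

    # Keep an identifier for downstream reporting; score the remaining columns.
    ids = data["customer_id"]
    features = data.drop(columns=["customer_id"])

    results = pd.DataFrame({
        "customer_id": ids,
        "score": model.predict_proba(features)[:, 1],  # probability of the positive class
    })
    results.to_csv(output_path, index=False)


if __name__ == "__main__":
    score_batch(MODEL_PATH, INPUT_PATH, OUTPUT_PATH)
```

Even a script this small implies the plumbing Deal describes: something has to produce the input extract, run the scoring job on a schedule, and load the output where users will actually see it.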

Even then, the deployment strategy may need to be tweaked to meet the needs of users. For example, the Office of Inspector General for the U.S. Postal Service worked with Elder Research to develop a model for scoring suspicious activities for contract-fraud investigators.

At first the investigators ignored the predictive models. But the tool also gave them access to data they needed for their investigations.

Then the team decided to present the information in a more compelling way, creating heat maps that showed which contracts had the highest probability of fraud. Gradually, investigators started to appreciate the head start the scoring gave to their investigations.

Today, some 1,000 investigators are using it. It was a learning moment even for the experts at Elder Research. "We learned a lot about how people use the results, and how they develop an appreciation for the predictive models," Deal says.
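As an illustration of that kind of presentation change, the sketch below grids hypothetical fraud scores by region and contract type; the actual USPS tool plotted contracts geographically, which requires mapping data this simplified stand-in leaves out.

```python
# Simplified stand-in for the heat-map idea: aggregate hypothetical fraud
# scores by region and contract type and render them as a colored grid.
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.DataFrame({
    "region":        ["Northeast", "Northeast", "South", "South", "West", "West"],
    "contract_type": ["Supplies", "Transport", "Supplies", "Transport", "Supplies", "Transport"],
    "fraud_prob":    [0.12, 0.81, 0.35, 0.22, 0.64, 0.09],
})

# Pivot into a region-by-contract-type matrix of average fraud probability.
grid = scores.pivot_table(index="region", columns="contract_type", values="fraud_prob")

fig, ax = plt.subplots()
image = ax.imshow(grid.values, cmap="Reds", vmin=0, vmax=1)
ax.set_xticks(range(len(grid.columns)))
ax.set_xticklabels(grid.columns)
ax.set_yticks(range(len(grid.index)))
ax.set_yticklabels(grid.index)
fig.colorbar(image, ax=ax, label="Average fraud probability")
ax.set_title("Contract fraud risk by region and type")
plt.show()
```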

11. If the results look obvious, throw out the model

An entertainment-based hospitality business wanted to know the best way to recover high-value, repeat customers who had stopped coming. Abbott Analytics developed a model showing that 95 percent of those customers would come back on their own.

"The patterns the model found were rather obvious for the most part. For example, customers who had been coming to the property monthly for several years but then stopped for a few months usually returned again" without any intervention, Abbott says.

The business quickly realized that it didn't need the model to predict what offers would get those customers back -- they expected to recover them anyway -- while the other 5 percent weren't likely to come back at all. "But models can be tremendously valuable if they identify who deviates from the obvious," Abbott says.

Rather than stop there, he suggested that they focus on the substantial number of high-value former customers who the model had predicted would return, but didn't. "Those were the anomalies, the ones to treat with a new program," Abbott says.

"Since we could predict with such high accuracy who would come back, someone who didn't come back was really an anomaly. These were the individuals for whom intervention was necessary."

But the business faced another problem: It had no customer feedback on why those customers might have stopped coming, and the models could not explain why the business had failed to recover them. "They're going to have to come up with more data to identify the core cause of why they're not returning," Abbott says. Only then can the business start experimenting with emails and offers that address that reason.

12. Don't define clearly and precisely, within the business context, what the models are supposed to do

Abbott once worked on a predictive model for a postal application that needed to predict the accuracy of bar codes it was reading. The catch: The calculation had to be made within 1/500 of a second so that an action could be taken as each document passed through the reader.

Abbott could have come up with an excellent algorithm, but it would have been useless if it couldn't produce the desired result in the timeline given. The model not only needed to make the prediction, but had to do so within a specific time frame -- and that needed to be included in defining the model. So he had to make trade-offs in terms of the algorithms he could use. "The models had to be very simple so that they met the time budget, and that's typical in business," he says.
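As a hedged illustration of checking such a budget: the 1/500-second (2 ms) limit comes from the story above, while the model and data below are stand-ins. In practice the timing would be measured on the production hardware and scoring path, not a quick benchmark like this one.

```python
# Check that a simple model can score a single record within a 2 ms budget.
# The model, features, and training data are illustrative stand-ins.
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

TIME_BUDGET_SECONDS = 1 / 500  # 2 ms per document, per the requirement above

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 10))
y_train = rng.integers(0, 2, size=1000)

model = LogisticRegression().fit(X_train, y_train)

record = rng.normal(size=(1, 10))
start = time.perf_counter()
model.predict_proba(record)
elapsed = time.perf_counter() - start

verdict = "within" if elapsed <= TIME_BUDGET_SECONDS else "over"
print(f"Scoring took {elapsed * 1000:.3f} ms ({verdict} the 2 ms budget)")
```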

The model has to fit the business constraints, and those constraints need to be clearly spelled out in the design specification. Unfortunately, he adds, this kind of thinking often doesn't get taught in universities. "Too many people are just trying to build good models but have no idea how the model actually will be used," he says.

Bottom line: Failure is an option
If, after all of this, you think predictive analytics is too difficult, don't be afraid, consultants advise. Abbott explains the consultants' mindset: "You make mistakes along the way, you learn and you adjust," he says. It's worth the effort, he adds. "These algorithms look at data in ways humans can't and help to focus decision making in ways the business wouldn't be able to do otherwise."

"We get called a lot of times after people have tried and failed," says Elder. "It's really hard to do this right. But there's a lot more that people can get out of their data. And if you follow a few simple principles you can do well."

Robert L. Mitchell is a national correspondent for Computerworld. Follow him on Twitter at twitter.com/rmitch, or email him at rmitchell@computerworld.com.


This story, "The 12 ways to screw up your predictive analytics project" was originally published by Computerworld.

Copyright © 2013 IDG Communications, Inc.
