Your company is getting pumped about AI—but is your data ready?

Before you can implement AI, it’s important to get your data in order. Here are some steps to accomplish that goal.


In the coming year, more and more companies will make the move to deploy some type of AI solution—whether it is a chatbot, machine learning functionality, or deep learning application. You may have educated yourself on the latest innovations in AI, reviewed different solutions on the market and educated your company on how AI can improve your decision making, automate key processes and improve efficiency. But despite all this planning, the million-dollar question remains: Is your data ready for AI?

Last May, The Economist stated that “the most valuable resource is no longer oil, but data.” All applications of AI require lots of data to create, test and train algorithms. Without the proper data, AI has no real chance of success; data is the lifeblood of effective algorithms.

For example, if you run a chain of sporting goods stores, you probably already know how much of each item to stock and how many sales associates you need in each store every day. But what about data on the likelihood that a storm will keep shoppers away, or that a home team will win the Super Bowl and drive up demand for team apparel in a particular store? Having this type of data would make an AI algorithm far more precise. AI would let you find correlations that would be impossible to find manually.

So, before a company takes the first step to AI implementation, it’s important that it gets its data in order. Below are steps on how you can effectively accomplish this.

Take a data audit

The first step is to assess what data you have. Even when a company has a central database, siloed data resides all over the place. From the finance department to the sales team, and the marketing department to human resources, endless databases and Excel spreadsheets of data abound. Adding to the overload, data often resides in cloud applications used by line-of-business users without central IT’s knowledge. It’s important to take the time to determine where all your data is, what type of data you have and what additional data you may need to make smarter algorithms.
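
As a rough illustration of the inventory step, the sketch below walks a shared folder and counts the data files it finds by type. The `inventory` function name and the list of file extensions are assumptions made for this example; a real audit would also have to cover databases and cloud applications, which a file scan cannot see.

```python
from collections import Counter
from pathlib import Path

# Hypothetical list of file types to treat as "data" for the audit.
DATA_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".db"}


def inventory(root):
    """Count data files under root, grouped by lowercased file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in DATA_EXTENSIONS:
            counts[suffix] += 1
    return dict(counts)
```

Running `inventory` on each department's shared drive gives a first, coarse picture of how many spreadsheets and exports exist outside the central database.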

Assess the data

But having lots of data is not enough; it needs to be quality data to be meaningful. In fact, IBM estimates that bad data is costing organizations some $3.1 trillion a year in the US alone. Once you know where your data has been hiding, it’s time to assess what you have. For example, you may have tons of contacts in a CRM system, but many records may be missing vital information such as email addresses or phone numbers. Names may have been entered incorrectly, or the system may contain many duplicate entries.
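
The two problems described above, missing fields and duplicate entries, can be measured with a few lines of code. This is a minimal sketch, assuming the CRM contacts have been exported as a list of dictionaries; the sample records, the field names and the helper names `missing_rate` and `duplicate_keys` are all hypothetical.

```python
from collections import Counter

# Hypothetical CRM export; the records and field names are made up.
contacts = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "Ann Lee", "email": "ann@example.com", "phone": ""},
    {"name": "Bob Roy", "email": "", "phone": None},
]


def missing_rate(records, field):
    """Fraction of records where field is absent, None, or blank."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if not str(r.get(field) or "").strip())
    return missing / len(records)


def duplicate_keys(records, field):
    """Non-blank values of field that appear in more than one record."""
    counts = Counter(str(r.get(field) or "").strip().lower() for r in records)
    return sorted(key for key, n in counts.items() if key and n > 1)
```

Here `missing_rate(contacts, "phone")` reports that two of the three sample records lack a phone number, and `duplicate_keys(contacts, "email")` flags the email address that appears twice.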

Clean the data

Once you determine the state and quality of your data, you need to scrub it to eliminate duplicates, correct misspellings and add in the missing information. Additionally, you may need to normalize some of the data fields to make sure you can later aggregate the data. Exceptions that won’t fit the business workflow need to be accounted for and maintained in a specific location. You don’t want your exception data to cause problems when building the predictive algorithms.
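
The scrubbing steps can be sketched as follows. This is an illustrative example only, not a full cleaning pipeline: it assumes every field value is a string or None, treats the email address as the matching key, and leaves spelling correction out. The function names `normalize` and `clean` are hypothetical.

```python
def normalize(record):
    """Return a copy with whitespace trimmed and the email lowercased."""
    cleaned = {key: (value or "").strip() for key, value in record.items()}
    cleaned["email"] = cleaned.get("email", "").lower()
    return cleaned


def clean(records):
    """Split records into deduplicated rows and exceptions.

    Records without an email cannot be matched against others, so they
    are set aside as exceptions rather than dropped.
    """
    merged, exceptions = {}, []
    for record in map(normalize, records):
        key = record["email"]
        if not key:
            exceptions.append(record)
            continue
        existing = merged.setdefault(key, record)
        # Fill blanks in the first-seen record from later duplicates.
        for field, value in record.items():
            if not existing.get(field):
                existing[field] = value
    return list(merged.values()), exceptions
```

Keeping the exceptions in their own list mirrors the advice above: they stay available for review without feeding into the data used to build predictive algorithms.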

Marry it to main data

Once you have quality data, you need to ensure it’s in a centralized location so that the data can be managed, and quality can be maintained. The CIO must have control of the data to be effective. As the lifeblood of not only sound algorithms, but overall business success, the data needs to be treated as a corporate treasure. Corporate protocols and processes need to be followed so that all data is captured, structured, and maintained.

Getting data ready should occur long before AI is implemented, perhaps a year or more in advance depending on the health of your data. But once you’ve gotten your data house in order, it’s time for the AI implementation process to begin. The first step is to ask yourself what business problem you hope to solve, what key performance indicators (KPIs) are important to your business, and how you will measure them. Answers to these questions will drive you right back to the data and let you know if you have what you need.

Some companies will simply never have rich, insightful data in the quantities needed to drive quality algorithms. To meet growing demand, insight-as-a-service is emerging as a way to combine structured and unstructured data, both external and internal, into actionable analytics. By using insight-as-a-service, companies can bring their AI applications up to speed and relevance more quickly than if they started from scratch.

New training data sets are enabling advanced machine learning and deep learning applications that start out smart and get smarter over time to uncover hidden insights and help you make better business decisions. Feeding them quality, relevant data, and lots of it, is the key to success.

Copyright © 2017 IDG Communications, Inc.