Embrace and extend Excel for AI data prep

Combining machine learning and Excel can get you the data transformation you need while data scientists are scarce.

A road to enterprise AI through Excel

Sometimes the best IT solution is the one you already have. Not always, of course: Cloud infrastructure, for example, tends to yield much more flexibility and choice than private data centers. Unless you’re Hey!, in which case you’ll make the argument that a private data center is the right way to go.

The key, as my colleague David Linthicum has stressed, is not to indulge in “buzzword-oriented architecture,” wherein enterprises might “spend twice as much modernizing a workload that didn’t really need to be containerized, all because someone wanted to put containers on their CV.”

The problem isn’t containers. Or cloud. Or [insert hot tech du jour here]. No, the problem is wildly applying industry buzzwords to a business problem rather than letting the business problem dictate the solution.

Given how frantic enterprises are to apply magic machine learning pixie dust to their business challenges, machine learning and artificial intelligence (ML/AI) is one area where it pays to be thoughtful. With ML/AI talent in short supply, it’s worth seeing how you can better use the people already employed by your company, rather than praying you’ll be able to hire a data scientist to magically uncover insights in your data. One promising approach is to lean on the world’s most popular data tool to get data ready for machine learning models. Yes, I’m talking about Excel.

Seeing beyond the ChatGPT hype

New advances in artificial intelligence are opening up opportunities for millions of people to start creating content of all kinds through machine learning, from code to copy to art. Since its public release in November 2022, ChatGPT has hogged headlines around the world and led to a rush of business applications, along with many examples of abusive ChatGPT feedback, fears of cheating on essays and exams, and more.

Google has come out with a Chrome extension called GPT for Sheets, which allows users to manipulate data with conversational language; Microsoft says it will integrate ChatGPT into all of its products, with Bing first. Microsoft recently invested $10 billion in OpenAI, the creators of ChatGPT.

But as exciting (and sometimes disappointing) as ChatGPT applications may be, there’s a much more mundane—and promising—approach to machine learning that’s already available.

Excel jockeys, start your ML engines

I’ve written before about Akkio, a machine learning company that combines no code and AI, and how Democrats turned the tool into a money-printing machine in the 2022 election cycle. Akkio has launched Chat Data Prep, a cool new machine learning platform that allows users to transform data using ordinary conversational language. The technical term is natural language processing, but the less buzzwordy way of thinking about it is that it can transform how Excel users work and enable them to embrace the promise of AI much more easily.

An estimated 750 million people worldwide use Excel. Microsoft CEO Satya Nadella has proclaimed Excel the company’s most important consumer product. Turning Excel into a machine learning power tool could go a long way toward making machine learning something ordinary enterprise employees can finally tap into.

“One of the things we were trying to figure out was how to build all the transformations you need on your data to use AI, even on our easy no-code ML platform,” said Akkio cofounder Jonathan Reilly in an interview. “Then we realized we could just use ML to accomplish this task. No organization wants financial planning people spending their time importing and exporting and manipulating data—they want them to focus on what the data is telling them.”

Akkio’s new feature lets users simply type in conversational language to make changes to their spreadsheet data. Leveraging AI and large language models, the platform interprets the user’s requests and makes the necessary changes to the data. It’s surprisingly easy. See for yourself at Akkio’s online demo (not gated).

Data power to the people

Why does this matter? You may be paying data scientists six figures to put your data to work, but most of their time is spent on data transformation, aka data wrangling. This is the technical process of converting data from one format, standard, or structure to another, without changing the content of the data sets, in order to prepare it for consumption by a machine learning model. Data prep is the equivalent of janitorial work, albeit incredibly important work. Transformation increases the efficiency of business and analytic processes, and it enables businesses to make better data-driven decisions. But it’s difficult and time-consuming unless the user is familiar with Python or the popular query language SQL.

Data transformation involves several steps, starting with data cleaning (converting data types and removing unnecessary characters). Here is a hypothetical example of the transformations someone who knows SQL or Python might make to harmonize multiple data sets for use in a machine learning model:

Transform year of birth to “Age”

Subtract Year_Birth from the current year.
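For a sense of what that looks like in code, here is a minimal Python sketch using only the standard library. The row layout (a list of dictionaries) and the helper name `add_age` are assumptions made for illustration; only the `Year_Birth` and `Age` field names come from the example above.

```python
from datetime import date


def add_age(rows, current_year=None):
    """Return copies of the rows with an "Age" field derived from "Year_Birth".

    add_age is a hypothetical helper, not part of any product or library.
    """
    if current_year is None:
        current_year = date.today().year
    # Age is the current year minus the year of birth.
    return [{**row, "Age": current_year - row["Year_Birth"]} for row in rows]


# Made-up sample rows; the year is pinned so the result is reproducible.
customers = [{"Year_Birth": 1980}, {"Year_Birth": 1995}]
with_age = add_age(customers, current_year=2023)
print(with_age)
```

Pinning `current_year` is a design choice for repeatable results; leave it out and the sketch falls back to today’s year.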

Transform the date customer enrolled (“Dt_Customer”) into “Enrollment_Length”

This is similar to the transformation above, with the added step of extracting the year from the enrollment date.
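A comparable sketch for the enrollment calculation follows. It assumes the dates arrive as text in `YYYY-MM-DD` form, which the example does not specify, and `add_enrollment_length` is again a hypothetical helper name.

```python
from datetime import date, datetime


def add_enrollment_length(rows, current_year=None, date_format="%Y-%m-%d"):
    """Return copies of the rows with "Enrollment_Length" in whole years.

    Assumes "Dt_Customer" is a string matching date_format (an assumption).
    """
    if current_year is None:
        current_year = date.today().year
    result = []
    for row in rows:
        # Extract the year part of the enrollment date, then subtract.
        enrolled_year = datetime.strptime(row["Dt_Customer"], date_format).year
        result.append({**row, "Enrollment_Length": current_year - enrolled_year})
    return result


# Made-up sample rows; the year is pinned so the result is reproducible.
enrollments = [{"Dt_Customer": "2012-09-04"}, {"Dt_Customer": "2014-03-08"}]
with_length = add_enrollment_length(enrollments, current_year=2023)
print(with_length)
```

If your dates use another layout, such as day/month/year, pass a different `date_format` string.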

Transform currency (“Income”) into numbers (“Income_M$”)

This involves four steps:

  1. Clean the data by removing the characters “,” “$” and “.”
  2. Replace null values with 0
  3. Convert the string into an integer
  4. Scale the numbers down to millions of dollars, which helps with visualizing the data distribution.
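Those four steps might be sketched in Python roughly as follows. This assumes income values are whole-dollar strings such as “$84,835” or missing (`None`); the function name `income_to_millions` is made up for illustration. One caveat: if the strings carried cents (“$84,835.00”), stripping the period as described would inflate the value a hundredfold, so the decimal part would need to be dropped first.

```python
def income_to_millions(raw):
    """Convert a currency string like "$84,835" to millions of dollars.

    Hypothetical helper; assumes whole-dollar strings or None.
    """
    # Null handling comes first here so None never reaches string methods.
    if raw is None:
        return 0.0
    # Remove the characters "," "$" and "." from the text.
    cleaned = raw.strip().translate(str.maketrans("", "", ",$."))
    if cleaned == "":
        return 0.0
    # Convert the cleaned string into an integer number of dollars.
    dollars = int(cleaned)
    # Scale down to millions of dollars.
    return dollars / 1_000_000


# Made-up sample rows for illustration.
incomes = [{"Income": "$2,500,000"}, {"Income": "$84,835"}, {"Income": None}]
scaled = [{**row, "Income_M$": income_to_millions(row["Income"])} for row in incomes]
print(scaled)
```

The null check runs before the cleaning step rather than after it, a small reordering of the list above that keeps the function from failing on missing values.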

And on and on.

Not many of Excel’s three-quarters of a billion users have even these basic programming chops. But any one of them can type a simple request in ordinary English, and Chat Data Prep will do the heavy lifting of data transformation. With a simple conversational command, users can reformat dates, perform time-based math operations, and even fix messy data fields. The tool also provides a preview of the results so you can check that the output is what you wanted. Akkio claims that Chat Data Prep results in a 10-fold reduction in the time it takes to prepare data for analysis.

Making data analysis more accessible, efficient, and accurate is one of the mundane magic tricks that AI makes increasingly possible, quietly and behind the scenes. ChatGPT will get the headlines, but your Excel users just might do the heavy lifting of machine learning transformation within the enterprise.

Copyright © 2023 IDG Communications, Inc.