Demystifying machine learning

Businesses need more than a machine learning engine or framework to successfully extract valuable, actionable insights from their data, and not all machine learning solutions are created equal


Last year was massive for technologies leveraging artificial intelligence and machine learning. AI is a hot topic. Without realizing it, many of us now interact with AI on a daily, if not an hourly, basis.

Yet AI and robotics aren’t new. For instance, robotics has been used in manufacturing for decades, and Siri has been telling us bad jokes for nearly seven years. However, there is confusion around what constitutes AI and machine learning, and the capabilities of each of these technologies. So, let me define what AI and machine learning are, and see if it helps demystify the technologies and clear some of the fog that exists in the market.

Artificial intelligence vs. machine learning

AI is a platform or a solution that appears to be intelligent and can often exceed the performance of humans. It is a broad description of any device that mimics human physical or intellectual functions, such as mechanical movement, reasoning, or problem solving.

Machine learning is a statistical, data-driven approach to creating AI, in which a computer program learns from data to improve its performance. As a result, machine learning depends on data, and with these approaches the quality of the data, and the process that produced it, is vital to the success of the model.

Yet there is often a tendency to inaccurately conflate machine learning and artificial intelligence. Machine learning is undoubtedly a key subset of artificial intelligence, but machine learning alone is not artificially intelligent. While most people are familiar with machine learning applications like recognizing and mimicking human speech or identifying a person in a photo, AI is a much broader field that extends well beyond these familiar applications.

Not all rules are the same

One of the concepts that seems to cause the most confusion in the new world of AI is rules. Rules are used in many ways across many enterprise platforms, and particularly with machine learning.

Take the example of a decision tree. This is a rule-based machine learning method, as the model builds a set of rules that depict the classification path. Beyond such models, experts often use rules to build a taxonomy and a natural language processing (NLP) information space over documents. This is something my team has been very successful with, because rule-based NLP is entirely predictable and accurate when it matches. It allows the system to perform functions such as validation and normalization, as with analyzing dates. This, in turn, allows external systems to use the data detected as text within reports and processes.
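As a minimal sketch of the rule-based date validation and normalization described above, the following assumes two illustrative date patterns (not an exhaustive rule set): each rule either matches and yields a normalized ISO date, or the match fails validation and is discarded.

```python
import re
from datetime import datetime

# Illustrative rules only: each pairs a regex with the strptime format
# used to validate and normalize whatever the regex matches.
DATE_RULES = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "%m/%d/%Y"),
    (re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b"),
     "%B %d, %Y"),
]

def normalize_dates(text):
    """Extract dates matching known rules and normalize them to ISO 8601."""
    results = []
    for pattern, fmt in DATE_RULES:
        for match in pattern.finditer(text):
            try:
                parsed = datetime.strptime(match.group(0), fmt)
                results.append(parsed.date().isoformat())
            except ValueError:
                # The rule matched but validation failed, e.g. "13/40/2020".
                continue
    return results
```

Because the rules are deterministic, a downstream system can rely on every returned value being a valid, consistently formatted date.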

Extractions, however, are only useful if they can be used in other processes. This can be seen with the best enterprise analytics applications, where a platform extracts the terms from within the physical documents and then normalizes them to allow data from other systems to be compared. By doing this at scale, companies and organizations—especially the enterprise—can save millions of dollars on projects like renegotiations of payment terms or lease agreements.

Combining methods of extraction

Training is another important area for understanding a platform's baseline capabilities. A common debate in the artificial intelligence space concerns which machine learning methods to use and whether one method is better than another.

For me, the best approach is always to combine multiple methods: a single method will rarely perform as well as a well-chosen combination. In my experience, the use of multiple methods, and the ability to choose the right combinations, is what sets an AI platform apart.

For example, the addition of long short-term memory (LSTM) layers to deep neural networks has allowed for significant improvements in the detection and classification of text and speech. Companies working in this area have noted an estimated 7 percent annual improvement, where previously they had managed no more than 2 percent each year.

Another example is the combination of multiple models and methods in a learning framework. An ensemble, or ensemble learning, is a method that combines weak classification methods to produce a strong, accurate extraction model. The framework can then use rules (or decision trees) to select the best overall extraction.
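As a toy illustration of the ensemble idea (not any particular product's implementation), the sketch below combines three deliberately weak, hand-written rules by majority vote; the rules and the clause-heading task are invented for the example.

```python
# Three weak signals for whether a line of contract text looks like a
# clause heading. Each is unreliable alone; the majority vote is stronger.
def starts_upper(s):
    return s[:1].isupper()          # headings tend to be capitalized

def is_short(s):
    return len(s.split()) <= 5      # headings tend to be short

def no_period(s):
    return not s.endswith(".")      # headings rarely end with a period

ensemble = [starts_upper, is_short, no_period]

def vote(classifiers, x):
    """Combine weak classifiers by majority vote over their predictions."""
    predictions = [clf(x) for clf in classifiers]
    return max(set(predictions), key=predictions.count)
```

For instance, `vote(ensemble, "Termination for Convenience")` returns True, while a full sentence of body text is voted down even if one weak rule fires.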

Using the above methods in my company’s framework has allowed users to select the right model for their needs, combining natural language processing, different machine learning algorithms, and latent semantic indexing to detect and extract information in whatever way works best. I firmly believe that using the right methods, rules, and processes, in the right combination, is key to achieving extraction goals.

There is another factor in training that is almost always disregarded by data analytics solution providers: standard deviation. In machine learning terms, the standard deviation measures how consistent a model’s or method’s extractions are, and therefore how trustworthy and reliable they are. When you talk about trusting a model, you expect it to have a low standard deviation.

A good model, like any statistical function, needs data, but it also needs an appropriate amount of data before the swings in its learning are smoothed out. This is called the learning curve, and it typically results in a gradual reduction of the standard deviation.
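The learning-curve effect can be simulated in a few lines. The sketch below assumes a hypothetical model with a fixed underlying accuracy and shows that repeated measurements of that accuracy swing less, i.e. have a lower standard deviation, as the evaluation sample grows.

```python
import random
import statistics

random.seed(42)
TRUE_ACCURACY = 0.85  # assumed underlying accuracy of a hypothetical model

def measured_accuracy(n_samples):
    """Estimate accuracy from n_samples simulated predictions."""
    hits = sum(random.random() < TRUE_ACCURACY for _ in range(n_samples))
    return hits / n_samples

def spread(n_samples, trials=200):
    """Standard deviation of the accuracy estimate across repeated runs."""
    return statistics.stdev(measured_accuracy(n_samples) for _ in range(trials))
```

With 20 samples per run, `spread(20)` is many times larger than `spread(2000)`: the same model looks erratic on little data and stable on a lot, which is exactly the smoothing of the learning curve described above.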

Most data is domain-specific. Applying a model trained on legal or finance data to a different field will change the standard deviation and degrade the model’s performance. Therefore, it is critical to either provide localized models for each domain or have enough data across all domains to support a single model or ensemble. This becomes harder the more disparate the data within a domain is, which is why data scientists offer both fast learning methods that can be seen as weak and slower learning methods that generalize better over the data.

Machine learning does not work in isolation

A machine learning engine cannot on its own deliver the results that businesses require; it is the engine together with the components of the broader platform that meets the precision and recall objectives for a task. This usually involves several technologies and techniques working together:

  • Natural language processing (NLP) to enable the system to understand written language and process it in the machine learning engine.
  • Latent semantic indexing (LSI) for identifying and extracting information not presented in standard terms or language but that exists through associations of words or phrases or in different locations in a document.
  • The use of deep learning methods to increase performance of the machine learning engine.
  • The use of active learning to simplify training and automatically select the best model and hyperparameters for any given data, with users only required to select the text to train on.
  • Built-in document review capabilities for efficient side-by-side review and comparison of clauses and language.
  • Extensive reporting and data visualization to be able to easily draw actionable insight from the data.
  • Automatic discovery and linkage of related documents such as amendments to master agreements.
  • Simplicity within the UI for information layering and normalization, to allow the machine learning framework to effectively use all available information and to allow users and engineers to quickly find and prepare it for use.
  • A logic engine to assess the extractions and generate pseudo-information, filtering, and normalization.

If my experience has taught me anything, it’s that businesses need more than a machine learning engine or framework to successfully extract valuable, actionable insights from their data, and that not all machine learning solutions are created equal, particularly when it comes to contractual documents and paper.

Much of the confusion in the market stems from the broad depth of functionality in AI platforms that drive real business transformation. These are concepts people naturally find overwhelming and hard to comprehend.

So, the next time you look at artificial intelligence, you can avoid the confusion by taking a closer look at what AI means and asking probing questions about the data. As the industry grows and evolves, the technology will evolve too.

This article is published as part of the IDG Contributor Network.