Machine learning is undergoing a revolution because of new technologies and methods. Machine learning is a process of using a program to develop capabilities, such as the ability to tell spam from desirable email, by analyzing data instead of programming the exact steps. This frees the user from needing to make every decision about how the algorithm functions. Machine learning is a powerful tool, not only because it can take over the tedious programming steps that occupy over a million people every day, but also because it sometimes finds better solutions than humans engaged in manual effort.
Machine learning has applications in most industries, where it presents a great opportunity to improve upon existing processes. However, many businesses are struggling to keep up with the innovations. Finding skilled data scientists is difficult, yes, but the skills shortage does not tell the whole story, particularly for organizations that have made investments but not realized their potential. The most significant obstacles are related to a gap between data scientists with the skills to implement the methods and business leaders who can drive necessary organizational changes.
Making machine learning successful in an organization requires a holistic strategy that involves specialists and non-specialists alike. It requires focusing the organization, analyzing business cases to determine where machine learning can add value, and managing the risks of a new methodology. For example, a data science team may be interested in using machine learning but choose not to do so because of time constraints, risk aversion, or lack of familiarity. In these situations, a better approach may be to create a separate project, with a focus on creating a foundation for future projects. Once the organization has working examples of machine learning, the bar for future implementations is significantly lower.
The implication is that non-specialists in the organization need to participate in the machine learning vision to make it a success, and this starts with a common understanding. Learning the analysis and math behind data science takes years, but it is important for business leaders, analysts, and developers to at least understand where to apply the technology, how it is applied, and its basic concepts.
Using machine learning requires a different way of approaching a problem: You let the machine learning algorithm solve the problem. This is a shift in mindset for people familiar with thinking through functional steps. It takes some trust that the machine learning program will produce results and an understanding that patience may be required.
Machine learning and deep learning
Why is machine learning so powerful? There are many different processes (facilitated by algorithms) for making machine learning work, which I will discuss in detail below, but the ones at the leading edge use neural networks, which share a structure similar to that of a biological brain. Neural networks have multiple layers of connectivity, and a network with many complex layers is called a deep neural network.
Deep neural networks had limited success until recently, when scientists took advantage of the GPU commonly used for displaying 3D graphics. They realized that GPUs have a massive amount of parallel computing power and used them to train neural networks. The results were so effective that incumbents were caught off guard. The process of training a deep neural network is known as deep learning.
Deep learning came of age in 2012 when a Canadian team entered the first GPU-trained neural network algorithm into a leading image recognition contest and beat the competition by a large margin. The next year, 60 percent of the entries used deep learning, and the following year (2014), almost every entry used it.
Since then, we have seen some remarkable success stories come out of Silicon Valley, giving companies like Google, Amazon, PayPal, and Microsoft new capabilities to serve their customers and understand their markets. For example, Google applied machine learning from its DeepMind unit to reduce the energy needed for cooling its data centers by 40 percent. At PayPal, deep learning is used to detect fraud and money laundering.
Outside this center of gravity there have been some other success stories. For example, the Icahn School of Medicine at Mount Sinai leveraged Nvidia GPUs to build a tool called Deep Patient that can analyze a patient’s medical history to predict nearly 80 diseases up to one year prior to onset. The Japanese arm of the insurance company AXA was able to increase its prediction rate of auto accidents from 40 percent to 78 percent by applying a deep learning model.
Supervised learning and unsupervised learning
At a basic level there are two types of machine learning: supervised and unsupervised learning. Sometimes these types are broken down further (e.g., semi-supervised and reinforcement learning), but this article will focus on the basics.
In the case of supervised learning, you train a model to make predictions by passing it examples with known inputs and outputs. Once the model has seen enough examples, it can predict a probable output from similar inputs.
For example, if you want a model that can predict the probability that someone will suffer a medical condition, then you would need historical records of a random population of people where the records indicate risk factors and whether they suffered from the condition. The results of the prediction can’t be better than the quality of the data used for training. A data scientist will often withhold some of the data from the training and use it to test the accuracy of the predictions.
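The practice of withholding data for testing can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption made for illustration: the records are synthetic, there is a single made-up risk factor, and the "model" is just a cut-off value chosen on the training records.

```python
import random

def make_records(n, seed=0):
    """Synthetic (risk, had_condition) pairs; purely illustrative data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        risk = rng.random()                  # hypothetical risk factor in [0, 1)
        had_condition = rng.random() < risk  # higher risk, more likely to be True
        records.append((risk, had_condition))
    return records

def split(records, test_fraction=0.2, seed=1):
    """Shuffle the records and withhold a fraction of them for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def accuracy(threshold, records):
    """Fraction of records where 'risk >= threshold' matches the known outcome."""
    hits = sum((risk >= threshold) == label for risk, label in records)
    return hits / len(records)

def train_threshold(train):
    """Pick the cut-off that classifies the training records best."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: accuracy(t, train))

records = make_records(1000)
train, test = split(records)
threshold = train_threshold(train)
print(f"chosen threshold: {threshold:.2f}")
print(f"accuracy on training data: {accuracy(threshold, train):.3f}")
print(f"accuracy on withheld data: {accuracy(threshold, test):.3f}")
```

The number worth reporting is the accuracy on the withheld records, because the model never saw them while the threshold was being chosen.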
With unsupervised learning, you want an algorithm to find patterns in the data and you don’t have examples to give it. In the case of clustering, the algorithm would categorize the data into groups. For example, if you are running a marketing campaign, a clustering algorithm could find groups of customers that need different marketing messages and discover specialized groups you may not have known about.
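To make clustering concrete, here is a small sketch of k-means, a commonly used clustering algorithm, in plain Python (3.8 or later for math.dist). The customer data, the two features, and the choice of starting centroids are all assumptions for illustration. A real project would scale the features and use a more careful initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = points[:k]  # naive initialization with the first k points (an assumption)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (store visits per month, average basket value)
customers = [(1, 10), (9, 80), (2, 12), (8, 85), (1, 15), (10, 90)]
centroids, clusters = kmeans(customers, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are supplied. The two groups (occasional small-basket shoppers and frequent large-basket shoppers) emerge from the distances between the records alone.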
In the case of association, you want the algorithm to find rules that describe the data. For example, the algorithm may have found that people who purchase beer on Mondays also buy diapers. With this knowledge you could remind beer customers on Mondays to buy diapers and try to upsell specific brands.
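Association rules are usually judged by two numbers: support (how often the items appear together across all transactions) and confidence (how often the second item appears when the first one does). The sketch below computes both for a beer-and-diapers rule. The baskets are made up, and the helper name rule_stats is hypothetical.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Made-up Monday transactions
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "bread"},
    {"milk", "diapers"},
    {"milk", "bread"},
]

support, confidence = rule_stats(baskets, "beer", "diapers")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

An association algorithm would search over many candidate rules and keep the ones whose support and confidence exceed chosen thresholds. This sketch only scores a single rule.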
As I noted above, machine learning applications take some vision beyond an understanding of math and algorithms. They require a joint effort between people who understand the business, people who understand the algorithms, and leaders who can focus the organization.
The machine learning workflow
The implementation of a machine learning model involves a number of steps beyond simply executing the algorithm. For the process to work at the scale of an organization, business analysts and developers should be involved in some of the steps. The workflow is often referred to as a lifecycle and can be summarized with the following five steps. Note that some steps don’t apply to unsupervised learning.
- Data collection: For deep learning to work well, you need a large quantity of accurate and consistent data. Sometimes data needs to be gathered and related from separate sources. Although this is the first step, it is often the most difficult.
- Data preparation: In this step, an analyst determines what parts of the data become inputs and outputs. For example, if you are trying to determine the probability that a customer will cancel a service, then you would join separate sets of data together, pick out the relevant indicators that the model would need, and clear up ambiguities in those indicators.
- Training: In this step, specialists take over. They choose the best algorithm and iteratively tweak it while comparing its predicted values to actual values to see how well it works. Depending on the type of learning, this comparison tells you what level of accuracy to expect. In the case of deep learning, this step can be computationally intensive and require many hours of GPU time.
- Inference: If the objective is for the model to make a prediction (e.g., supervised learning), then the model can be deployed so that it responds quickly to queries. You give it the same kind of inputs that you selected during data preparation, and it returns a prediction as the output.
- Feedback: This is an optional step, where information from the inferencing is used to update the model so its accuracy can be improved.
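The five steps above can be sketched end to end in a few lines of plain Python, using the customer cancellation example. This is a toy under stated assumptions: the customer ids, the two indicators, and the nearest-centroid model are all hypothetical stand-ins, and a real workflow would scale the inputs and validate the model before deploying it.

```python
# Step 1, data collection: hypothetical sources keyed by customer id.
usage = {"c1": 40, "c2": 3, "c3": 35, "c4": 5}    # logins per month
tickets = {"c1": 0, "c2": 4, "c3": 1, "c4": 6}    # support tickets filed
cancelled = {"c1": False, "c2": True, "c3": False, "c4": True}

def prepare(usage, tickets, cancelled):
    """Step 2, data preparation: join the sources into (inputs, output) rows."""
    ids = sorted(usage.keys() & tickets.keys() & cancelled.keys())
    return [((usage[i], tickets[i]), cancelled[i]) for i in ids]

def train(rows):
    """Step 3, training: average the inputs per class (a nearest-centroid model)."""
    model = {}
    for label in (True, False):
        inputs = [x for x, y in rows if y == label]
        model[label] = tuple(sum(col) / len(inputs) for col in zip(*inputs))
    return model

def predict(model, x):
    """Step 4, inference: return the class whose centroid is closest to x."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

rows = prepare(usage, tickets, cancelled)
model = train(rows)

new_customer = (4, 5)  # few logins, many tickets
prediction = predict(model, new_customer)
print("predicted to cancel:", prediction)

# Step 5, feedback: once the real outcome is known, add it and retrain.
rows.append((new_customer, True))
model = train(rows)
```

Note that inference takes the same pair of indicators that data preparation produced, which is why the preparation step has to be repeatable in production.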
The example below shows parts of a workflow for a supervised learning model. A big data store on Kinetica, a GPU-accelerated database, contains the training data, which a model accesses through the ML features of the database as part of the learning step. The model is then deployed to a production system where an application requests low latency responses. The data from the application is added to the set of training data to improve the model.
Using the right platform for analytics is also important, because some machine learning workflows can create bottlenecks between business users and data science teams. For example, platforms like Spark and Hadoop might need to move large amounts of data into GPU processing nodes before they can begin work, and this can take minutes or hours, while restricting accessibility for business users. A high-performance GPU-powered database like Kinetica can accelerate machine learning workloads by eliminating the data movement and bringing the processing directly to the data. In this scenario, results can be returned in seconds, which enables an interactive process.
Machine learning algorithms
Before GPUs supercharged the training of deep neural networks, the implementations were dominated by a variety of algorithms, some of which have been around longer than computers. They still have their place in many use cases because of their simplicity and speed. Many introductory data science courses start by teaching linear regression for the prediction of continuous variables and logistic regression for the prediction of categories. K-means clustering is also a commonly used algorithm for unsupervised learning.
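Linear regression shows why these traditional algorithms are valued for simplicity and speed: for a single input, the best-fitting line has a closed-form solution that needs no iterative training. The sketch below uses ordinary least squares in plain Python. The data points are made up so that the answer is easy to check by eye.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up observations that lie exactly on y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict a continuous value for a new input
print(slope, intercept, prediction)
```

With real, noisy data the line would not pass through every point, but the same two formulas still give the line that minimizes the squared error.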
Deep neural networks, the algorithms behind deep learning, have many of the same applications as most of the traditional machine learning algorithms, but can scale to much more sophisticated and complex use cases. Inference is relatively fast, but training is compute-intensive, often requiring many hours of GPU time.
The following diagram shows a graphical representation of a deep learning model for image recognition. In this example, the input is an image and nodes are neurons that progressively pick out more complex features until they output a code indicating the result.
The image recognition example is called a convolutional neural network (CNN) because each neuron contains image masks and uses a technique called convolution to apply the mask to the image data. There are other types of deep neural networks like recurrent neural networks (RNN) that can work with time series data to make financial forecasts and generic multi-layer networks that work with simple variables.
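To show what applying a mask to image data means, here is a minimal sketch of the sliding-window operation in plain Python. The tiny image and the edge-detecting mask are assumptions for illustration. Strictly speaking, this computes a cross-correlation (the mask is not flipped), which is the operation deep learning frameworks conventionally call convolution.

```python
def convolve2d(image, mask):
    """Slide the mask over the image (no padding, stride 1) and sum the products."""
    mask_h, mask_w = len(mask), len(mask[0])
    out_h = len(image) - mask_h + 1
    out_w = len(image[0]) - mask_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * mask[i][j]
                for i in range(mask_h)
                for j in range(mask_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# A 4x4 image that is dark on the left and bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A hypothetical mask that responds to dark-to-bright vertical edges
mask = [
    [-1, 1],
    [-1, 1],
]

features = convolve2d(image, mask)
for row in features:
    print(row)
```

The output is nonzero only in the column where the brightness changes, which is the sense in which a neuron "picks out" a feature. In a trained CNN the mask values are learned rather than written by hand.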
An important thing to consider is that, unlike many traditional machine learning algorithms, deep neural networks are difficult or impossible to reverse engineer. More to the point, you can’t always determine how an inference is made. This is because the algorithm might populate weights in many thousands of neurons, and find solutions that can’t always be understood by humans. Credit scoring is an example where deep neural networks should not be applied if you want to understand how the score is determined.
Machine learning frameworks
Writing machine learning models from scratch can be tedious. To make implementations easier, frameworks are available that hide complexities and lower the hurdles for data scientists and developers. The following logos belong to some of the more popular machine learning frameworks.
Google, for example, offers a popular framework called TensorFlow that is famous for its ability to support image and speech recognition, and it provides a suite of tools for model visualization in TensorBoard (see below).
TensorFlow was designed to make it easy to train deep neural networks in parallel and on multiple GPUs, but it also supports traditional algorithms. It can work in combination with big data platforms like Hadoop and Spark for massively parallel workloads. In situations where data movement can be a bottleneck, the Kinetica platform uses native TensorFlow integration to bring GPU-accelerated workloads directly to large data sets.
TensorFlow makes an abstraction between the model (called an estimator) and the algorithm (called an optimizer), allowing a user to select from multiple algorithms when training a model. For example, a specialist could write a supervised learning model using simple linear regression as the algorithm, and then compare its accuracy against a deep neural network algorithm.
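The value of separating the model from the training algorithm can be sketched in plain Python. To be clear, this is not TensorFlow's API: the class names, the step method, and the train function are hypothetical, and the example swaps between two simple optimizers for one linear model rather than comparing against a deep neural network.

```python
class GradientDescent:
    """Plain gradient descent: move against the gradient by a fixed step."""
    def __init__(self, lr=0.05):
        self.lr = lr

    def step(self, params, grads):
        return [p - self.lr * g for p, g in zip(params, grads)]

class Momentum:
    """Gradient descent with momentum: keep a running velocity per parameter."""
    def __init__(self, lr=0.05, beta=0.9):
        self.lr, self.beta = lr, beta
        self.velocity = None

    def step(self, params, grads):
        if self.velocity is None:
            self.velocity = [0.0] * len(params)
        self.velocity = [self.beta * v - self.lr * g
                         for v, g in zip(self.velocity, grads)]
        return [p + v for p, v in zip(params, self.velocity)]

def gradients(params, xs, ys):
    """Gradient of mean squared error for the model y = w * x + b."""
    w, b = params
    n = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    return [2 / n * sum(e * x for e, x in zip(errors, xs)), 2 / n * sum(errors)]

def train(optimizer, xs, ys, steps=2000):
    """The model code stays the same; only the optimizer object changes."""
    params = [0.0, 0.0]
    for _ in range(steps):
        params = optimizer.step(params, gradients(params, xs, ys))
    return params

xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]  # made-up data on the line y = 2x + 1
for optimizer in (GradientDescent(), Momentum()):
    w, b = train(optimizer, xs, ys)
    print(type(optimizer).__name__, round(w, 3), round(b, 3))
```

Because the training loop only depends on the step method, a specialist can try a different algorithm without touching the model definition, which is the design idea the paragraph above describes.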
The rise of machine learning has striking parallels to the rise of the Internet. For decades both were studied by university researchers and saw limited commercial use. The Internet, based on a network that was turned on in 1969, came of age in the 1990s and disrupted industries whose incumbents reacted slowly until their businesses were marginalized. Now many of the same companies that rose to prominence with the Internet are leading the adoption of machine learning, while incumbents try to understand its significance and extract value from their data science investments.
Any software project that gives an organization insight into its business requires close collaboration between business users and people with the skills to translate business requirements into code. Most organizations with software investments are familiar with this paradigm. A key difference is that while machine learning requires a definition of the problem, its purpose is to find a solution.
The widespread adoption of machine learning requires at least a black-box level of understanding from business analysts and software developers as they engage with data science teams. It also requires business leaders with a vision of how they will get value from using machine learning to solve problems previously addressed with carefully defined rules. Making successful use of machine learning does not require most people involved to understand the details of how machine learning works. But they need to understand enough to ask the right questions of the data science experts.
Chad Juliano is a senior solutions architect for Kinetica. Previously, Chad was a senior principal consultant for Oracle. Prior to Oracle, he worked as a software engineer at Quorum Business Solutions. Chad also had prior experience at Portal Software. Chad earned a double major in electrical engineering and math from Southern Methodist University in Dallas.
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content.