At its core, machine learning is an algorithm that combs through a large data set and its corresponding events, looking for patterns that allow it to predict those events in new data. The pairing can be heart rhythms and heart attacks, mass email and identified spam (to catch new spam), or genetic combinations and diseases.

Those are the clearest examples of supervised machine learning, but unsupervised machine learning can find patterns and events that might not be apparent at the outset. Which one you choose depends largely on the characteristics of your particular project. Let’s talk about the trade-offs involved with each type.

Supervised Neural Networks

First, let’s look at what neural networks are supposed to accomplish. Supervised networks take data sources (images, rows, playing card hands) that are pre-labelled (as human faces, names, card hand values) and learn to make that connection for any given input. The goal is to produce reliable solutions to real-world problems in a problem domain, such as scanning land for likely oil or mineral deposits underneath. Many neural network projects have measured data on the inputs to the system, as well as the outputs expected from that data. Our goal, then, is to devise a system that takes those inputs and faithfully reproduces those outputs.
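The idea of learning from labelled inputs can be sketched in a few lines. The example below is a minimal illustration, not a neural network: it swaps in a much simpler nearest-centroid classifier so that it runs with only the Python standard library. The heart-rate measurements, the labels, and the function names are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples, labels):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labelled data: (resting heart rate, irregular beats per minute)
samples = [(60, 0), (65, 1), (70, 0), (95, 8), (100, 10), (105, 9)]
labels = ["normal", "normal", "normal", "at_risk", "at_risk", "at_risk"]

centroids = train_centroids(samples, labels)
print(predict(centroids, (98, 7)))  # a new, unlabelled measurement
print(predict(centroids, (62, 1)))
```

The training step sees both the inputs and the expected outputs; the prediction step sees only a new input. That split is what makes the approach supervised, whatever model sits in the middle.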

In some cases, we have very good data on the behavior of the system we are modeling. We know what is coming in, and we know what the output is. This seems like a situation where the algorithm should produce definite answers, but that is rarely the case.

The problem is that the data can be ambiguous, and sometimes even contradictory. Consider, for example, trying to get the computer to recognize faces based on an internet search for the word ‘face.’ The results will contain a great deal of noise, such as cartoon images, watch faces, and so on. The noisier the data, the harder it will be for the software to learn what a face looks like in a picture. When the software runs in production, the answers may be inaccurate or completely wrong. So not only do we need to create a correct algorithm, we also need to make sure the data is fit for use. Even frontal pictures of human faces might not be good enough if we need to identify profiles (i.e., the human from the side) in production.

Real life is even messier than a straightforward stimulus-response model suggests. Imagine a retail recommendation engine, which is an increasingly common application of machine learning. We may recommend something based on a strong keyword in the search string, but the shopper may have meant something very different from what our engine assumes.

A recommendation engine is supposed to offer shoppers similar products to consider based on their initial selections. It may also suggest products based on what others have purchased. In most cases, these engines offer products that are reasonable approximations of something a buyer may prefer. In other cases, they may be wildly off-base, so we have to determine what constitutes an acceptable solution.
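Recommending products based on what others have purchased can be approximated by counting which items show up together in the same order. The sketch below assumes a hypothetical list of shopping baskets and hypothetical function names; a production engine would use far more data and a more sophisticated model.

```python
from collections import Counter
from itertools import permutations


def build_co_purchases(baskets):
    """Count, for every item, how often each other item shares its basket."""
    co_purchases = {}
    for basket in baskets:
        for item, other in permutations(set(basket), 2):
            co_purchases.setdefault(item, Counter())[other] += 1
    return co_purchases


def recommend(co_purchases, item, top_n=2):
    """Return up to top_n items most often bought together with item."""
    if item not in co_purchases:
        return []  # nothing is known about this item, so recommend nothing
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(co_purchases[item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase history: one list per order
baskets = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["stove", "fuel"],
]

co_purchases = build_co_purchases(baskets)
print(recommend(co_purchases, "tent"))
```

Even this toy version shows why results can be off-base: with only a handful of orders, a single unusual basket is enough to put an unrelated item near the top of the list.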

Unsupervised Networks

On the other hand, with unsupervised neural networks, we don’t know what to label things. Instead of trying to guess names in Facebook from labelled example data, all we have is the example data itself. In many cases, we have no idea what the output is supposed to be. The computer needs to create groupings that a human might label later. This is an entirely different class of problem. We don’t know what the right answer is. Instead, we are attempting to either group or sort, that is, to optimize based on a dependent variable or variables.
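Creating groupings without labels can be illustrated with k-means clustering, one common grouping technique used here as a simple stand-in for an unsupervised network. The points, the choice of two groups, and the naive initialization (the first k points become the starting centers) are all assumptions made to keep the sketch short and deterministic.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly reassigning and re-averaging."""
    centroids = list(points[:k])  # naive start: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(column) / len(cluster) for column in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabelled measurements; no one has said what the groups mean.
points = [(1, 1), (9, 8), (1, 2), (8, 9), (2, 1), (9, 9)]

centroids, clusters = k_means(points, k=2)
for number, cluster in enumerate(clusters):
    print(f"group {number}: {cluster}")
```

The output is just "group 0" and "group 1". Deciding what those groups represent, and whether they are useful, is still left to a human.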

That may mean optimizing for revenue or for some other characteristic such as time or effort. In order to build that network, we need to have a clear and unambiguous measure of the desired output. The training process, in which we feed the network data in hopes of optimizing for that output, needs to be objectively defined so that we can guide the network to a solution.

To be fair, in most cases we don’t care exactly how accurate the network is. Instead, we want something vastly preferable to guessing or random sampling. Neither type of network is likely to provide us with an optimum solution, so we also have to determine whether a solution meets our needs. But both types of networks serve an important purpose in finding a good answer to a complex set of problems, whether in industry, retail, or even fashion.
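One way to check whether a solution is better than guessing is to compare it against a trivial baseline that always predicts the most common label. The sketch below uses made-up spam labels and made-up model predictions; the function names are hypothetical as well.

```python
from collections import Counter


def accuracy(predictions, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predictions, actual)) / len(actual)


def majority_baseline(actual):
    """A 'guessing' strategy: always predict the most common label."""
    most_common_label, _ = Counter(actual).most_common(1)[0]
    return [most_common_label] * len(actual)


# Hypothetical true labels and hypothetical model output for eight emails
actual = ["spam", "ham", "ham", "spam", "ham", "ham", "ham", "spam"]
model_predictions = ["spam", "ham", "ham", "ham", "ham", "ham", "ham", "spam"]

model_score = accuracy(model_predictions, actual)
baseline_score = accuracy(majority_baseline(actual), actual)
print(f"model: {model_score:.3f}, baseline: {baseline_score:.3f}")
```

How large the gap over the baseline has to be before the model "meets our needs" is a business decision rather than something the code can answer.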

