A quick-and-dirty way to predict human behavior

The Alternating Least Squares (ALS) technique is fundamental to many machine learning applications. Here’s how it works.

Machine learning and AI technologies are everywhere. One of the top uses is to predict human behavior.

Luckily, people are creatures of habit. Moreover, when given the freedom to do anything they want, most people will do what everyone else is doing (I’m paraphrasing a badly remembered quote). That makes it kind of easy to predict what people will do next, at least statistically.

Imagine you go to a website and start rating things. First you rate a cat picture, then a baseball, and then a Magpul FMG-9. There were also a few things on the same page that you didn’t rate. Assuming that someone else rated those things much as you did, we can probably “guess” how you’d rate the other things.

Take a look at these people and their ratings of things.

[Figure: ALS example, raw data. Credit: IDG]

If you group the most similar users together, you can make pretty decent educated guesses as to how they’d probably rank the things that they didn’t actually rank.

[Figure: ALS example, sorted data. Credit: IDG]

Clearly, User A and User C are pretty similar. User C has ranked Thing 2 where User A hasn’t. User D and User G are very similar, though two things are missing from User D’s rankings that User G ranked. With User E and User F, you can go bidirectional, using each one’s rankings to fill in the other’s blanks. However, there’s less data to measure their similarity than there is for, say, User A and User C, so there is more uncertainty in matching them.
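To make “pretty similar” a bit more concrete, here is a minimal Python sketch. The ratings are invented (the figures’ actual numbers aren’t reproduced here), and the `similarity` function and its scoring formula are my own assumptions, not anything ALS prescribes. It scores two users only on the things both have ranked and also reports how many things that was, since a perfect match on one shared thing is weaker evidence than a perfect match on three.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; users and things are invented for illustration.
ratings = {
    "A": {"thing1": 5, "thing3": 4, "thing4": 1},
    "C": {"thing1": 5, "thing2": 2, "thing3": 4, "thing4": 1},
    "E": {"thing1": 2, "thing5": 5},
    "F": {"thing1": 2, "thing4": 4},
}

def similarity(a, b):
    """Return (score, overlap) for two users' rating dicts.

    score is 1 / (1 + RMS difference) over the things both users rated,
    so identical ratings give 1.0. overlap is how many things they share.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0, 0  # nothing in common, so no evidence either way
    rms = sqrt(sum((a[t] - b[t]) ** 2 for t in shared) / len(shared))
    return 1.0 / (1.0 + rms), len(shared)

for pair in [("A", "C"), ("E", "F")]:
    score, overlap = similarity(ratings[pair[0]], ratings[pair[1]])
    print(pair, "score:", score, "shared things:", overlap)
```

In this made-up data, both pairs agree perfectly on what they share, but A and C share three things while E and F share only one. The overlap count is a crude stand-in for the uncertainty described above.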

If there are a lot of things to be ranked, it is unlikely you’ll find someone else who ranked everything that you did except for the one or two things that you didn’t rank. The resulting table of users and things is mostly blanks. That is called a sparse matrix.
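One common way to handle a mostly blank table is to store only the ratings that exist. The sketch below assumes a plain dictionary keyed by (user, thing); the data and the `density` helper are hypothetical, just enough to show how quickly the filled-in fraction shrinks as the catalog grows.

```python
# Hypothetical data: only the ratings that exist are stored, keyed by (user, thing).
observed = {
    ("A", "thing1"): 5,
    ("A", "thing3"): 4,
    ("C", "thing2"): 2,
    ("G", "thing5"): 3,
}

users = sorted({user for user, _ in observed})
things = sorted({thing for _, thing in observed})

def density(observed, n_users, n_things):
    """Fraction of the user-by-thing grid that actually holds a rating."""
    return len(observed) / (n_users * n_things)

# Small grid: 3 users by 4 things, 4 known ratings.
print(density(observed, len(users), len(things)))

# Same 4 ratings in a grid of 1,000 users by 10,000 things: almost all blanks.
print(density(observed, 1_000, 10_000))
```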

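The algorithm that many recommendations are based on is called Alternating Least Squares (or some form of it). In brief, ALS approximates the ratings table as the product of two smaller tables, one describing users and one describing things, and alternates between holding one fixed and solving for the other with least squares. You fit it on a training set of known ratings or, if you have a lot of users, you can use some of them as the training set to rate the others. It is an ugly formula, and I’m not going to fill InfoWorld with a primer on matrix math. If you want to learn more about ALS in particular, Jamen Long has a great explainer video.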
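For the curious, the “alternating” part can be sketched without any matrix library if you strip the model down to a single hidden factor per user and per thing. Everything here is an assumption made for illustration: the ratings are invented, `fit_als_rank1` and `predict` are hypothetical names, and the choices to subtract the average rating first and to use one factor are simplifications. Real recommenders use several factors and a proper library.

```python
# Hypothetical (user, thing, rating) triples on a 1-5 scale; blanks are simply absent.
observed = [
    ("A", "t1", 5), ("A", "t3", 4),
    ("C", "t1", 5), ("C", "t2", 2), ("C", "t3", 4),
    ("D", "t1", 1), ("D", "t2", 5),
    ("G", "t1", 1), ("G", "t2", 5), ("G", "t3", 2),
]

def fit_als_rank1(observed, reg=0.1, iterations=20):
    """Fit rating ~ mean + user_f[u] * thing_f[t] by alternating least squares."""
    mean = sum(r for _, _, r in observed) / len(observed)
    centered = [(u, t, r - mean) for u, t, r in observed]
    user_f = {u: 1.0 for u, _, _ in centered}
    thing_f = {t: 1.0 for _, t, _ in centered}
    losses = []
    for _ in range(iterations):
        # Step 1: freeze the thing factors and solve a tiny ridge regression per user.
        for u in user_f:
            rows = [(thing_f[t], r) for uu, t, r in centered if uu == u]
            user_f[u] = sum(f * r for f, r in rows) / (reg + sum(f * f for f, _ in rows))
        # Step 2: freeze the user factors and solve the same problem per thing.
        for t in thing_f:
            rows = [(user_f[u], r) for u, tt, r in centered if tt == t]
            thing_f[t] = sum(f * r for f, r in rows) / (reg + sum(f * f for f, _ in rows))
        # Track the regularized squared error on the known ratings.
        fit = sum((r - user_f[u] * thing_f[t]) ** 2 for u, t, r in centered)
        penalty = reg * (sum(f * f for f in user_f.values())
                         + sum(f * f for f in thing_f.values()))
        losses.append(fit + penalty)
    return mean, user_f, thing_f, losses

def predict(model, user, thing):
    """Fill in a blank: average rating plus the product of the two learned factors."""
    mean, user_f, thing_f, _ = model
    return mean + user_f[user] * thing_f[thing]

model = fit_als_rank1(observed)
print("A's guessed rating for t2:", predict(model, "A", "t2"))
print("D's guessed rating for t3:", predict(model, "D", "t3"))
```

Each half-step is an ordinary least squares problem with a closed-form answer, which is why the error on the known ratings should not go up from one pass to the next. In this toy data, User A looks like User C, so A’s guess for t2 lands below the average rating, just as C’s actual rating does.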

While this kind of recommender is harder to do by hand than, say, classification or clustering, you can probably figure it out. You could come up with some kind of estimated similarity between two users and multiply the other user’s ranking by that number, or subtract the “distance” between the users from it, to fill in some of the blanks that I didn’t. Either way, algorithms are just doing what a person would probably do: filling in the blanks.
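Here is one way that by-hand approach might look in Python. Note that this is not ALS; it is a plain similarity-weighted average, which is a slightly tidier cousin of “multiply by the similarity.” The ratings, the `weight` formula, and the `fill_in` function are all assumptions for the sake of the example.

```python
# Hypothetical ratings on a 1-5 scale, invented for illustration.
ratings = {
    "A": {"thing1": 5, "thing3": 4, "thing4": 1},
    "C": {"thing1": 5, "thing2": 2, "thing3": 4, "thing4": 1},
    "G": {"thing1": 1, "thing2": 5, "thing3": 2},
}

def weight(a, b):
    """Closeness of two users: 1 / (1 + mean absolute gap on shared things)."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    gap = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + gap)

def fill_in(ratings, user, thing):
    """Guess a missing rating as a similarity-weighted average of others' ratings."""
    votes = [
        (weight(ratings[user], other), other[thing])
        for name, other in ratings.items()
        if name != user and thing in other
    ]
    total = sum(w for w, _ in votes)
    if total == 0:
        return None  # nobody comparable has rated this thing
    return sum(w * r for w, r in votes) / total

print(fill_in(ratings, "A", "thing2"))
```

With these numbers, User C matches User A exactly and gets a weight of 1.0, while User G gets 0.25, so the guess for A’s missing rating comes out at 2.6, much closer to C’s 2 than to G’s 5.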

Copyright © 2018 IDG Communications, Inc.