3 ways to get into reinforcement learning

Whether you like theoretical study or want to get your hands dirty, plenty of reinforcement learning resources are out there.

When I was in graduate school in the 1990s, one of my favorite classes was neural networks. Back then, we didn’t have access to TensorFlow, PyTorch, or Keras; we programmed neurons, neural networks, and learning algorithms by hand with the formulas from textbooks. We didn’t have access to cloud computing, and we coded sequential experiments that often ran overnight. There weren’t platforms like Alteryx, Dataiku, SageMaker, or SAS to enable a machine learning proof of concept or manage the end-to-end MLops lifecycles.

I was most interested in reinforcement learning algorithms, and I recall writing hundreds of reward functions to stabilize an inverted pendulum. I never got it working and was never sure whether I coded the algorithms incorrectly, chose less-optimal reward functions, or selected imperfect learning parameters. But today, I can find examples of reinforcement learning applied to the inverted pendulum problem and even the schematics to build one.

Reinforcement learning explained

Reinforcement learning teaches by trial and error. A subject, usually called an agent, operates in an environment with a current state and a set of actions that it can perform. In this case, the subject is an inverted pendulum placed on a cart that can move left or right in a straight line. The cart’s position and velocity, together with the pendulum’s angle and angular velocity, represent the state. The cart can move in only one dimension, either left or right, to balance the pendulum.
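
To make the state and actions concrete, here is a minimal Python sketch of how they might be represented. The names CartPoleState and Action, the field names, and the units are assumptions made for illustration; they don't come from any particular library.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The only two moves available to the cart (hypothetical names)."""
    LEFT = -1
    RIGHT = 1


@dataclass(frozen=True)
class CartPoleState:
    """The four numbers that describe the system at one instant."""
    cart_position: float   # meters from the center of the track (assumed unit)
    cart_velocity: float   # meters per second
    pole_angle: float      # radians from vertical; 0.0 means upright
    pole_velocity: float   # radians per second


# A starting state with the pendulum tilted slightly off vertical.
start = CartPoleState(cart_position=0.0, cart_velocity=0.0,
                      pole_angle=0.05, pole_velocity=0.0)
print(start, [action.name for action in Action])
```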

Instead of programming the cart’s action with a bunch of rules, the cart is given a reward function to score the outcomes based on its actions. As the cart moves, the reward function computes a score, and higher scores are given when the pendulum is upright. A reinforcement learning algorithm uses the reward function to tune a neural network based on the function’s scores.
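
A reward function for this problem can be very short. The sketch below is one plausible choice, not the function I wrote in graduate school: it assumes the pendulum angle is measured in radians from vertical and that the track extends 2.4 meters either side of center, and both the name upright_reward and those limits are illustrative.

```python
import math


def upright_reward(pole_angle: float, cart_position: float,
                   track_half_length: float = 2.4) -> float:
    """Score one moment of a trial; higher means the pendulum is more upright.

    pole_angle is in radians from vertical. track_half_length is an assumed
    limit on how far the cart may travel from the center.
    """
    if abs(cart_position) > track_half_length or abs(pole_angle) > math.pi / 2:
        return 0.0  # the cart ran off the track or the pendulum fell over
    return math.cos(pole_angle)  # 1.0 when perfectly upright, less as it tilts


print(upright_reward(0.0, 0.0), upright_reward(0.5, 0.0))
```

Small changes here matter more than they look. Rewarding only the angle, for example, says nothing about keeping the cart near the center, which is one reason choosing a reward function can take many attempts.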

The initial trials will fail, as the pendulum keeps falling. However, with enough attempts, a well-chosen reward function, and optimally selected tuning parameters, the algorithm learns the correct actions to control the cart and balance the pendulum.
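
The loop of failing, scoring, and adjusting can be sketched in plain Python. To keep it short, this example swaps in a much simpler technique than the one described above: random hill climbing over four weights of a linear policy stands in for the neural network and the reinforcement learning algorithm. The physical constants, failure limits, and function names are assumptions, and the dynamics follow the commonly used textbook cart-pole equations.

```python
import math
import random

GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1   # assumed physical constants
POLE_HALF_LENGTH, PUSH_FORCE, TIME_STEP = 0.5, 10.0, 0.02


def step(state, push_right):
    """Advance the simulated cart and pendulum by one time step."""
    x, x_dot, theta, theta_dot = state
    force = PUSH_FORCE if push_right else -PUSH_FORCE
    total_mass = CART_MASS + POLE_MASS
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + POLE_MASS * POLE_HALF_LENGTH * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass))
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / total_mass
    return (x + TIME_STEP * x_dot, x_dot + TIME_STEP * x_acc,
            theta + TIME_STEP * theta_dot, theta_dot + TIME_STEP * theta_acc)


def run_trial(weights, max_steps=200):
    """Run one attempt and return its score: one point per step survived."""
    state = (0.0, 0.0, 0.05, 0.0)  # start with the pendulum slightly tilted
    score = 0
    for _ in range(max_steps):
        # The policy: push right when the weighted sum of the state is positive.
        push_right = sum(w * s for w, s in zip(weights, state)) > 0
        state = step(state, push_right)
        if abs(state[0]) > 2.4 or abs(state[2]) > 0.21:  # off track or fallen
            break
        score += 1
    return score


def learn(attempts=200, noise=0.5, seed=0):
    """Keep whichever randomly nudged weights earn the highest score."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 0.0, 0.0]
    best_score = run_trial(best)
    history = [best_score]
    for _ in range(attempts):
        candidate = [w + rng.gauss(0.0, noise) for w in best]
        score = run_trial(candidate)
        if score > best_score:
            best, best_score = candidate, score
        history.append(best_score)
    return best, history


weights, history = learn()
print("first attempt:", history[0], "steps; best found:", history[-1], "steps")
```

The first attempt uses all-zero weights, so the cart only ever pushes left and the pendulum falls almost immediately. Later attempts keep any change that scores higher, which is the same try, score, and adjust cycle in miniature.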

Many articles are available to guide you further on the basics of reinforcement learning. You can read overviews of reinforcement learning, learn the basics, jump into its math and algorithms, review research papers, or discover real-world applications.

Getting into more details or experiments will require selecting a programming language, choosing a framework, picking tools, and configuring a cloud environment. I confess that this is an undertaking, so I went looking for opportunities to learn without getting my hands too dirty.

Here’s what I found:

1. Combine work and play with AWS DeepRacer

AWS introduced DeepRacer in November 2018 as the “fastest way to get rolling with machine learning.” In December 2020, they had more than 10,000 competitors and a grand prize that included $10,000 of AWS promotional credits.

Don’t let the competition scare you away, because DeepRacer is a superb learning tool. Your objective is to train the racer to navigate autonomously around a selected racetrack.

When you sign up for DeepRacer, you get access to a simulator where you can select a track, code a reward function, and adjust tuning parameters. There is a default reward function with tuning parameters to start training your racer and evaluating its performance. From there, you’re off to the races to improve your models and tune the algorithms.

You have more than 20 tracks to choose from, and race formats range from simple time trials to head-to-head racing. You can also purchase a physical DeepRacer, load it with your algorithms, and design tracks to run competitive races.

It didn’t take me long to figure out ways to improve the provided reward function. The basic function scores how far the DeepRacer is from the center of the track, with the highest scores when the racer is on the centerline. I improved the function by factoring in the racer’s steering angle, giving it a higher reward when it was steering toward the centerline.
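
Here is a sketch of what that kind of reward function might look like. It is a reconstruction of the idea, not my actual code. It assumes the DeepRacer convention of a reward_function(params) entry point and the input keys track_width, distance_from_center, steering_angle, and is_left_of_center, with a positive steering angle meaning a left turn. The band thresholds and the 1.5 multiplier are illustrative values.

```python
def reward_function(params):
    """Reward staying near the centerline, with a bonus for steering toward it."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]
    steering_angle = params["steering_angle"]  # degrees; assumed positive = left
    is_left_of_center = params["is_left_of_center"]

    # Base score: highest on the centerline, falling off in bands toward the edge.
    if distance_from_center <= 0.1 * track_width:
        reward = 1.0
    elif distance_from_center <= 0.25 * track_width:
        reward = 0.5
    elif distance_from_center <= 0.5 * track_width:
        reward = 0.1
    else:
        reward = 1e-3  # likely off the track

    # Bonus: when off the centerline, steering back toward it earns more.
    off_center = distance_from_center > 0.1 * track_width
    steering_toward_center = (is_left_of_center and steering_angle < 0) or (
        not is_left_of_center and steering_angle > 0)
    if off_center and steering_toward_center:
        reward *= 1.5  # hypothetical multiplier chosen for illustration

    return float(reward)


# A racer drifting left of center scores more when it steers right, back toward the line.
drifting = {"track_width": 1.0, "distance_from_center": 0.2, "is_left_of_center": True}
print(reward_function({**drifting, "steering_angle": -10.0}),
      reward_function({**drifting, "steering_angle": 10.0}))
```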

I felt pretty good that with only my second model and 10 minutes of training, my DeepRacer made it around 26% of the track. Of course, my simple model doesn’t work when you factor in obstacles and other racers. You can go it alone to improve your DeepRacer’s performance, or you can learn from others’ code libraries and racing experiences.

2. Be inspired by recent accomplishments

It isn’t difficult to find real-world examples of business, academic, and government organizations experimenting and succeeding with reinforcement learning. Consider these recent headlines:

Several good websites track news in AI and reinforcement learning, including AI Trends, AI News, AI Business, the MIT News page on AI, ScienceDaily’s page on AI, and the Berkeley AI Research blog.

3. Experiment with code examples

Before embarking on your reinforcement learning journey, you might want to check out coding examples or books, especially when applied to familiar problems. The following options are worth reviewing:

Lastly, if you’re ready to develop reinforcement learning expertise, consider these courses from Coursera, Harvard, MIT, Stanford, Udacity, Udemy, or review these free options.

Given how hard it is to teach and learn by example, reinforcement learning and other techniques that learn without labeled examples are areas of growth and opportunity. Even if you are a couple of steps behind in grasping machine learning techniques, understanding reinforcement learning is a chance to develop expertise while academics, industry, and government evolve the science and algorithms.

Copyright © 2021 IDG Communications, Inc.