What is deep reinforcement learning: The next step in AI and deep learning

Reinforcement learning is well-suited for autonomous decision-making where supervised learning or unsupervised learning techniques alone can’t do the job

Reinforcement learning has traditionally occupied a niche in the world of artificial intelligence, but in the past few years it has started to assume a larger role in many AI initiatives. Its sweet spot is calculating the optimal actions for agents to take in decision scenarios that depend on environmental context.

Using trial-and-error approaches to maximize an algorithmic reward function, reinforcement learning is well suited to many adaptive-control and multiagent automation applications in IT operations management, energy, health care, commerce, finance, and transportation. It's also being used to train the AI that powers both its traditional focus areas (robotics, gaming, and simulation) and a new generation of AI solutions in edge analytics, natural language processing, machine translation, computer vision, and digital assistants.

Reinforcement learning is also fundamental to the development of autonomous edge applications in the internet of things. Much of edge application development—for industrial, transportation, health care, and consumer applications—involves building AI-infused robotics that can operate with varying degrees of contextual autonomy under dynamic environmental circumstances.

How reinforcement learning works

In such application domains, edge devices’ AI brains must rely on reinforcement learning. Lacking a pre-existing “ground truth” training data set, they seek to maximize a cumulative reward function, such as one that scores how closely an assembled component meets the criteria in a spec. This contrasts with how other types of AI learn: supervised learning minimizes an algorithmic loss function with respect to the ground truth data, and unsupervised learning minimizes a distance function among data points.
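The reward-maximizing loop described above can be sketched with tabular Q-learning, one of the simplest reinforcement learning algorithms. Everything in this sketch is an assumption made for illustration: the five-state "chain" environment, the function names step, train, and greedy_policy, and the hyperparameter values. No labeled data is involved; the agent only sees the reward that follows each action.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the goal (hypothetical toy task)
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: reward 1.0 only when the goal state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values by trial and error; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on a tie), else exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Target = immediate reward + discounted best future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-goal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling greedy_policy(train()) should select action 1 (step right) in every non-goal state, because that is the route that maximizes the discounted cumulative reward in this toy setup.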

However, these AI learning methods are not necessarily silos. One of the most interesting AI trends is the convergence of reinforcement learning with supervised and unsupervised learning in more advanced applications. AI developers are blending these approaches in applications for which no single learning method is sufficient.

For example, by itself, supervised learning is useless in the absence of labeled training data, which is often lacking in applications such as autonomous driving, where every split-second environmental circumstance is essentially unlabeled and unique. Likewise, unsupervised learning—which uses cluster analysis to detect patterns in sensor feeds and other complex unlabeled data—is not geared to identifying the optimal action that an intelligent endpoint should take in a real-world decisioning scenario.

What is deep reinforcement learning

Then there’s deep reinforcement learning, a leading-edge technique in which autonomous agents combine reinforcement learning’s trial-and-error algorithms and cumulative-reward functions with deep neural networks. One prominent use is accelerating neural network design itself. These designs are what power many AI applications that depend on supervised and/or unsupervised learning.

Deep reinforcement learning is a core focus area in the automation of AI development and training pipelines. It involves the use of reinforcement learning-driven agents to rapidly explore the performance trade-offs associated with the myriad architectures, node types, connections, hyperparameter settings, and other options available to designers of deep learning, machine learning, and other AI models.

For example, researchers are using deep reinforcement learning to quickly ascertain which of myriad deep-learning convolutional neural network (CNN) architectures might be best suited to various challenges in feature engineering, computer vision, and image classification. The results gained through deep reinforcement learning might then be used by AI tools to autogenerate the optimal CNN, using deep-learning development tools like TensorFlow, MXNet, or PyTorch for that task.
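As a rough illustration of reward-driven architecture search, the sketch below uses an epsilon-greedy multi-armed bandit, a deliberately simple stand-in for the more elaborate agents used in research. The candidate list, the function names, and above all mock_validation_score are hypothetical: a real system would train each candidate CNN in a framework such as TensorFlow or PyTorch and use its measured validation accuracy as the reward.

```python
import random

# Hypothetical search space: (number of conv layers, filters per layer).
CANDIDATES = [(2, 16), (2, 32), (4, 32), (4, 64), (8, 64)]


def mock_validation_score(arch, rng):
    """Stand-in for training a CNN and measuring validation accuracy.

    The formula is invented for illustration: it peaks at 4 layers and
    64 filters, and adds noise to mimic run-to-run variation.
    """
    layers, filters = arch
    base = 0.9 - 0.05 * abs(layers - 4) - 0.002 * abs(filters - 64)
    return base + rng.gauss(0.0, 0.02)


def search(trials=400, epsilon=0.2, seed=0):
    """Epsilon-greedy search; returns (best architecture, trial counts)."""
    rng = random.Random(seed)
    counts = [0] * len(CANDIDATES)
    means = [0.0] * len(CANDIDATES)  # running mean reward per candidate
    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: try a random architecture.
            i = rng.randrange(len(CANDIDATES))
        else:
            # Exploit: re-evaluate the best architecture seen so far.
            i = max(range(len(CANDIDATES)), key=lambda k: means[k])
        reward = mock_validation_score(CANDIDATES[i], rng)
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]  # incremental mean
    best = max(range(len(CANDIDATES)), key=lambda k: means[k])
    return CANDIDATES[best], counts
```

The design choice worth noting is the split between exploring and exploiting: most trials go to the architecture that currently looks best, while a fixed fraction keeps sampling the alternatives so that an early unlucky score does not rule out a good design.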

In that regard, it’s encouraging to see the emergence of open frameworks for reinforcement-learning development and training. As you explore deep reinforcement learning, you’ll probably want to look at reinforcement learning frameworks that leverage, extend, and interface with TensorFlow and other deep-learning and machine-learning modeling tools that have gained broad adoption.

The reinforcement-learning skills that AI developers need

Going forward, AI developers will need to immerse themselves in the wide range of reinforcement learning algorithms implemented in these and other frameworks. They will also need to deepen their understanding of multiagent reinforcement-learning architectures, many of which lean heavily on the established body of game-theory research. Finally, they will need to familiarize themselves with deep reinforcement learning as a tool for identifying security vulnerabilities in computer vision applications, in connection with a technique known as “fuzzing.”

Copyright © 2018 IDG Communications, Inc.