What deep learning really means

GPUs in the cloud put the predictive power of deep neural networks within reach of every developer


Besides image recognition, CNNs have been applied to natural language processing, drug discovery, and playing Go.

Natural language processing (NLP) is another major application area for deep learning. In addition to the machine translation problem addressed by Google Translate, major NLP tasks include automatic summarization, co-reference resolution, discourse analysis, morphological segmentation, named entity recognition, natural language generation, natural language understanding, part-of-speech tagging, sentiment analysis, and speech recognition.

In addition to CNNs, NLP tasks are often addressed with recurrent neural networks (RNNs), which include the Long Short-Term Memory (LSTM) model. As I mentioned earlier, in recurrent neural networks, neurons can influence themselves, either directly or indirectly through the next layer. In other words, RNNs can have loops, which gives them the ability to persist some information history when processing sequences -- and language is nothing without sequences. LSTMs are a particularly attractive form of RNN that have a more powerful update equation and a more complicated repeating module structure.
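To make the "loop" concrete, here is a minimal sketch of a vanilla RNN step in NumPy. The dimensions and random weights are illustrative assumptions, and this shows only the basic recurrence (the hidden state feeding back into itself), not the LSTM's gated update:

```python
import numpy as np

# Toy dimensions and random weights -- assumptions for illustration only.
rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1   # input -> hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden: the loop
b_h = np.zeros(hidden_size)

def rnn_step(x, h_prev):
    """One time step: the new hidden state depends on the previous one."""
    return np.tanh(W_xh @ x + W_hh @ h_prev + b_h)

# Process a sequence of 5 inputs; h carries history forward across steps.
h = np.zeros(hidden_size)
for x in rng.standard_normal((5, input_size)):
    h = rnn_step(x, h)

print(h.shape)  # the final hidden state summarizes the whole sequence
```

Because `h` is threaded through every step, information from early inputs can influence the output at the end of the sequence; an LSTM replaces the single `tanh` update with input, forget, and output gates to preserve that information over longer spans.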

Running deep learning

Needless to say, deep CNNs and LSTMs often require serious computing power for training. Remember how the Google Brain team needed a couple thousand GPUs to train the new A.I. version of Google Translate? That's no joke. A training session that takes three hours on one GPU is likely to take 30 hours on a CPU. Also, the kind of GPU matters: For most deep learning packages, you need one or more CUDA-compatible Nvidia GPUs with enough internal memory to run your models.

That may mean you'll want to run your training in the cloud: AWS, Azure, and Bluemix all offer instances with GPUs as of this writing, as will Google early in 2017.

While the biggest cloud GPU instances can cost $14 per hour to run, there are less expensive alternatives. An AWS instance with a single GPU can cost less than $1 per hour, and Azure Batch Shipyard offers deep learning recipes that run your training in a compute pool of NC-series GPU-enabled instances, with the small NC6 instances going for 90 cents an hour.
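The economics above are simple to work out. A minimal sketch, using only the hourly rates quoted in this article (the function and its inputs are illustrative, not a cloud provider's API):

```python
def training_cost(hours, rate_per_hour):
    """Estimated cloud bill for a training run, in dollars."""
    return hours * rate_per_hour

# A 3-hour training run at the rates mentioned above:
nc6_cost = training_cost(3, 0.90)  # small Azure NC6 instance
big_cost = training_cost(3, 14.0)  # biggest cloud GPU instance

print(f"NC6: ${nc6_cost:.2f}, big instance: ${big_cost:.2f}")
```

The point is that a modest GPU instance can finish a few-hour training job for the price of a coffee, while the same job on CPU-only hardware could take roughly ten times as long.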

Yes, you can and should install your deep learning package of choice on your own computer for learning purposes, whether or not it has a suitable GPU. But when it comes time to train models at scale, you probably won't want to limit yourself to the hardware you happen to have on site.

For deeper learning

You can learn a lot about deep learning simply by installing one of the deep learning packages, trying out its samples, and reading its tutorials. For more depth, consider one or more of the following resources:


Copyright © 2017 IDG Communications, Inc.
