Constructing Effective Neural Networks


Many commercial artificial intelligence and machine learning systems today use neural networks as their decision-making engine.  Neural networks pass input data through layers of algebraic equations to produce accurate results in problem domains that require analysis and judgment.  Input data that describes the intended behavior of the system is known as training data; it is used to adjust the parameters of the network so that each subsequent training run produces incrementally better results.
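To make "layers of algebraic equations" concrete, here is a minimal sketch of a forward pass through a tiny network.  The layer sizes, weights, and the choice of a sigmoid activation are assumptions picked for illustration only; a real network would learn its weights from training data.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One layer: each node takes a weighted sum of the inputs, adds a bias,
    and passes the result through the activation function."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output.
# These weights are made up; training would adjust them.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = network_forward([1.0, 2.0], layers)
print(output)
```

The weights and biases are the parameters that training adjusts; the structure of the calculation stays the same.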

Neural networks can be used for supervised or unsupervised learning.  Supervised learning occurs when we have training input data with known and accurate outputs, so that we can train the network to converge on the known result for a given set of inputs.  In unsupervised learning, the training data has no known outputs; we try to converge on an optimal result without necessarily knowing what that result is.  In unsupervised learning, we rarely if ever reach a mathematical optimum, but we attempt to reach a result that is quantifiably better than random chance.

Training the Network

What does it mean to train a network?  For a supervised network, you have known inputs and known results.  You feed the known inputs into the network and compare its output to the known outputs.  This means that to even start a machine learning application, you have to have quantitative data on its expected behavior.

The first time the data runs through the network, the results probably won’t be that good.  The network has no prior knowledge of the problem domain or of how to produce reasonable outputs.  But neural networks are designed to adjust the parameters in their equations based on the difference between actual and expected outputs, so the next time the input data cycles through the network, the results should be incrementally better.  And so on through the training process.
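The cycle described above can be sketched with the simplest possible "network": a single node computing w * x + b.  This is an illustration under assumptions, not what any particular product does; the update rule is plain gradient descent on squared error, and the training samples, learning rate, and epoch count are invented for the example.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=200):
    """Repeatedly cycle the training data through one node, nudging its
    parameters toward the expected outputs after every sample."""
    w, b = 0.0, 0.0           # no prior knowledge of the problem
    history = []              # mean squared error seen in each epoch
    for _ in range(epochs):
        total_sq_error = 0.0
        for x, expected in samples:
            actual = w * x + b
            error = actual - expected
            # Move each parameter a small step in the direction that
            # reduces the squared error for this sample.
            w -= learning_rate * error * x
            b -= learning_rate * error
            total_sq_error += error * error
        history.append(total_sq_error / len(samples))
    return w, b, history

# Made-up training data that follows expected = 2 * x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, history = train_linear_neuron(samples)
print("first epoch error:", history[0])
print("last epoch error: ", history[-1])
print("learned parameters:", w, b)
```

The first pass produces poor results because the parameters start at zero; each later pass starts from the parameters left by the previous one, which is why the error shrinks.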

By continually cycling the data through the network, you hope it will converge on an acceptable solution.  The solution won’t be perfect, which means that you have to define what is acceptable before you go down this path.  If the network doesn’t converge on a reasonable solution set, you may have to start all over again with your network architecture.
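One way to write down "define what is acceptable before you go down this path" is as a stopping rule.  The sketch below is hypothetical: train_until_acceptable and make_fake_trainer are names invented here, and the fake trainer simply halves its error each epoch as a stand-in for a real training pass.

```python
def train_until_acceptable(train_one_epoch, acceptable_error, max_epochs):
    """Run training epochs until the error is acceptable or the budget runs out.

    train_one_epoch is any callable that performs one pass over the training
    data and returns the resulting error.
    Returns (converged, epochs_run, last_error).
    """
    last_error = float("inf")
    for epoch in range(1, max_epochs + 1):
        last_error = train_one_epoch()
        if last_error <= acceptable_error:
            return True, epoch, last_error
    return False, max_epochs, last_error

def make_fake_trainer(start_error):
    """Stand-in for real training: the error halves on every epoch."""
    state = {"error": start_error}
    def train_one_epoch():
        state["error"] /= 2.0
        return state["error"]
    return train_one_epoch

converged, epochs_run, last_error = train_until_acceptable(
    make_fake_trainer(1.0), acceptable_error=0.01, max_epochs=100
)
print(converged, epochs_run, last_error)
```

If the function returns with converged set to False, that is the signal to revisit the network architecture rather than to keep training.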

I say hopefully because the design of your network has a lot to do with the outcome.  The algorithms selected and the number of network layers and nodes all affect the results.  In many cases, you can design a network that closely tracks the training data but is too specific to be sufficiently accurate as a production system, a problem known as overfitting.
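A common way to spot a model that tracks its training data too closely is to hold some data back and compare errors.  The sketch below uses two stand-in models rather than neural networks, to keep it short: a straight-line fit, and a deliberately over-specific "memorizer" that answers with the output of the nearest training input.  The data is synthetic and the noise level is an assumption.

```python
import random

def mean_squared_error(predict, samples):
    return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)

def fit_line(samples):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(samples):
    """Over-specific model: return the output of the closest training input."""
    return lambda x: min(samples, key=lambda s: abs(s[0] - x))[1]

# Synthetic data: y = 2x plus random noise (an assumption for illustration).
rng = random.Random(0)

def noisy_sample():
    x = rng.uniform(0.0, 10.0)
    return x, 2.0 * x + rng.gauss(0.0, 1.0)

training = [noisy_sample() for _ in range(300)]
held_out = [noisy_sample() for _ in range(300)]   # never used for fitting

results = {}
for name, fit in (("line", fit_line), ("memorizer", fit_memorizer)):
    model = fit(training)
    results[name] = (
        mean_squared_error(model, training),
        mean_squared_error(model, held_out),
    )
    print(name, "training error:", results[name][0],
          "held-out error:", results[name][1])
```

The memorizer reproduces its training data exactly, yet does worse on data it has not seen.  A large gap between training error and held-out error is the warning sign to look for.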

Consider the design of an electronic wind sensor that determines wind speed and direction from the cooling of its filaments.  The training data pairs precisely measured wind speeds and directions with the amount the filaments cooled.  From this data, the neural network attempts to reproduce the exact wind speed and direction.

An unsupervised network is similar, except that you don’t have known results.  Instead, you are trying to optimize a particular output.  Depending on the application, you may be trying to maximize revenue, minimize time expended, or improve some other measurable quantity.  You almost certainly don’t know what the optimal result is, but you keep trying to better your last result.  When the results stop improving, your network is complete.
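The "keep trying to better your last result" loop can be sketched as a simple random search.  This is an illustration of the stopping idea only, not a neural network: the revenue function is a made-up curve with its peak at a price of 4.0, and the step size and patience values are arbitrary assumptions.

```python
import random

def improve_until_stalled(objective, start, step_size=0.1, patience=200, seed=0):
    """Try small random changes, keep any that score better, and stop once
    `patience` attempts in a row have failed to improve on the best result."""
    rng = random.Random(seed)
    best = start
    best_score = objective(best)
    stalled = 0
    while stalled < patience:
        candidate = best + rng.uniform(-step_size, step_size)
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
            stalled = 0            # progress made, reset the counter
        else:
            stalled += 1
    return best, best_score

def revenue(price):
    """Hypothetical revenue curve; the search does not know where it peaks."""
    return 16.0 - (price - 4.0) ** 2

best_price, best_revenue = improve_until_stalled(revenue, start=1.0)
print(best_price, best_revenue)
```

The search never learns that 4.0 is the optimum.  It only knows that it has stopped finding anything better, which is the completion condition described above.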

Designing a Neural Network

Designing an effective neural network is often a matter of experience and instinct.  The number of input nodes, the number of transition (hidden) layers and the number of nodes in each, and the algebraic transformations used are all subject to the discretion of the architect.  There are no universally correct choices, only guidelines that may not always be useful in building the network.

There are a few design considerations in building an effective network.  First, the network shouldn’t be overly complex.  You may think that a network with several layers and a large number of nodes helps you fit your training data more accurately.  That may be the case, but remember that production use involves more than just the training data.

Especially if your application encounters noisy data in production, the network can’t be tuned to match only the training data extremely closely.  Instead, you want a more general network that can produce reasonably accurate results on data it has not seen.
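A quick way to get a feel for complexity is to count the adjustable parameters an architecture implies.  The sketch below assumes fully connected layers with one bias per node; the two layer layouts are hypothetical, loosely modeled on a sensor with 4 inputs and 2 outputs.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the node count of every layer, input layer first.
    Each node has one weight per node in the previous layer, plus one bias.
    """
    return sum(
        (n_in + 1) * n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical layouts: 4 inputs and 2 outputs in both cases.
small = count_parameters([4, 8, 2])            # one hidden layer of 8 nodes
large = count_parameters([4, 64, 64, 64, 2])   # three hidden layers of 64 nodes
print(small, large)   # 58 8770
```

The larger layout has far more parameters available to fit quirks and noise in the training data, which is exactly the risk described above when the training set is small.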

It’s also important to note that the designer rarely has control over the training process.  Most neural network software does its own analysis of the actual versus expected results, and adjusts the algebraic parameters based on optimization algorithms built into the software.

Supporting Software Makes a Difference

Commercial and open source software provide some significant advantages to architects and developers building machine learning systems with neural networks.  In most cases, it is time-consuming and unnecessary to code your own algorithms, optimizations, nodes, and layers into a neural network.  Unless you have a unique problem domain, there is software to accelerate building an intelligent system.

First, libraries are available that assist developers with algorithm selection, harnesses for testing and training, and execution acceleration.  Intel Data Analytics Acceleration Library (Intel DAAL), as well as enhancements to the Intel Math Kernel Library (Intel MKL), coupled with the Xeon Phi processor, let you execute both training runs and actual production code much faster than on most processors.  Faster training and production runs mean that a machine learning system can evaluate more data, and respond more quickly and with more accuracy, than with conventional software and systems.

Second, both commercial and open source neural network solutions are available to quickly design neural network systems, from algorithm and initial parameter selection through to building software around the network.  Products such as Intel Parallel Studio make it possible to integrate all of these features, as well as the libraries, into a single development and build environment.  Implementing machine learning systems just got significantly easier.

Design, development, test, and production deployment of machine learning systems don’t have to be slow and complex.  Using the latest software from Intel will accelerate your efforts and result in faster and more accurate solutions.
