An Intro to Genetic Algorithms


Genetic algorithms are a form of machine learning that is focused on optimizing a particular output or outputs based on successive production of derived equations.  The approach can be useful for optimizing a particular result when no training data are available, and when the optimization isn’t known mathematically.

These algorithms use the combination of selection, recombination, and mutation to evolve a solution to a problem.  At a descriptive level, the intent of a genetic algorithm is to make a set of attempts at solving a problem using randomly selected equations.  The first time through, it’s likely that none of the solutions are very good.  However, some are better than others.  You take two of the better ones, combine their parameters, and generate an entirely new sequence of possible equations, derived from those two.

They’re called genetic algorithms because the process is loosely modeled after the workings of genetic selection – that is, through a combination of heredity, variation, and selection. 

Algorithms are evaluated based on some criteria, and the best algorithms are combined to form a next generation of algorithms, some of which at least should perform better.

A key to propagating a genetic algorithm is its fitness – that is, its ability to make incrementally and successively better predictions.  Fitness is a measure that the system designer develops to reflect the overall success of the application.  It could be to maximize revenue, achieve a high throughput, or some other measure that represents the business or technical goal.

The difference between a genetic algorithm and a neural network is subtle but real.  Neural networks use layers of (usually) relatively simple algorithms that apply transformations to data across multiple layers to model a particular process.  A neural network attempts to replicate often known results based on a training dataset.  Typically the only thing that changes from successive neural network training runs is the algorithm coefficients, although the number of layers or nodes in each layer can also be changed.

Genetic algorithms, on the other hand, search a domain based on the biological principles of genetics.  They apply successive efforts with sets of algorithms based on the previous set in order to optimize the fitness score.  Also, typically a genetic algorithm attempts to optimize; that is, find a relative fitness in the search space.  Searches parallel across the space, so that it doesn’t focus on a local optimization.  It’s possible, even likely, that the fitness function may go down during successive generations before rising again.

This type of algorithm works using the chromosome, which is a range of potential solutions, rather than attempting to optimize parameters as in a neural network.  The chromosome is often represented as a bit string, although it can take any type of numeric form.  It applies the “fitness score,” rather than a known optimization.  The end result is that it has the potential to find a “good enough” solution, rather than an optimal solution.

Consider trying to spell out a particular word or phrase using a genetic algorithm.  You might start with three random sets of letters – two of which contain letters in the desired word, and one which does not.  If you are using correct letters as your fitness function, your algorithm might take the two results with the desired letters, and combine them into new sets of letters.  Based on the information contained in the preceding generation, this has the potential to result in a better fitness score.  Eventually, the algorithm series may converge on the right answer, or at least a “good enough” answer.

You can also introduce mutation into the process; that is, injecting random values into the next generation.  That has the effect of possibly accelerating the process of reaching an optimal result.

Using genetic algorithms can result in reasonable answers by successively applying better prepared algorithms to a problem.  While they may not be used repetitively in general-purpose applications, they can find answers that may not be found in any other way.

Now it’s easier than ever to write your code to run in parallel  - Try Intel® Parallel Studio XE for free for 30 days




Copyright © 2016 IDG Communications, Inc.