**CON**

**PRO**

**?**

Introduction

A GA is a

software algorithm

that mimicks the process of

natural selection

What are Genetic Algorithms (GA)?

Terminology

How do GAs work?

**Genetic Algorithms**

Example: The Knapsack Problem

Implementation in Python

Pros and Cons

**Istvan Boerner Horvath, April 2014**

**Solving optimization problems with**

For many real-world optimization problems there are

no efficient software algorithms

Often, only a

brute-force method

can be used

For

combinatorial problems

, it means that all possible combinations have to be tested

With a large number of combinations this can

take a lot of time

or is even not feasible on today's hardware

And this is where

genetic algorithms

come into play

It uses the concepts of

Selection

Crossover

Mutation

to find useful (but not always the best) solutions to optimization and search problems

History

Applications

Genetic

Algorithms

Bioinformatics

Computational Science

Engineering

Economics

Manufacturing

Physics

Selection

Time needed to solve the knapsack problem (combinatorial)

Challenge: from a set of items with various masses and values,

select the ones that

add up to the highest overall value

,

and at the same time

do not exceed the weight limit

of the knapsack

Population of candidate

solutions (individuals)

Next generation

of individuals

Implementation in Python

For every item there are only

two states

: included or not included in the knapsack

In most programming languages this can be encoded as an integer where the bits represent the states. In Python, I chose a

list

of 0s and 1s, as Python has powerful methods for handling lists

This is the

genome

of an individual, where each item is represented by a number in the list: 1 for included, 0 for not included

[0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, ... ]

To create a

starting population

, another list is filled with all individual's (randomly initialized) genomes

A validation function makes sure that each genome is a

valid solution

, ie. that the included items weigh less than the knapsack is able to carry

A single valid solution of an optimization problem is called

individual

An individual's genetic material is called

genome

or

chromosome

. It stores the solution as an encoded set of values

A group of individuals is called a

population

Evolution

The

size of the starting population

depends on the specific problem - often a size of 32 to 100 individuals is enough, but sometimes more than 1000 individuals are necessary to get good results

Now that we have a starting population, the GA computes

each individual's fitness

and saves the fittest as the best solution

Fitness

Calculation

The fitness is simply the

value

of all items for a single solution

A

fitness of 0

means that the weight of the items exceeds the maximum carrying weight

Selection

Different methods can be used, I chose

the

tournament selection

:

Two individuals are

randomly selected

and

of those two the fitter one is used

A configurable percentage of individuals is furthermore selected from the

"elite"

population ("Elitism")

Crossover

Mutation

Termination

A GA usually

terminates

, if

the

best solution

of a population does not improve over a number of iterations, or

a

maximum number of iterations

has been reached

Graph showing the development of the

best solution (green line) and the

mean fitness value of the population (blue line)

Used where brute-force methods are

not feasible

...or other methods are simply

not available

Can be applied to a

lot of problem domains

Scales well

on parallel systems

Not guaranteed

to find the global optimum

The quality of a result is often

hard to validate

Sometimes it is

difficult to find an encoding

and a good fitness function

Must be

fine-tuned

to get good results

Genomes of the selected mother and father individuals are

combined

with an adjustable crossover probability, using the

Single Point Crossover

method:

This is then

repeated

to get a "mother" and a "father"

The children's genomes are then

mutated

with a certain probability, by inverting a randomly chosen number in the genome

(0 to 1 or 1 to 0):

Note: for other problems

different mutation methods

might be more suitable

0, 1, 1, 0,

0

, 0, 1, 1, 0, ...

0, 1, 1, 0,

1

, 0, 1, 1, 0, ...

Number of items / possible combinations

Brute force algorithm

Genetic algorithm

2 = 4.2 * 10 combinations

32 items /

9

64 items /

2 = 1.8 * 10 combinations

19

32

64

Based on C++ implementation using OpenMP on a 2.4 GHz quad-core processor

0.15 - 0.3 secs

128 secs

17,000 years

0.5 - 1 secs

Genetic algorithms are based on works of

Ingo Rechenberg

and

Hans-Paul Schwefel

(late 1960s / early 1970s)

They became popular through the work of

John Holland

in the 1970s

The

First International Conference on Genetic Algorithms

was held in Pittsburgh, Pennsylvania

in 1985

Fitter individuals have a

higher chance of survival

and thus a higher chance to pass on their genetic material

Individuals from the current population are

selected based on their fitness

The fitness indicates

how well an individual is adapted

to the environment

Crossover

Mutation

The genetic information of two individuals is

combined

Chromosomes are

randomly modified

before the genetic information is passed on to the offspring

Fitness

measures the quality of a solution