# C# Matrix Chromosome implementation in Genetic Algorithm

The next version of GPdotNET will be extended with 3 new solvers:

– Traveling Salesman Problem (TSP),
– Assignment problem (AP) and
– Transportation Problem (TP)

More or less the first two solvers are already presented in previous posts. Both TSP and AP can be implemented with vector based representation, which is already posted here. For transportation problem new chromosome representation have to be implemented. Before we go in to the implementation let give short introduction of Transportation problems.

Transportation problem is one of the simplest combinatorial problem. It searches the determination of a minimum cost transportation plan for a single commodity from a number of sources to a number of destinations. It requires the specification of the level of supply at each source, the amount of demand at each destination, and the transportation cost from each source to each destination. Solution is to find the amount to be shipped from each source to each destination in such a way that the total transportation cost have to be minimized.
Let’s assume we have $n$ sources and $k$ destinations. Let’s also assume that amount of supply at source $i$ is $SRC(i)$ and demand at destination $j$ is $DST(j)$. The unit transportation cost between source $i$ and destination $j$ is $CST(i,j)$. Let $x_{ij}$ is the amount transported from source $i$ to destination $j$ transportation problem can be formulated like the following:
Minimize:

$min { \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij} x_{ij}}$

${ \sum_{i=1}^{n} x_{ij}=a_i}, (i=1,...,n)$

${ \sum_{j=1}^{m} x_{ij}=b_j, (j=1,...,m)}$

${ x_{ij}>=0, (i=1,...,n), (j=1,...m)}$

A typical transformation problem is shown in picture below. In this example we have 3 sources and 4 destination. The supply is shown on Supply column, and destination cost is shown on the last row. Total supply and demand is 45. The unit transportation cost is shown on central table.

The optimal solution is shown in picture below. The total cost is 315.

The most natural way to represent this kind of problem is 2 dimensional matrix, because our solution must be presented as picture show above in tabular data.

Theory behind matrix representation chromosome is beyond this blog post, so in the next section it will be shows only C# implementation with some description. If you want more information behind this implementation you can find it in the book: Genetics Algorithm + Data Structure = Evolution Programs which can be see at Google books at pages 187-191.

Implementation of main method: GenerateMatrix:

///
/// Initialize randomly matrix chromosome for transport problem based on book
/// Genetic Algorithm+Data Structure= Evolution programs
///
///r- number of rows (sources)
///c- number of columns (destination)
///src[] - source values
///dest[] - destination values
/// Cost matrix cost(i,j)
internal static int[][] GenerateMatrix(int r, int c, int[] src, int[] dest)
{
//initi values
int tot = r * c;
var lst = new List(tot);

//prepare values for generation
int counter = 0;
var vals = new int[r][];
for (int i = 0; i < r; i++)
{
vals[i] = new int[c];
for (int j = 0; j < c; j++)
{
counter++;
}
}

while (true)
{
//exit from while must be before list is empty
if (lst.Count == 0)
throw new Exception("null");

int q = lst[ind];

int i = (int)Math.Floor((double)(q) / (double)(c));
int j = q % c;

//if element is visited generate again random number
if (vals[i][j] != 0)
continue;

lst.RemoveAt(ind);

int val = Math.Min(src[i], dest[j]);
vals[i][j] = val;
src[i] = src[i] - val;
dest[j] = dest[j] - val;

bool canBreak = true;
for (int k = 0; k < r; k++)
if (src[k] != 0)
{
canBreak = false;
break;
}

if (canBreak == false)
continue;

for (int k = 0; k < c; k++)
if (dest[k] != 0)
{
canBreak = false;
break;
}
//if all sources and destination are zero, generation is finished
if (canBreak)
return vals;
}
}


This is main function for this type of chromosome because all other operations is based on this method.
The second implementation is Crossover.

///
// Crossover based on Book Genetic Algorithm +Data Structure = Evolution Programs.
public void Crossover(IChromosome parent2)
{

GAMChromosome ch2= parent2 as GAMChromosome;
if(ch2==null)
throw new Exception("ch2 cannot be null!");

int[] srcREM1 = new int[rows];
int[] destREM1 = new int[cols];
int[] srcREM2 = new int[rows];
int[] destREM2 = new int[cols];

for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
double m1 = value[i][j] + ch2.value[i][j];
double d = Math.Floor(m1 / 2.0);
int r = (((int)m1) % 2);

value[i][j] = (int)d;
ch2.value[i][j] = (int)d;
srcREM1[i] += r;
destREM1[j] += r;
srcREM2[i] += r;
destREM2[j] += r;
}
}
for (int i = 0; i < rows; i++)
{
srcREM1[i] /= 2;
srcREM2[i] /= 2;
}
for (int j = 0; j < cols; j++)
{
destREM1[j] /= 2;
destREM2[j] /= 2;
}
var mat1 = GenerateMatrix(rows, cols, srcREM1, destREM1);
var mat2 = GenerateMatrix(rows, cols, srcREM2, destREM2);
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
value[i][j] += mat1[i][j];
ch2.value[i][j] += mat2[i][j];
}
}
return;
}


As we can see crossover method contain two calls for GenerateMatrix method which randomly initialize sub matrix chromosome.

Mutation operation also depends of  GenerateMatrix because it randomly choose location which needs to be recreated. The following code contains two Mutate methods where the second one calls GenerateMatrix for subMatrix random generation.

///  Mutation based on Book Genetic Algorithm +Data Structure = Evolution Programs.
public void Mutate()
{
//choose random number of cols and rows
//define array for holding random indexs
var localRows = new int[locRows];
var localCols = new int[locCols];

//generate random source
int counter = 0;
while (true)
{

var found = false;
for (var i = 0; i < localRows.Length; i++)
{
if (localRows[i] == num)
{
found = true;
break;
}
}
if (!found)
{
localRows[counter] = num;
counter++;
}
if (counter  == localRows.Length)
break;
}
//generate random destination
counter = 0;
while (true)
{

var found = false;
for (var i = 0; i < localCols.Length; i++)
{
if (localCols[i] == num)
{
found = true;
break;
}
}
if (!found)
{
localCols[counter] = num;
counter++;
}
if (counter == localCols.Length)
break;
}
//perform mutation
Mutate(locRows, locCols, localRows, localCols, this);

}

internal static int[][] Mutate(int r, int c, int[] rs, int[] cs, GAMChromosome ch)
{
var source = new int[r];
var destination = new int[c];
//calculate Source for random submatrix
for (int i = 0; i < rs.Length; i++)
for (int j = 0; j < cs.Length; j++)
source[i]+= ch.value[rs[i]][cs[j]];

//calculate Destination for random submatrix
for (int i = 0; i < cs.Length; i++)
for (int j = 0; j < rs.Length; j++)
destination[i] += ch.value[rs[j]][cs[i]];

var subMatrix= GAMChromosome.GenerateMatrix(r, c, source, destination);

//merge generated submatrix to matrix
for (int i = 0; i < rs.Length; i++)
for (int j = 0; j < cs.Length; j++)
ch.value[rs[i]][cs[j]] = subMatrix[i][j];

return subMatrix;
}


This was the code implementation around Matrix based Chromosome representation for Genetic Algorithm. The full source code about it will be published with the first beta of GPdotNET v3.0. I can promisse that the first beta will be out for less that month.
Stay tuned

# GPdotNET vNext: Assignment and Transportation problems

GPdotNET vNext

Recently I wrote blog post  about implementation of TSP problem in GPdotNET. I got positive feedback about implementation, but there were people saying if can I implement more problems based on linear programming and optimization. So I have decided to implement Assignment and Transportation problems in GPdotNET. With existing 4,  the next version of GPdotNET 2.5 will contains in total the following problem domains:

1. Modelling discrete data set with GP,

2. Modelling and optimization discrete data set with combination of GP and GA,

3. Modelling Time series data set with GP,

4. Optimization of Analytic function with GA,

5. Traveling Salesman Problem with GA,

6. Assignment Problem with GA,

7. Transportation problem with GA.

In order to implement those problems it has to be implemented two new chromosome types:

1. Vector based GA chromosome for solving TSP and Assigment problem

2. Matrix based GA chromosome for solving Transport problem.

The first one (vector based chromosome type) is already implemented and needs to be modified for accepting different format data, in order to fully support TSP and Assignment problems at the same time. For second type there are several solutions for implementation for Matrix based chromosome, one of my favorite is published in the book “Genetic Algorithms + Data Structures = Evolution Programs”.

I can say this would be an exciting summer. :)

# Function optimization with Genetic Algorithm by using GPdotNET

Content

1. Introduction
2. Analytic function optimization module in GPdotNET
3. Examples of function optimizations
4. C# Implementation behind GPdotNET Optimization module

Introduction

GPdotNET is artificial intelligence tool for applying Genetic Programming and Genetic Algorithm in modeling and optimization of various engineering problems. It is .NET (Mono) application written in C# programming language which can run on both Windows and Linux based OS, or any OS which can run Mono framework. On the other hand GPdotNET is very easy to use. Even if you have no deep knowledge of GP and GA, you can apply those methods in finding solution. The project can be used in modeling any kind of engineering process, which can be described with discrete data. It is also be used in education during teaching students about evolutionary methods, mainly GP and GA. GPdotNET is open source project hosted at http://gpdotnet.codeplex.com

With releasing of  GPdotNET v2 it is also possible to find optimum for any analytic function regardless of independent variables. For example you can find optimum value for an analytically defined function with 2, 5, 10 or 100 independent variables. By using classic methods, function optimization of 3 or more independent variables is very difficult and sometimes impossible. It is also very hard to find optimum value for functions which are relatively complex regardless of number of independent variables.
Because GPdotNET is based on Genetic Algorithm we can find approximated optimum value of any function regardless of the limitation of number of independent variables, or complex definition. This blog post is going to give you a detailed and full description how to use GPdotNET to optimize function. Blog post will also cover C# implementation behind optimization proce by showing representation of Chromosome with real number, as well as Fitness calculation which is based on Genetic Programming tree expression. In this blog post it will also be presented several real world problem of optimization which will be solved with GPdotNET.

# Analitic Function Optimization Module in GPdotNET

When GPdotNET is opened you can choose several predefined and calucalted models from various domain problems, as weel as creating New model among other options. By choosing New model new dialog box appears like picture below.

By choosing Optimization of Analytic Function (see pic above) and pressing OK button, GPdotNET prepares model for optimization and opens 3 tab pages:

1. Analytic function,
2. Settings and
3. Optimize Model.

## Analytic function

By using “Analytic function” tab you can define expression of a function. More information about how to define mathematics expression of analytic function can be found on this blog post.

By using “Analytic definition tool” at the bottom of the page, it is possible to define analytic expression. Expression tree builder generates function in Genetic Programming Expression tree, because GPdotNET fully implements both methods. Sharing features of Genetic programming  in Optimization based Genetic Algorithm is unique and it is implement only in GPdotNET.

When the process of defining function is finished, press Finish button in order to proceed with further actions. Finish button action apply all changes with Optimization Model Tab. So if you have made some changed in function definition, by pressing Finish button changes will be send to optimization tab.
Defining expression of function is relatively simple, but it is still not natural way for defining function, and will be changed in the future. For example on picture 2, you can see Expression tree which represent:

$f(x,y)=y sin{4x}+1.1 x sin{2y}$.

## Setting GA parameters

The second step in optimization is setting Genetic Algorithm parameter which will be used in optimization process. Open the Setting tab and set the main GA parameters, see pic. 3.

To successfully applied GA in the Optimization, it is necessary to define:

1.  population size,
2. probabilities of genetic operators and
3. selection methods.

These parameters are general for all GA and GP models. More information about parameters you can find at http://bhrnjica.net/gpdotnet.

## Optimize model (running optimization)

When GA parameters are defined, we can start with optimization by selecting Optimization model tab. Before run, we have to define constrains for each independent variables. This is only limitation we have to define i  order to start optimization. The picture below shows how to define constrains in 3 steps:

1.  select row by left mouse click,
2. enter min and max value in text boxes
3. Press update button.

Perform these 3 actions for each independent variable defined in the function.

When the process of defining constrains is finished, it is time to run the calculation by pressing Optimize button, from the main toolbar(green button). During optimization process GPdotNET is presenting nice animation of fitness values, as well as showing current best optimal value. The picture above shows the result of optimization process with GPdotNET. It can be seen that the optimal value for this sample is $f_{opt}(9.96)=-100.22$.

# Examples of function optimization

In this topic we are going to calculate optimal value for some functions by using GPdotNET. Zo be prove that the optimal value is correct or very close to correct value we will use Wolfram Alpha or other method.

### Function: x sin(4x)+1.1 x sin(2y)

GP Expression tree looks like the following picture (left size):

Optimal value is found (right above picture) for 0.054 min, at 363 generation of total of 500 generation. Optimal value is f(8.66,9.03)=-18.59.

Here is Wolfram Alpha calculation of the same function. http://www.wolframalpha.com/input/?i=min+x*sin%284*x%29%2B+1.1+*y*+sin%282+*y%29%2C+0%3Cx%3C10%2C0%3Cy%3C10

### Function:  (x^2+x)cos(x),  -10≤x≤10

GP expression tree looks like the following picture (left size):

Optimal value is found for 0.125 min, at 10 generation of total of 500 generation. Optimal value is F(9.62)=-100.22.

Here is Wolfram Alpha calculation of the same function. http://www.wolframalpha.com/input/?i=minimum+%28x%5E2%2Bx%29*cos%28x%29+over+%5B-10%2C10%5D

### Easom’s function fEaso(x1,x2)=-cos(x1)•cos(x2)•exp(-((x1-pi)^2+(x2-pi)^2)), -100<=x(i)<=100, i=1:2.

GP expression tree looks like the following picture (left size):

Optimal value is found for 0.061 min, at 477 generation of total of 500 generation. Optimal value is F(9.62)=-1, for x=y=3.14.

Function can be seen at this MatLab link.

# C# Implementation behind GPdotNET Optimization module

GPdotNET Optimization module is just a part which is incorporated in to GPdotNET Engine. Specific implementation for this module is Chromosome implementation, as well as Fitness function. Chromosome implementation is based on  floating point value instead of classic binary representation. Such a Chromosome contains array of floating point values and each element array represent function independent variable. If the function contains two independent variables (x,y) chromosome implementation will contains array with two floating points. Constrains of chromosome values represent constrains we defined during settings of the optimization process. The following source code listing shows implementation of GANumChrosomome class in GPdotNET:

public class GANumChromosome: IChromosome
{
private double[] val = null;
private float fitness = float.MinValue;
//... rest of implementation
}


When the chromosome is generated array elements get values randomly generated between min and max value defined by function definition. Here is a source code of Generate method.

///
/// Generate values for each represented variable
///
public void Generate(int param = 0)
{
if(val==null)
val = new double[functionSet.GetNumVariables()];
else if (val.Length != functionSet.GetNumVariables())
val = new double[functionSet.GetNumVariables()];

for (int i = 0; i < functionSet.GetNumVariables(); i++)

}


Mutation is accomplish when randomly chosen array element randomly change itc value. Here is a listing:

///
///  Select array element randomly and randomly change itc value
///
public void Mutate()
{
//randomly select array element
//randomly generate value for the selected element
}


Crossover is little bit complicated. It is implemented based on Book Practical Genetic Algoritms see pages 56,57,48,59. Here is an implementation:

///
/// Generate number between 0-1.
/// For each element array of two chromosomes exchange value based on random number.
///
///
public void Crossover(IChromosome ch2)
{
GANumChromosome p = (GANumChromosome)ch2;
double beta;
for (int i = crossoverPoint; i < functionSet.GetNumVariables(); i++)
{
val[i] = val[i] - beta * (val[i] - p.val[i]);
p.val[i] = p.val[i] + beta * (val[i] - p.val[i]);
}
}


Fitness function for Optimization is straightforward, it evaluates each chromosome against tree expression. For minimum the better chromosome is lower value. For maximum better chromosome is the chromosome with higher fitness value. Here is a implementation of Optimizatio Fitness function:

///
/// Evaluates function agains terminals
///
///
///
///
public float Evaluate(IChromosome chromosome, IFunctionSet functionSet)
{
GANumChromosome ch = chromosome as GANumChromosome;
if (ch == null)
return 0;
else
{
//prepare terminals
var term = Globals.gpterminals.SingleTrainingData;
for (int i = 0; i < ch.val.Length; i++)
term[i] = ch.val[i];

var y = functionSet.Evaluate(_funToOptimize, -1);

if (double.IsNaN(y) || double.IsInfinity(y))
y = float.NaN;

//Save output in to output variable
term[term.Length - 1] = y;

if (IsMinimize)
y *= -1;

return (float)y;
}
}


# Summary

We have seen that Function optimization module within GPdotNET is powerful optimization tool. It can find pretty close solution for very complex functions regardless of number of independent variables. Optimization module use Genetic Algorithm method with floating point value chromosome representation described in several books about GA. It is fast, simple and can be used in education as well as in solving real problems. More info about GPdotNET can be found at http://bhrnjica.net/gpdotnet.

# GEALIB – aplikacija za modeliranje i optimizaciju evolucijskim algoritmima

Napomena: Juni 2010. Ovaj tekst predstavlja početak razvoja aplikacije koja je poslije prerasla u GPdotNET open source projekat.
Dio rada iz predanog u maju 2008 godine sa postdiplomskog studija, vezano za aplikaciju za modeliranje i optimizaciju evolucijskim algoritmima.

U pripremi Seminarskih radova na postdiplomskom studiju, za obradu eksperimentalnih podataka procesa bušenja kao i eksplozivne obrade, razvijana je .NET biblioteka napisana u programskom jeziku C# za primjenu evolucijskih algoritama u obradnim procesima.

Aplikacija podržava proizvoljan broj nezavisnih ulaznih varijabli Xi i jednu zavisno promjenjivu izlaznu varijablu Y.

Slika 14: Osnovni Izgled GEALIB aplikacije

#### Učitavanje eksperimentalnih podataka

Početak rada u aplikaciji počinje učitavanjem exsperimentalnih podataka iz datoteke. Aplikacija podržava CSV (Comma Separated Value) extenziju. Učitavanje podataka počinje klikom da dugme „Učitaj…“.

Nakon učitavanja podataka podaci se raspoređuju dijagramski te tabelarno kao na slici 13. U dijagramu su prikazan vrijednosti preko broja pokusa (x osa) i vrijednosti izlazne promjenjive (y- osa).

#### Podešavanje parametara genetskog programiranja

Nakon učitavanja eksperimentalnih podataka, počinje proces podešavanja GP parametara. Klikom da Tab „Podešavanje“, pojavljuje se dijaloški okvir za podašavanje osnovnih parametara GP.

Slika 15: Parametri GP

Osnovni parametri GP su:

1. Veličina populacije – broj hromosoma u populaciji

2. Maksimalna dubina drveta hromosoma pri inicijalizaciji, ukrštanju i mutaciji

3. Vjerojatnost operacija za ukršatanje, mutaciji i reprodukciju

4. Kriterij odabira – najvažniji parametar GP. Pored višestruke regresije GEALIB podržava i ostale kriterije.

Metode selekcije mogu biti: Elite,Rank, Roulet

5. Metoda inicijalizacije hromosoma: Puna, Rastuća i Miješana.

#### Izbor Terimana i Funkcija u GP

Nakon definisanja osnovnih parametara GP, slijedi izbor terminala i funkcija u GP. Naredna slika pokazuje dostupne funkcije u GEALIB.

Slika 16: Definisanje Funkcija i Terminala u GEALIB

Pored skupa nezavisno promjenjivih Xi u Terminale spadaju i slučajno generirane konstante koje se u ovom fazi podašavanja GP definišu. Sa slike se može vidjeti da se konstante mogu generisati iz proizvoljnog intervala, kao broj konstanti koji se generiše.

#### Pokretanje GP

Pokretanje GP započinje definisanje broja evolucija, kroz koje će GP proći dok ne završi proces. Proces se može i prije završiti ako se dobije tačan model. Startanje GP implementirano je na način da se GP pokreće u pozadinskoj niti, dok glavna nit programa ostaje slobodna i korisnik može prelaziti iz tabulatora.

Dodatno je implementirana i opcija nasilnog zaustavljanja GP, koja poslije zaustavljanja izračunava najbolji horomosom koji je izračunat i prikazuje ga.

Slika 17: GEALIB u izračunavanju modela

Tokom izračunavanja modela, desni dijagram prikazuje simulaciju najboljih rješenja dobijenih tokom evolucije. U donjem dijelu je progres kontrola za prikaz statusa GP.

Kada aplikacija postigne tačan rezultat ili maksimalan broj iteracija (evolucija) zaustavlja se GP i prikazje se najbolje dobijeno rješenje.

U tabulatoru „Rezultati“ prikazano je najbolje rješenje kao i tabelarna komparacija eksperimentalnih rezultata i rezulatata dobijenog pomoću GP. U ovoj fazu GP završava svoj proce i počinje proces optimizacije dobijenog modela pomoću Genetskog Algoritma.

Dobijeni model prikazan je u notaciji koji je kompatibilan sa MS Excel, kao i MS Word, tako da se jednostavni kopiranje model može importovati u Excel za daljnju obradu i preračunavanje modela.

Slika 18: Prikaz modela i komaparacija rezulatata

#### Optimizacija GP modela pomoću GA

Nakon dobijanja GP modela nastupa faza optimizacije modela pomoću GA. Donja slika prikazuje interfejs za optimizaciju. Slično kao i kod GP prolazi se proce definisanja parametara GA i pokretanje GA.

Slika 19: Podešavanje parametrara GA za optimizaciju

Aplikacija je testirana na bezbroj primjera kako analitičkih funkcija tako i eksperimantalnih podataka. Sadrži sve opcije i parametre klasične metode genetskog programiranja i optimizacije genetskim algoritmom koji ima mogućnost reprezentacije pomoću binarnog, cjelobrojnog i realnog broja.

Aplikacija radi sa proizvoljnim brojem nezavisnih varijabli i samo jednom zavisno promjenjivom.