Linkage Learning Genetic Algorithm In C++ PDF

Title Linkage Learning Genetic Algorithm In C++
Author Fernando Lobo
Pages 11
File Size 202.3 KB
File Type PDF
Total Downloads 51
Total Views 165

Summary

Linkage Learning Genetic Algorithm in C++ Fernando G. Lobo Georges R. Harik David E. Goldberg IlliGAL Report No. 98006 April 1998 Illinois Genetic Algorithms Laboratory Department of General Engineering University of Illinois at Urbana-Champaign 117 Transportation Building 104 South Mathews Avenue U...


Description

Linkage Learning Genetic Algorithm in C++ Fernando G. Lobo Georges R. Harik David E. Goldberg

IlliGAL Report No. 98006 April 1998

Illinois Genetic Algorithms Laboratory Department of General Engineering University of Illinois at Urbana-Champaign 117 Transportation Building 104 South Mathews Avenue Urbana, IL 61801

Linkage Learning Genetic Algorithm in C++ Fernando G. Lobo



, Georges R. Harik

y

, David E. Goldberg

flobo,gharik,[email protected]

Illinois Genetic Algorithms Laboratory Department of General Engineering University of Illinois at Urbana-Champaign April 25, 1998

1 Introduction This report tells you how to download, compile, and run the linkage learning genetic algorithm C++ code that is available at the IlliGAL web site. It also explains how to change the code so that you can try it on your own problems. The LLGA was invented by Georges Harik during his dissertation work (Harik, 1997) under the guidance of Prof. David Goldberg. You can also nd a description of the LLGA in (Lobo, Deb, Goldberg, Harik, & Wang, 1998).

2 How to download the code The code is available from ftp:==ftp-illigal.ge.uiuc.edu=pub=src=LLGA=Cpp=LLGA.tar.Z. After downloading it, uncompress and untar the le by typing: uncompress LLGA.tar.Z tar xvf LLGA.tar

At this point you should have in your directory the following les: LLGA.tar Makefile README chromosome.cpp chromosome.hpp gene.cpp

gene.hpp geneArray.cpp geneArray.hpp inputfile1 inputfile2 inputfile3

inputfile4 inputfile5 llga.cpp llga.hpp llga_io.cpp llga_io.hpp

objfunc.cpp objfunc.hpp population.cpp population.hpp random.c random.h

util.cpp util.hpp

3 How to compile the code We assume you have a C++ compiler properly installed on your computer. You shouldn't have any problems compiling the code on computers running a UNIX-like operating system. It was tested  y

Fernando Lobo is a visiting student from \Grupo de Analise de Sistemas Ambientais", FCT/UNL, Portugal. Georges Harik is currently at Silicon Graphics, Mountain View, CA.

1

on PCs running Linux, IBM RISC, Sun Sparc, and HP9000 computers. We used the GNU C++ compiler version 2.7.2.1-2. To compile the code, type make on the UNIX prompt. After that, you should have an executable le called llga.

4 How to run the code The executable llga needs two arguments: an input le and an output le. The LLGA reads its parameters from the input le and stores the results of the run in 4 output les with the following extensions: .txt, .convergence, .avgLinkage, .maxLinkage. These les contain information about the LLGA run. Five di erent input les are provided as examples with the distribution of the code. The next section gives an example of an input le and guides you through a typical LLGA run.

5 The input le In your directory you should have a le called inputfile1. The contents of this le (along with the line numbers) is shown below. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

# # inputfile1: a sample input file for the LLGA. # This is the 19 exponentially scaled BB problem that is described # in Georges Harik's dissertation (pp 89-90). # Each BB is a 4-bit trap function. # The signal is 0.4 (global is 1, deceptive is 0.6). # The BBs are scaled by a factor of 7. # BEGIN building_blocks: 19 1-19 trap 4 0.4 7 coding_genes: 76 noncoding_genes: 150 popsize: 1000 selection_operator: tournament_with_replacement selection_rate: 4 pcross: 1.0 stop_criteria: gen= 350 seed: 0.37234535 report_population: off report_best_individual: off END

The format of the input le is designed for handling problems that are completely separable, i.e., problems where the tness function is the sum of a number of independent sub-functions (building blocks). In sub-section 6.2 we explain how you can run the llga on an arbitrary function. 2

The llga skips all the lines until it reaches the word BEGIN (line 9 in the example above). Then it starts reading the parameters in a prede ned order. The program doesn't do any fancy parsing on the input le. This means that after the word BEGIN, YOU SHOULDN'T CHANGE THE ORDER OF THE LINES IN THE INPUT FILE because otherwise the llga will get totally confused. The input le is straightforward to understand. Below is an explanation of each of its lines:

line 10 says that the test problem has 19 building blocks (BBs). line 11 speci es the test problem. It says that BB 1 through BB 19 are trap functions of size 4 bits each. The di erence between the best and second best strings (signal) is 0.4 and the BBs are scaled by a factor of 7. This means that the tness contribution of the BBs will be

1, 7, 49, . . . Besides trap, you can also use tmmp, the truncated multimodal problem described in Harik's dissertation. Figure 1 and 2 show the de nition of these two functions. line 12 says that the problem length is 76. This is equivalent to the chromosome length in a SGA. line 13 says that the number of noncoding genes (introns) is 150. line 14 says that the population size is 1000. line 15 says that the selection operator is tournament selection with replacement. This parameter can be either set to tournament with replacement or tournament without replacement. line 16 says that the tournament size is 4. line 17 says that the probability of crossover is 1. line 18 says that the LLGA stops when it reaches generation 350. Besides the gen= x option, you can also choose the all or none BBs option. With this last option, the LLGA would stop when all the BBs get fully propagated or extinct from the population. line 19 says that 0.37234535 is the seed to initialize the random number generator. This must be a number between 0 and 1. line 20 says that we don't want to send the population to the .txt le at the end of each generation. You can set this option on or o . If set to on you should be careful not to run a large problem because otherwise the report le will become huge. line 21 says that we don't want to send the best individual to the .txt le at the end of each generation. You can set this option on or o . Now you're ready to go ahead and run the LLGA on this problem. Type llga inputfile1 out. Population statistics are displayed on the screen at the end of each generation. The same information is also sent to the le out.txt. Three other les are generated with data that is in an appropriate format to be read by software packages such as Gnuplot or Mathematica. These 3 les contain information about convergence, maximum linkage, and average linkage of each BB throughout the LLGA run. The le names are out.convergence, out.maxLinkage and out.avgLinkage.

3

1

fitness

1-signal

0 0

k-1

k

k-1

k

unitation (number of ones)

Figure 1: A k-bit trap function.

1

fitness

1-signal

0 0

k/2 unitation (number of ones)

Figure 2: A k-bit tmmp function.

4

5.1 Additional examples

The other sample input les that are distributed with the code are shown below. Have a look at them to see how you can easily create di erent kinds of problems. # # inputfile2: a sample input file for the LLGA. # 10 uniformly scaled BBs. # Each is a 4-bit trap function with deceptive-to-optimal ratio of 0.8 # BEGIN building_blocks: 10 1-10 trap 4 0.2 1 coding_genes: 40 noncoding_genes: 100 popsize: 300 selection_operator: tournament_with_replacement selection_rate: 4 pcross: 1.0 stop_criteria: all_or_none_BBs seed: 0.37234535 report_population: off report_best_individual: off END

# # inputfile3: a sample input file for the LLGA. # This is an experiment with a single BB. # The BB is a 6-bit truncated multimodal problem. # The optimum has fitness 1. The local optimum has fitness 1-0.2 = 0.8 # BEGIN building_blocks: 1 1 tmmp 6 0.2 1 coding_genes: 6 noncoding_genes: 100 popsize: 600 selection_operator: tournament_without_replacement selection_rate: 6 pcross: 1.0 stop_criteria: gen= 50 seed: 0.37234535 report_population: off report_best_individual: on END

5

# # inputfile4: a sample input file for the LLGA. # This is an experiment with a single 4-bit trap function. # The optimum has fitness 1. The deceptive has fitness 1-0.2 = 0.8 # BEGIN building_blocks: 1 1 trap 4 0.2 1 coding_genes: 4 noncoding_genes: 100 popsize: 100 selection_operator: tournament_with_replacement selection_rate: 4 pcross: 1.0 stop_criteria: gen= 40 seed: 0.37234535 report_population: off report_best_individual: on END

# # inputfile5: a sample input file for the LLGA. # This is an experiment with BBs of mixed size and scale. # 5 BBs, each a 4-bit trap function (exponentially scaled -- 1,2,4,8,16). # 1 6-bit tmmp function (scaled by a factor of 60). # 10 uniformly scaled BBs, each a 1-bit trap (i.e. 10 bit onemax) # BEGIN building_blocks: 16 1-5 trap 4 0.2 2 6 tmmp 6 0.2 60 7-16 trap 1 1 1 coding_genes: 36 noncoding_genes: 100 popsize: 600 selection_operator: tournament_with_replacement selection_rate: 6 pcross: 1.0 stop_criteria: all_or_none_BBs seed: 0.37234535 report_population: off report_best_individual: on END

6

We suggest that you run all the examples shown above. In particular try running the llga on inputfile4. This consists of a single 4-bit BB. At the end of the run look at the .txt le and see the linkage being formed. You will something like this:

At generation 0: - - - (0,1) - - - - - - - - - - - - - - - - - - - - - (1,1) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (3,1) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (0,0) - - - (2,1) - - - - - - - - - - - - - - - (2,0) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (1,0) - - - - - - - - - - - - - - - - - - - - - - - - - (3,0) - - - - - - - - - - - - - - - - - - - - - - - - - - (0,0) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (1,0) (3,0) - - - - - - - - - - - (2,0) - - - -

At generation 10: ------------------------------------------------------------------ - - - - - - - - - - - - - - (0,1) - - (3,1) - - - - - - (1,1) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (2,1) - - - - - - - - - - (1,0) - (3,0) - - - - (0,0) - - - - - - - - (2,0) - - - - - - - - - - - - - (1,0) - - - - - (3,0) - - (0,0) - - - - - (2,0) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ----------------------------------------------------

At generation 20: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (1,1) - - - - (0,1) (3,1) (2,1) - - (2,0) - - (1,0) (0,0) - (3,0) - - - - - - - - - - - - - - - - - - - - - - (2,0) - (1,0) - (3,0) (0,0) - - - - - - - - - ---------------------------------------------------------------------------------------------------------------------------------------

And at generation 40: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - (0,1) (3,1) (1,1) (2,1) (1,0) (3,0) (0,0) (2,0) (1,0) (3,0) (2,0) (0,0) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------The '-' represents a noncoding gene.

6 About the C++ code The implementation of the LLGA doesn't use advanced features of the C++ language such as templates and inheritance. This means that you don't need to be a C++ expert in order to modify the code. A basic knowledge of the language should be sucient. Next, we give a brief description of the source les. 7

6.1 Brief description of the source les Each .cpp le has a corresponding .hpp le. The *.hpp are the header les and contain the de nitions of the various classes. The *.cpp les contain the actual implementation.

gene.cpp contains the implementation of class gene. A gene has a locus and an allele. geneArray.cpp contains the implementation of class array of genes. chromosome.cpp contains the implementation of class chromosome. A chromosome is stored in

an array of genes. The crossover operator and the linkage calculations for each BB are done here (Note: for counting the number of BBs and computing their linkages, we always assume that a BB is the string '111. . . 1'). population.cpp contains the implementation of class population. A population is an array of chromosomes. Selection operators, population statistics, and stop criterias are implemented here. objfunc.cpp contains the code for the test functions trap and tmmp. If you want to add new test functions, add them here. util.cpp contains utility functions and procedures. llga io.cpp contains input and output procedures for reading the LLGA parameters and for reporting the LLGA results to the output les. random.c is the random number generator. It is a C-translation from Goldberg's SGA random number generator in Pascal. llga.cpp contains the main() function which has the actual LLGA loop. 6.2 How to plug-in other test functions The input le is especially designed for arti cially constructed problems where the tness function is the sum of a number of independent sub-functions (completely separable problems). You might want to add other kinds of sub-functions in addition to trap and tmmp. You might also want to run the llga on an arbitrary problem, i.e., a problem where you don't know where the BBs are.

6.2.1 Adding a completely separable problem If you want to add a new sub-function called myfunc, this is what you have to do: 1. Go to line 32 of le llga io.cpp and insert the following entry into the testfunction table: f"myfunc", "this is my function", &myfuncg.

2. Edit the le objfunc.cpp and write the code for myfunc(). It must have the same parameters as trap() and tmmp(). Have a look at how these are implemented. Don't forget to also declare the name of myfunc in the le objfunc.hpp.

8

6.2.2 Adding an arbitrary problem If you want to run the llga on an arbitrary problem, i.e., a problem where you don't know where the BBs are, then do the following: 1. Go to line 32 of le llga io.cpp and insert the following entry into the testfunction table: f"myfunc", "this is my function", &myfuncg. 2. Edit the le objfunc.cpp and rewrite the code for objfunc(). In the input le specify that the problem has a single BB with the same size as the problem length (see example below). Of course the linkage and convergence of the BB will be meaningless in this case. In the input le, specify the objective function like: building_blocks: coding_genes: ... ...

1 1 myfunc 20 ... ...

20 999 999

7 Important implementation detail There is an important implementation detail that is not very clear in Harik's dissertation. 

During crossover, we have to make sure that the cross points fall into a noncoding gene (intron). If it doesn't, we toss another random number. The same thing applies for the grafting point.

8 Final comments This code is distributed for research purposes. It has no warranty implied or given, and the authors assumes no liability for damage resulting from its use or misuse. If you have any comments or nd any bugs please send an email to [email protected].

Acknowledgments We thank Kalyanmoy Deb. This report was written while the rst author was visiting his lab, the Kanpur Genetic Algorithms Laboratory at the Indian Institute of Technology. This study was sponsored by the Air Force Oce of Scienti c Research, Air Force Materiel Command, USAF, under grants F49620-94-1-0103, F49620-95-1-0338, and F49620-97-1-0050. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the ocial policies or endorsements, either expressed or implied, of the Air Force Oce of Scienti c Research or the U. S. Government. Fernando Lobo was sponsored by \Junta Nacional de Investigaca~o Ci^ent ca e Tecnologica (Portugal)" through scholarship PRAXIS BD4020/94. 9

References Harik, G. R. (1997). Learning gene linkage to eciently solve problems of bounded diculty using genetic algorithms. Doctoral dissertation, University of Michigan, Ann Arbor. Also IlliGAL Report No. 97005. Lobo, F. G., Deb, K., Goldberg, D. E., Harik, G. H., & Wang, L. (1998). Compressed introns in a linkage learning genetic algorithm. In Proceedings of the Symposium on Genetic Algorithms SGA-98. in Press. Earlier version available as IlliGAL Report No. 97010.

10 View publication stats...


Similar Free PDFs