Р. Г. Стронгина. Ниж- ний Новгород: Изд-во Нижегородского университета, 2002, 217 с


participating in the crossover operation are again selected proportionate to



Pdf көрінісі
бет136/151
Дата26.01.2022
өлшемі1,64 Mb.
#24342
түріСеминар
1   ...   132   133   134   135   136   137   138   139   ...   151

participating in the crossover operation are again selected proportionate to 
fitness. The mutation operator replaces one of the nodes with a new ran-
domly generate subtree. Koza [19] typically does not use a mutation opera-
tor in his applications; instead he uses initial populations that are presuma-
bly large enough to contain a sufficient diversity. Thus, the success of GP 
often depends on the use of a population of sufficient size. The choice of the 
size is determined by the level of complexity of the problem. When applied 
to large hard problems GP performance may thus drastically degrade be-
cause of the computationally intensive task of fitness evaluation of each 
individual in the population. 
Fortunately, genetic programming is well suited to be implemented on 
parallel architectures because the population can be distributed across the 


200 
nodes of the system [18]. One of the main problems in parallelising GP 
comes from the global selection of individuals, proportionate to their fit-
ness, both in the reproduction and recombination steps. This kind of selec-
tion forces the sharing of the new solutions until the new population can be 
chosen. Two main approaches to parallel implementations of GP have been 
proposed to avoid this bottleneck. The island  model [21] and the cellular 
model [23]. 
The island model divides the population into smaller subpopulations. A 
standard genetic programming algorithm works on each partition, and is 
responsible for initializing, evaluating and evolving its own subpopulation. 
The standard GP algorithm is augmented with a migration operator that 
periodically exchanges individuals among the subpopulations. 
In the cellular model each individual is associated with a spatial loca-
tion on a low-dimensional grid. The population is considered as a system of 
active individuals that interact only with their direct neighbours. Different 
neighbourhoods can be defined for the cells. The most common neighbour-
hoods in the two-dimensional case are the 4-neighbour (von Neumann 
neighbour) consisting of the North, South, East, West and 8-neighbour 
(Moore neighbourhood) consisting of the same neighbours augmented with 
the diagonal neighbours. Fitness evaluation is done simultaneously for all 
the individuals and selection, reproduction and mating take place locally 
with the neighbourhood. Information slowly diffuses across the grid giving 
rise to the formation of semi-isolated niches of individuals having similar 
characteristics. The choice of the individual to mate with the central indi-
vidual and the replacement of the latter with one of the offspring can be 
done in several ways. 
Folino, Pizzuti and Spezzano developed a tool for parallel genetic ap-
plications, called CAGE (CellulAr GEnetic programming tool) [15] , that 
realizes a fine-grained parallel implementation of genetic programming on 
distributed-memory parallel computers. Our implementation is based on the 
cellular model. The cellular model is fully distributed with no need of any 
global control structure and is naturally suited to be implemented on parallel 
computers. It introduces fundamental changes in the way GP works. 
In the model, the individuals of the population are assigned to a specific 
position in a large toroidal two-dimensional grid and the selection and mat-
ing operations are performed, cell by cell, only over the individual assigned 
to a cell and its neighbors. This local reproduction has the effect of intro-


 
201 
ducing an intensive communication among the individuals. The massive 
communication provides a way to disseminate good solutions across the 
entire population but influences negatively the performance of the parallel 
implementation of the GP. Moreover, unlike genetic algorithms, where the 
size of individuals is fixed, the genetic programs are individuals of varying 
sizes and shapes. This requires a large amount of local memory and intro-
duces an unbalanced computational load per grid point. Therefore, both an 
efficient representation of the program trees must be adopted and a load 
balancing algorithm must be employed to maintain the same computational 
load among the processing nodes. 
The best way to overcome the problems associated with the implemen-
tation of the cellular model on a general purpose distributed-memory paral-
lel computer is that to use a partitioning technique based upon domain de-
composition in conjunction with the Single-Program-Multiple-Data 
(SPMD) programming model. According to this model, an application on N 
processing elements (PEs) is composed of N similar processes, each one 
mapped on a PE that operates on a different set of data. For an effective 
implementation, data should be partitioned such that communication takes 
place locally and the computation load is shared among the PEs in a bal-
anced way. This approach increases the granularity of the cellular model 
transforming it from a fine-grained model to a coarse-grained model. In 
fact, instead of assigning only one individual to a processor, the individuals 
are grouped by slicing up the grid and assigning a slice of the population to 
a node. 
CAGE implements the cellular GP model using a one-dimensional do-
main decomposition (in the direction) of the grid that contains the popula-
tion of the genetic programs and by using explicit message passing to ex-
change information between these domains. This decomposition is more 
efficient than a two-dimensional decomposition, since in the 2-D case twice 
the number of messages are sent, which reduces the performance since mes-
sage startup times are rather long on the current commercially available 
parallel machines. 
The concurrent program which implements the architecture of CAGE is 
composed by a set of identical slice  processes. No coordinator process is 
necessary because the computation model is completely decentralized. Each 
slice process, which contains a strip of elements of the grid, runs on a single 
processing element of the parallel machine and executes on each grid point 


202 
the code, show in figure 3, that updates all the individuals of the sub-
population. 
 
1. Read from a file the configuration parameters 
2. Generate a random sub-population 
3. Evaluate the individuals of the sub-population 
4. while not numGenerations do 
5.   update boundary data 
6.   for x =1 to length 


Достарыңызбен бөлісу:
1   ...   132   133   134   135   136   137   138   139   ...   151




©emirsaba.org 2024
әкімшілігінің қараңыз

    Басты бет