Ed. R. G. Strongin. Nizhny Novgorod: Nizhny Novgorod University Press, 2002, 217 pp.
n, are chosen according to the rule:

Step 1. Points of the set

X_k = {x_1, …, x_k} ∪ {a} ∪ {b}   (7)
 


including the boundaries of the interval [a, b] and the coordinates x_j, 1 ≤ j ≤ k, of previous trials, where

k = k(Q−1) = p(1) + … + p(Q−1),

are renumbered (by subscripts) in order of increasing coordinate

a = x_0 < x_1 < … < x_τ = b,

where τ + 1 = τ(Q) + 1 is the number of distinct elements of the set X_k from (7);
Step 2. A real number R(i) is assigned to each subinterval (x_{i−1}, x_i), 1 ≤ i ≤ τ; R(i) is called the characteristic of this subinterval;
Step 3. The characteristics R(i) of the subintervals (x_{i−1}, x_i), 1 ≤ i ≤ τ, are ordered by decreasing value:

R(i_1) ≥ R(i_2) ≥ … ≥ R(i_τ);   (8)
 
Step 4. The next trials of the Q-th iteration are performed in parallel on p processors at the points of the set

T(Q) = {x^{k+1}, …, x^{k+p}},

where

x^{k+q} = S(i_q),

and i_q, 1 ≤ q ≤ p, are the first indices from the series (8), and the function S is such that

x^{k+q} ∈ (x_{i_q − 1}, x_{i_q}).

In this case it is supposed that

p = p(Q) ≤ τ,   Q > n.
 
Many sequential methods proposed for solving the one-dimensional problems (4)–(5) and (4)–(6) can be generalized to the parallel case by using the scheme introduced above.
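The four-step scheme above can be sketched in Python. This is a minimal illustration, not the authors' method: the characteristic R(i) is taken to be the plain subinterval length and the decision function S places the trial at the midpoint, both placeholder choices (real characteristical algorithms derive R(i) and S from the trial values z_i as well).

```python
# A minimal sketch of one iteration of the characteristical scheme (Steps 1-4).
# Placeholder choices, not the authors' rules: R(i) is the subinterval length
# and the decision function S places the trial at the midpoint.

def parallel_iteration(points, p):
    """points: trial coordinates, including the interval ends a and b.
    Returns the p points of the next parallel trials."""
    xs = sorted(set(points))                 # Step 1: renumber by coordinate
    # Step 2: assign a characteristic R(i) to each subinterval (x_{i-1}, x_i)
    R = [(xs[i] - xs[i - 1], i) for i in range(1, len(xs))]
    R.sort(reverse=True)                     # Step 3: order by decreasing R(i)
    # Step 4: one trial in each of the p best subintervals
    return [(xs[i - 1] + xs[i]) / 2.0 for _, i in R[:p]]

print(parallel_iteration([0.0, 0.3, 1.0], 2))   # [0.65, 0.15]
```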
Let us consider a parallel characteristical algorithm (PA) and its sequential analog (SA) obtained by setting p = 1. The efficiency of PA in comparison with SA strongly depends on the number p > 1 of parallel processors used for parallelization. By increasing p we try to obtain a speedup of p times. This is possible only if the number of trials executed by PA equals the number of trials done by SA. But these numbers can differ, for the following reason.
In SA the decision about the choice of a new trial point x^{k+1} is based on the complete information

w_k = {x_1, …, x_k; z_1, …, z_k; z′_1, …, z′_k}

available after executing a trial at the point x^k, where z_i = φ(x_i), z′_i = φ′(x_i), 1 ≤ i ≤ k. At the same time, in PA the p points are chosen on the basis of the same information w_k, and the data
 
 
{x^{k+1}, …, x^{k+j−1}; z^{k+1}, …, z^{k+j−1}; z′^{k+1}, …, z′^{k+j−1}}

are absent at the moment of the choice of x^{k+j}, 2 ≤ j ≤ p. Thus, the choice of
the points x^{k+2}, …, x^{k+p} by PA is not optimal (from the informative viewpoint) in comparison with SA. This fact can cause the appearance of redundant (with respect to the sequential search) points in the sequence {y^u} of the trial points produced by PA and slow down the search. To describe the quantity of such points we introduce the redundancy coefficient

K(u, n) = K′(u, n)/(u − n),

where

K′(u, n) = card({y^{n+1}, …, y^u} \ {x^k}).
 
Here, {x^k} is the sequence of trial points produced by SA, and K′(u, n) is the number of redundant points from {y^u} from the (n + 1)-th to the u-th trial. This definition requires fulfillment of the inclusion {x^k} ⊂ {y^u}. The case {x^k} = {y^u} is the best situation, with K(u, n) = 0: no redundant points have been produced. This fact implies a speedup equal to the number of processors used for parallelization.
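Under the assumption that trial points can be compared exactly, the redundancy coefficient can be computed directly from its definition. The helper below is hypothetical, using 0-based Python lists for the 1-based sequences {y^u} and {x^k}.

```python
# Hypothetical helper computing the redundancy coefficient K(u, n) from its
# definition; assumes trial points coincide exactly and uses 0-based Python
# lists for the 1-based sequences {y} and {x}.

def redundancy(y, x, n, u):
    """K(u, n) = K'(u, n)/(u - n), where K'(u, n) counts the points
    among y_{n+1}, ..., y_u that SA never produced."""
    sa = set(x)
    k_prime = sum(1 for point in y[n:u] if point not in sa)
    return k_prime / (u - n)

# Two of the four PA trials after the n-th were redundant: K(6, 2) = 0.5
print(redundancy([1, 2, 3, 4, 5, 6], [1, 2, 3, 5], n=2, u=6))
```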
On the basis of some additional information about the structure of the objective function it is possible to obtain theoretical estimates of K(u, n) and to establish cases where K(u, n) is bounded, ensuring high levels of speedup. Wide numerical experiments executed on parallel computers illustrate the performance of the parallel characteristical methods and confirm the theoretical results.


HIGH PERFORMANCE COMPUTING: TOOLS AND APPLICATIONS
G. Spezzano and D. Talia 
ISI-CNR, Via P. Bucci cubo 41C, 87036 Rende, Italy 
Abstract 
Computers are ever faster and more powerful; however, they are used to solve ever larger and more challenging problems, so the available computing power is never sufficient. For this reason, high performance computing architectures have been designed over the past twenty years, from vector supercomputers to massively parallel computers and cluster computing systems. Both in scientific and commercial computing, high performance computers are required to manage complex tasks and to store and analyze the flood of data and information available today. This paper discusses research activities in the area of high performance computing at ISI-CNR. In particular, tools and applications in the areas of parallel cellular automata, parallel data mining, evolutionary algorithms, and grid computing are introduced and discussed.
1. Introduction 
Computers are ever faster and more powerful, but their computing power is never considered sufficient because, when a new machine faster than the previous ones is developed, it is used to solve larger and more challenging problems. For this reason, high performance computing architectures have been designed over the past twenty years, starting from vector supercomputers up to massively parallel computers and cluster computing systems. Both in scientific and commercial computing, high performance computers are effectively used to manage complex tasks and to store and analyze the flood of data and information available today.
The demand for computing, both in terms of the volume of information and the variety of processing, is ever increasing. Many previously infeasible systems are naturally expected today. Basic physical limitations on electronic circuits have started to force high performance computations to be targeted toward the exploitation of parallelism. Just as the fastest cycle times approach the fundamental barriers imposed by physical device limitations, different kinds of parallel and distributed systems are emerging. It is only a matter of time before this trend affects general purpose computing. A step in that direction can be represented by cluster computers built using off-the-shelf components based on PC technology.
Parallel computing systems increasingly demonstrate their effectiveness as tools that deliver high performance in solving complex problems. However, parallel application developers are often concerned with low-level issues, such as synchronization, process mapping, and architecture details, when they use parallel computers. These issues mainly arise from the different architectural features that characterize parallel computers. These architectures form a class of systems including many different designs, from single-instruction-multiple-data (SIMD) machines to distributed-memory, multiple-instruction-multiple-data (MIMD) computers and workstation clusters. This architectural heterogeneity complicates the task of designers of parallel applications and leads to the development of parallel software that lacks portability. Parallelism, either at the architecture or the algorithm level, is a significant departure from traditional sequential computation. The sequential approach attempts to reduce processing time by carrying out each operation faster and by eliminating redundancies in computation. In parallel computing, we save time by identifying concurrently executable subtasks in a given task and executing them on multiple processors simultaneously. The first step in the development of a parallel program is the identification of the sections that can be executed in parallel and of how they should be parallelized. But, unlike the von Neumann model for sequential computing, there is no general-purpose model for parallel computation, so different approaches must be followed depending on the parallel model and target architecture.
Languages, tools, and environments for parallel computing are critical for the development of high-performance applications on parallel machines. They allow users to exploit the computational power of scalable computer systems in solving very demanding applications with large computation and memory requirements. Among the several critical research areas in high-performance computing, we discuss here some that are today very interesting and promising and that in the near future will play a major role in scientific and commercial applications of parallel computers. These areas concern high-performance simulation of complex systems based on cellular automata, knowledge discovery on large data sets using parallel data mining algorithms, parallel genetic programming, and grid computing. These are some of the research topics investigated at ISI-CNR in the area of high performance computing.
2. Cellular Automata and Computational Simulation 
Cellular Automata (CA) are discrete dynamical systems and, at the same time, abstract models of parallel computing. For this reason CA can be efficiently implemented on parallel computers. When CA are executed on a parallel computer, they allow a user to exploit their inherent parallelism to support the efficient simulation of complex systems that can be modeled by a very large number of simple elements (cells) with local interaction only. In the last decade, several years after their definition, with rapid advances in the development of high performance architectures, CA have been increasingly utilized for more general computer simulation and modeling. The CA model can be effectively used both as a realistic approach to define abstract parallel machines and as a programming methodology for computational science on parallel computers.
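As a concrete instance of the model just described — many simple cells with purely local interaction, updated synchronously — here is a minimal one-dimensional binary CA on a ring. This is an illustrative sketch, not CAMELot code.

```python
# A minimal 1-D binary cellular automaton on a ring (wrap-around), radius-1
# neighborhood; `rule` is a Wolfram elementary-rule number (0..255).
# An illustrative sketch of the CA model, not CAMELot code.

def ca_step(cells, rule):
    """One synchronous update: every cell reads (left, self, right)."""
    n = len(cells)
    table = [(rule >> i) & 1 for i in range(8)]   # rule bit i = new state
    return [table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
            for i in range(n)]

state = [0, 0, 0, 1, 0, 0, 0]
print(ca_step(state, 90))   # rule 90 (XOR of neighbors): [0, 0, 1, 0, 1, 0, 0]
```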
From the marriage of cellular automata and MIMD parallel computing was born CAMELot (Cellular Automata environment for systems modeling open technology), a software environment based on the cellular automata model which was originally implemented on a transputer-based multicomputer and is now available on a large set of parallel architectures (from massively parallel machines to PC clusters). CAMELot is a tool designed to support the development of high-performance applications in science and engineering. It offers the computing power of a parallel computer while hiding the architectural issues from the user. It provides an interface which allows a user to dynamically change the parameters of the application while it is running, and a graphical interface which shows the application output. CAMELot has been successfully used for solving many problems, such as the simulation of lava flows and other fluid flow models, image processing, freeway traffic simulation, and soil bioremediation. The results achieved in terms of thoroughness of simulation and execution speedup show that CAMELot can be used for the simulation and modeling of real systems in many areas in a simple way while achieving high performance.
The system provides a high-level programming language, called CARPET, for programming cellular algorithms. The main features of CARPET are the possibility to describe the state of a cell as a set of typed substates and the definition of complex neighborhoods, which can also be time dependent, in a discrete Cartesian space. The portable version of CAMELot has been developed in the COLOMBO (Parallel COmputers improve cLean up of sOils by Modelling BiOremediation) project within the ESPRIT framework [26].
The main goal of CAMELot is to integrate computation, visualization and control into a single environment that allows interactive steering of scientific applications. CAMELot consists of:
•  a parallel run-time support for the CARPET language;
•  a graphical user interface (GUI) for editing, compiling, configuring, executing and steering the computation;
•  a tool for the interactive visualization of the results.
 
The run-time support is implemented as an SPMD (Single Program, Multiple Data) program. The latest implementation is based on the C language plus the standard MPI library and can be executed on different parallel machines such as the Meiko CS-2, the CRAY T3E and clusters of workstations. The parallel program that implements the architecture of the system is composed of a set of macrocell processes, a controller process and a GUI process. Each macrocell process, which operates on a strip of cells of the CA, runs on a single processing element of the parallel machine and executes the updating of the state of the cells belonging to its partition. The synchronization of the automaton and the execution of the commands provided by a user through the GUI are carried out by the controller process. MPI primitives handle all the communications among the processes using MPI communicators.
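The assignment of contiguous strips of cells to macrocell processes can be sketched as follows. The even-split policy below (remainder rows spread one at a time) is an assumption for illustration, not CAMELot's documented mapping (which, as noted next, uses a scattered-decomposition-style strategy for load balancing).

```python
# Sketch of assigning contiguous strips of CA rows to macrocell processes.
# The even-split policy (remainder spread one row at a time) is an assumed
# illustration, not CAMELot's documented mapping.

def strip_partition(rows, procs):
    """Return [start, end) row ranges, one per macrocell process."""
    base, extra = divmod(rows, procs)
    strips, start = [], 0
    for r in range(procs):
        size = base + (1 if r < extra else 0)
        strips.append((start, start + size))
        start += size
    return strips

print(strip_partition(10, 4))   # [(0, 3), (3, 6), (6, 8), (8, 10)]
```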
To improve the performance of applications that have a diffusive behavior, such as CFD, CAMELot implements a load balancing strategy for mapping lattice partitions onto the processing elements [27]. This load balancing strategy is a domain decomposition technique similar to the scattered decomposition technique. Through the CAMELot GUI, a user can assign colors to cell substate values and visualize them on the fly in different windows; thus she/he can follow the evolution of the simulation and steer the computation. CAMELot provides the development environment and the runtime system for the parallel execution of CARPET programs.
CARPET (CellulAR Programming EnvironmenT) is a high-level language related to C, with some additional constructs to describe the rules of the transition function of a single cell of a cellular automaton [28]. CARPET supports researchers in the development of models of discrete dynamic systems through a set of constructs that allow a model expressed in the CA formalism to be translated into a cellular algorithm. A CARPET programmer can define both standard CA and generalized CA, describing the rules of the state transition function of a single cell. CARPET defines the structure of a CA as a lattice in a discrete Cartesian space in which the maximum number of dimensions allowed is 3. Each dimension is wrapped around, e.g. a two-dimensional lattice forms a torus. Border functions can be used to disable the wrap-around. The size of the lattice is defined through the graphical user interface (GUI) of the CAMELot system and is bounded only by the amount of available memory. The radius of the neighborhood of a cell is statically defined and is strictly connected to the dimension of the automaton. It allows a user to define the maximum number of cells which can compose the neighborhood of a cell.
To improve the readability of a program and extend the range of the applications that can be coded as cellular algorithms, the state of a cell in CARPET is structured as a record in which the C basic types char, short, int, float, double and one-dimensional arrays of these types can be used to define the substates that represent the physical quantities of a system. Furthermore, the substates are also used to store values that specify statistical properties of a variable or to hold the history of a substate. The predefined variable cell refers to the current cell in the d-dimensional space under consideration. A substate can be referred to by appending the substate's name to the reserved word cell with the underscore symbol '_' (e.g., cell_substate). CARPET generalizes the concept of neighborhood and allows a user to define a logic neighborhood that describes both regular neighborhoods (e.g., von Neumann, Moore) and asymmetrical neighborhoods with different topological properties (e.g., hexagonal) that can be time-dependent. A neighborhood is defined by associating with it the name of a vector whose elements compose the logic neighborhood and by defining a name for each neighboring cell. Both the name of the vector and the names of the neighbor cells can be used to refer to the value of a cell substate (e.g., identifier[i]_substate, North_substate). Finally, global parameters can be defined in CARPET to model global characteristics of a complex system (e.g., the porosity of a soil, the Reynolds number). As an example, figure 1 shows the declaration part of a CARPET program that defines a two-dimensional automaton. The radius is equal to 1, and the state is composed of two integer substates and one float substate. The Moore neighborhood and two global parameters are defined. These declarations are contained in a cadef section at the beginning of a CARPET program.
 
cadef {
  dimension 2;
  radius 1;
  state (int water, humidity; float pollution);
  neighbor Moore[8] ([0,–1] North, [1,–1] NorthEast, [1,0] East,
                     [1,1] SouthEast, [0,1] South, [–1,1] SouthWest,
                     [–1,0] West, [–1,–1] NorthWest);
  parameter (porosity 0.003, permeability 0.4);
}

Fig. 1. The CA declaration part
 
The new value of a substate takes effect from the next iteration. To allow a user to define spatially inhomogeneous CA, CARPET defines the GetX, GetY, and GetZ operations that return the X, Y and Z coordinates of a cell in the automaton. Furthermore, by means of the predefined variable step, which contains the number of iterations that have been executed, the language supports the implementation of neighborhoods and transition functions that are time-dependent. The example in figure 2 describes how the CARPET constructs are used to implement the classical Life program in a simple and very expressive way. In particular, the update statement assigns a new value to the cell state.
CAMELot has been used for the implementation of a large number of applications in different areas. In the scientific and engineering area, simulations of lava flow [29], freeway traffic [30], landslides [31], soil bioremediation [32], gas diffusion, and the Ising model have been developed. Applications of artificial life developed with CAMELot are ant colony, flocking and the Greenberg-Hastings model [33]. Finally, CAMELot has been used to solve artificial intelligence problems by implementing cellular genetic algorithms. An outline of a few of these applications is given in [34].
 


#define alive 1
#define dead 0
cadef {
  dimension 2;      /* bidimensional lattice */
  radius 1;
  state (short life);
  neighbor Moore[8] ([0,–1] North, [–1,–1] NorthWest, [–1,0] West,
                     [–1,1] SouthWest, [0,1] South, [1,1] SouthEast,
                     [1,0] East, [1,–1] NorthEast);
}

int i;
short sum = 0;      /* must be initialized before accumulating */

/* computing the number of alive neighbors of cell (x,y) */
for (i = 0; i < 8; i++)
  sum = Moore_life[i] + sum;
/* computing the new state of cell (x,y) */
if ((sum == 2 && cell_life == 1) || sum == 3)
  update (cell_life, alive);
else
  update (cell_life, dead);

Fig. 2. The game of Life algorithm in CARPET.
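For readers unfamiliar with CARPET, the same Life transition function can be written in plain Python on a torus, matching the wrap-around CARPET lattice. This is an equivalent sketch, not generated code.

```python
# The Life transition of figure 2 in plain Python, on a torus (wrap-around,
# as in the CARPET lattice). An equivalent sketch, not generated code.

def life_step(grid):
    """One synchronous update: survive with 2-3 alive neighbors, born with 3."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # count alive neighbors of cell (x, y) with wrap-around
            s = sum(grid[(x + dx) % rows][(y + dy) % cols]
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0))
            new[x][y] = 1 if s == 3 or (s == 2 and grid[x][y]) else 0
    return new

# A horizontal blinker becomes vertical after one step.
blinker = [[0] * 5, [0] * 5, [0, 1, 1, 1, 0], [0] * 5, [0] * 5]
print(life_step(blinker))
```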
3. Parallel Genetic Programming 
Genetic programming (GP) is a powerful tool for solving problems from different application domains, such as hard function and combinatorial optimization, neural network evolution, routing, planning and scheduling, management and economics, machine learning, data mining, robotics and pattern recognition.
GP finds its origin and inspiration in genetic algorithms (GAs) [16], a class of adaptive general-purpose search techniques inspired by natural evolution. A standard GA works on a population of elements (chromosomes), generally encoded as bit strings, representing a randomly chosen sample of candidate solutions to a given problem. A GA is an iterative procedure that maintains a constant-size population of individuals. Each member of the population is evaluated with respect to a fitness function. At each step a new population is generated by selecting the members that have the best fitness. In order to explore the whole search space, the algorithm alters the selected elements by means of genetic operators to form new elements to be evaluated. The most common genetic operators are crossover and mutation. Crossover combines two strings to produce new strings with bits from both, thereby producing new search points. Through crossover the search is biased towards promising regions of the search space. Mutation flips string bits at random if a probability test is passed; this ensures that, theoretically, every portion of the search space is explored.
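The selection, crossover, and mutation operators just described can be sketched as follows, for bit-string chromosomes. This is a minimal illustration of the textbook operators, not a production GA.

```python
# Minimal sketches of the three GA operators for bit-string chromosomes;
# an illustration of the textbook operators, not a production GA.
import random

def roulette_select(population, fitness):
    """Pick one individual with probability proportional to its fitness."""
    r = random.uniform(0, sum(fitness))
    acc = 0.0
    for individual, f in zip(population, fitness):
        acc += f
        if r <= acc:
            return individual
    return population[-1]

def crossover(a, b):
    """One-point crossover: each child takes a prefix of one parent
    and the suffix of the other."""
    point = random.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, p=0.01):
    """Flip each bit independently with probability p."""
    return [bit ^ 1 if random.random() < p else bit for bit in bits]
```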
GP is a variation of genetic algorithms in which the evolving individuals are themselves computer programs instead of fixed-length strings from a limited alphabet of symbols [19]. Programs are represented as trees with ordered branches in which the internal nodes are functions and the leaves are the so-called terminals of the problem. Thus, the user of GP has to make an intelligent choice of the sets of functions and terminals for the problem at hand. GP conducts its search in the space of computer programs.
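This tree representation can be sketched with nested tuples: internal nodes hold functions, leaves hold terminals. The arithmetic function set below is an illustrative assumption; as the text notes, both sets are problem-dependent choices the GP user must make.

```python
# A sketch of the GP tree representation: internal nodes are functions,
# leaves are terminals (the input variable 'x' or constants). The
# arithmetic function set is an illustrative assumption.
import operator

FUNCTIONS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(tree, x):
    """Recursively evaluate a program tree at input x."""
    if tree == 'x':                      # terminal: the input variable
        return x
    if isinstance(tree, (int, float)):   # terminal: a constant
        return tree
    op, left, right = tree               # internal node: a function
    return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))

program = ('*', 'x', ('+', 'x', 1))     # the candidate program x * (x + 1)
print(evaluate(program, 3))             # 12
```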
The GP approach evolves a population of trees by using the genetic operators of reproduction, recombination and mutation. Each tree represents a candidate solution to a given problem and is associated with a fitness value that reflects how good it is with respect to the other solutions in the population. The reproduction operator copies individual trees of the current population into the next generation with a probability proportional to their fitness (this strategy is also called the roulette wheel selection scheme). The recombination operator generates two new individuals by crossing two trees at randomly chosen nodes and exchanging the subtrees. The two individuals