Genetics, Vol. 157, 1021-1043, March 2001, Copyright © 2001

Parallel Computation of a Maximum-Likelihood Estimator of a Physical Map

Suchendra M. Bhandarkara, Salem A. Machakaa, Sanjay S. Sheteb, and Raghuram N. Kotaa
a Department of Computer Science, The University of Georgia, Athens, Georgia 30602-7404
b Department of Statistics, The University of Georgia, Athens, Georgia 30602-7404

Corresponding author: Suchendra M. Bhandarkar, Department of Computer Science, The University of Georgia, 415 Boyd Graduate Studies Research Ctr., Athens, GA 30602-7404., suchi{at}cs.uga.edu (E-mail)

Communicating editor: J. ARNOLD


*  ABSTRACT
*TOP
*ABSTRACT
*The physical mapping protocol
*MATERIALS AND METHODS
*MATHEMATICAL FORMULATION OF THE...
*PARALLEL COMPUTATION OF THE...
*EXPERIMENTAL RESULTS
*CONCLUSIONS AND DISCUSSION
*LITERATURE CITED

Reconstructing a physical map of a chromosome from a genomic library presents a central computational problem in genetics. Physical map reconstruction in the presence of errors is a problem of high computational complexity that provides the motivation for parallel computing. Parallelization strategies for a maximum-likelihood estimation-based approach to physical map reconstruction are presented. The estimation procedure entails a gradient descent search for determining the optimal spacings between probes for a given probe ordering. The optimal probe ordering is determined using a stochastic optimization algorithm such as simulated annealing or microcanonical annealing. A two-level parallelization strategy is proposed wherein the gradient descent search is parallelized at the lower level and the stochastic optimization algorithm is simultaneously parallelized at the higher level. Implementation and experimental results on a distributed-memory multiprocessor cluster running the parallel virtual machine (PVM) environment are presented using simulated and real hybridization data.


GENERATION of entire chromosomal maps has been a central problem in genetics right from its early years (STURTEVANT 1913 Down). These maps are central to the understanding of the structure of genes, their function, their transmission, and their evolution. Recent advances in molecular genetics have led to a wealth of DNA markers along a chromosome and also eased the process by which these markers can be assayed. Consequently, the focus of current research has shifted from data collection to the computational problem of map assembly.

Chromosomal maps fall into two broad categories—genetic maps and physical maps. Genetic maps represent an ordering of genetic markers along a chromosome where the distance between two genetic markers is related to their recombination frequency. Genetic maps are typically of low resolution, i.e., 1–10 Mb (LANDER and GREEN 1987 Down). While genetic maps enable a scientist to narrow the search for genes to a particular chromosomal region, it is a physical map that ultimately allows the recovery and molecular manipulation of genes of interest. A physical map is defined as an ordering of distinguishable (i.e., sequenced) DNA fragments called clones or contigs by their position along the entire chromosome where the clones may or may not contain genetic markers. The physical mapping problem is therefore one of reconstructing the order of clones and determining their position along the chromosome. A physical map has a much higher resolution than a genetic map of the same chromosome, i.e., 10–100 kb (BRODY et al. 1991 Down). Physical maps have provided fundamental insights into gene development, gene organization, chromosome structure, recombination, and the role of sex in evolution and have also provided a means for the recovery and direct molecular manipulation of genes of interest (PRADE et al. 1997 Down).

Several techniques exist for generation of physical maps from contig libraries. These techniques are specific to an experimental protocol and the type of data collected, for example, mapping by nonunique probes (ALIZADEH et al. 1995 Down), mapping by unique probes (ALIZADEH et al. 1994 Down; GREENBERG and ISTRAIL 1995 Down; JAIN and MYERS 1997 Down), mapping by unique endprobes (CHRISTOF et al. 1997 Down), mapping using restriction fragments (FASULO et al. 1997 Down; JIANG and KARP 1997 Down), mapping using radiation-hybrid data (BEN-DOR and CHOR 1997 Down; SLONIM et al. 1997 Down), and optical mapping (MUTHUKRISHNAN and PARIDA 1997 Down; KARP and SHAMIR 1998 Down; LEE et al. 1998 Down). Likewise, several computation techniques based on deterministic optimization and stochastic optimization have been reported in the literature in the context of physical mapping. Examples of stochastic optimization algorithms include simulated annealing (CUTICCHIA et al. 1992 Down, CUTICCHIA et al. 1993 Down; MOTT et al. 1993 Down; ALIZADEH et al. 1994 Down, ALIZADEH et al. 1995 Down) and the random cost algorithm (WANG et al. 1994A Down) whereas those of deterministic optimization algorithms include linear programming (JAIN and MYERS 1997 Down), integer programming (CHRISTOF et al. 1997 Down), integer linear programming with polyhedral combinatorics (CHRISTOF and KECECIOGLU 1999 Down), and semidefinite programming (CHOR and SUDAN 1995 Down). Various statistical analyses of the aforementioned physical mapping techniques have also been reported in the literature (LANDER and WATERMAN 1988 Down; ARRATIA et al. 1991 Down; ZHANG and MARR 1993 Down; BALDING 1994 Down; NELSON and SPEED 1994 Down; XIONG et al. 1996 Down; WILSON et al. 1997 Down).


*  The physical mapping protocol
*TOP
*ABSTRACT
*The physical mapping protocol
*MATERIALS AND METHODS
*MATHEMATICAL FORMULATION OF THE...
*PARALLEL COMPUTATION OF THE...
*EXPERIMENTAL RESULTS
*CONCLUSIONS AND DISCUSSION
*LITERATURE CITED

The physical mapping protocol essentially determines the nature of clonal data and the probe selection procedure. The physical mapping protocol adhered to in this project is the one based on sampling without replacement (FU et al. 1992 Down). This protocol has been used successfully in physical mapping of Aspergillus nidulans (BRODY et al. 1991 Down; PRADE et al. 1997 Down), Schizosaccharomyces pombe (MIZUKAMI et al. 1993 Down), and Pneumocystis carinii (ARNOLD and CUSHION 1997 Down) and is currently being used in physical mapping projects of A. flavus and Neurospora crassa (AIGN et al. 2001 Down; KELKAR et al. 2001 Down) under the Fungal Genome Initiative (ARNOLD 1997 Down; BENNETT 1997 Down).

The protocol that generates the probe set and the clone set is an iterative procedure that can be described as follows. Let i and i be the clone set and the probe set, respectively, at the ith iteration. The initial clone set 0 consists of all the clones in the library whereas the initial probe set 0 = {phi}. The clones in 0 are designed to be of the same length and to be overlapping so that each clone samples a fragment of the chromosome and the coverage of the entire chromosome is made possible. At the ith iteration a clone c is chosen at random from i and added to i. Clone c is hybridized to all the clones in i. The subset of clones c that hybridize to clone c are removed from i so that i+1 = i - c. Note that c {isin} c since a clone hybridizes to itself. The hybridization experiment entails extracting complementary DNA from both ends of a probe, washing the DNA over the arrayed plate, and recording all clones in the library to which the DNA sticks (i.e., hybridizes). The above procedure is halted at the kth iteration when k = {phi}. The final probe set is given by = k and the clone set by = 0 - k. In the absence of errors, the probe set represents a maximal nonoverlapping subset of 0 since any clone that overlaps with a given probe would hybridize to one end of that probe and be effectively removed from consideration in future iterations of the aforementioned iterative procedure. Fig 1 depicts the probes and clones along the length of a chromosome where the clones and probes are numbered from left to right.



View larger version (11K):
In this window
In a new window
Download PPT slide
 
Figure 1. An example of clone-probe ordering along a chromosome.

The clone-probe overlap pattern is represented in the form of a binary hybridization matrix H where the matrix element Hij denotes the hybridization of the ith clone {isin} to the jth probe {isin} . The matrix element Hij = 1 if the ith clone {isin} hybridizes to the jth probe {isin} and Hij = 0 otherwise. Table 1 shows the clone-probe hybridization data in the absence of errors resulting from the example depicted in Fig 1. If the probes in were ordered with respect to their position along a chromosome, then by selecting from H a common overlapping clone for each pair of adjacent probes in a minimal set of clones and probes that covers the entire chromosome (i.e., a minimal tiling) could be obtained. Note that a common overlapping clone between two adjacent probes would hybridize to both probes. The minimal tiling in conjunction with the sequencing of each individual clone/probe in the tiling and a sequence assembly procedure that determines the overlaps between successive sequenced clones/probes in the tiling (KECECIOGLU and MYERS 1995 Down) could then be used to reconstruct the DNA sequence of the entire chromosome.


 
View this table:
In this window
In a new window

 
Table 1. An example of clone-probe hybridization data in the absence of errors

In reality, the hybridization experiments are rarely error free. The hybridization matrix H could be expected to contain false positives and false negatives. The matrix element Hij would be a false positive if Hij = 1 (denoting hybridization of the ith clone with the jth probe) when in fact Hij = 0. Conversely, Hij would be a false negative if Hij = 0 when in fact Hij = 1. Other sources of error include chimerism wherein a single clone samples two or more distinct regions of a chromosome, deletions wherein certain regions of the chromosome are not sampled during the cloning process, and repeats wherein a clone samples a region of the chromosome with repetitive DNA structure. In this article, we confine ourselves to errors in the form of false positives and false negatives. Since the clones (and probes) in the mapping projects that use the aforementioned protocol are generated using cosmids, which makes them sufficiently small (~40 kb), chimerism and deletions do not pose a serious problem. However, repeats do pose a problem but are not explicitly addressed here; rather they are treated as multiple isolated incidences of false positives.

In this article we present a maximum-likelihood (ML) estimator (SHETE 1998 Down; KECECIOGLU et al. 2000 Down) that determines the ordering of probes in the probe set and the interprobe spacings under a probabilistic model of hybridization errors consisting of false positives and false negatives. The estimation procedure involves a combination of discrete and continuous optimization, where determining the probe ordering entails discrete (i.e., combinatorial) optimization, whereas determining the interprobe spacings for a particular probe ordering entails continuous optimization. We propose a two-level parallelization strategy for efficient implementation of the above estimator. The upper level consists of parallel discrete optimization using simulated annealing or microcanonical annealing whereas the lower level consists of the parallel conjugate gradient descent procedure. The resulting parallel algorithms are implemented on a distributed-memory multiprocessor cluster using the parallel virtual machine (PVM) environment (SUNDERAM 1990 Down; GEIST et al. 1994 Down). Convergence, speed-up, and scalability characteristics of the parallel algorithms are examined and discussed.


*  MATERIALS AND METHODS
*TOP
*ABSTRACT
*The physical mapping protocol
*MATERIALS AND METHODS
*MATHEMATICAL FORMULATION OF THE...
*PARALLEL COMPUTATION OF THE...
*EXPERIMENTAL RESULTS
*CONCLUSIONS AND DISCUSSION
*LITERATURE CITED

Cosmid libraries used to construct the physical map of N. crassa discussed in this article are described in KELKAR et al. 2001 Down. The physical mapping data were generated by DNA hybridization described in KELKAR et al. 2001 Down. Assignments of markers to physical and genetic maps was achieved by complementation, hybridization, and cosmid end sequencing as described in KELKAR et al. 2001 Down.


*  MATHEMATICAL FORMULATION OF THE ML ESTIMATOR
*TOP
*ABSTRACT
*The physical mapping protocol
*MATERIALS AND METHODS
*MATHEMATICAL FORMULATION OF THE...
*PARALLEL COMPUTATION OF THE...
*EXPERIMENTAL RESULTS
*CONCLUSIONS AND DISCUSSION
*LITERATURE CITED

In this section we present a brief synopsis of the ML estimator proposed in KECECIOGLU et al. 2000 Down and SHETE 1998 Down. The estimator reconstructs the ordering of probes in the probe set and the interprobe spacings under a probabilistic model of hybridization errors consisting of false positives and false negatives.

The probe ordering problem can be formally stated as follows. Given a set = {P1, P2, ... , Pn} of n probes and a set = {C1, C2, ... , Ck} of k clones generated using the sampling-without-replacement protocol described earlier and the k x n clone-probe hybridization matrix H containing both false positives and false negatives with predefined probabilities, reconstruct the correct ordering {Pi} = ({pi}1, {pi}2, ... , {pi}n) of the probes and also the correct spacing Y = (Y1, Y2, ... , Yn) between the probes. The ordering {Pi} is a permutation of (1, ... , n) that gives the labels (indices) of the probes in left-to-right order across the chromosome. In the interprobe spacing vector Y, Y1 denotes the space between the left end of the first probe P{pi}1 and the left end of the chromosome, and Yi the spacing between the right end of probe P{pi}i-1 and the left end of probe P{pi}i (where 2 <= i <= n). The spacing between the right end of probe P{pi}n and the right end of the chromosome is given by Yn+1 = N - nM - {Sigma}ni=1 Yi, where N is length of the chromosome and M is the length of each clone/probe. Note that our protocol requires that all probes and clones be of the same length.

The problem as stated above is ill posed in the precise sense as defined by HADAMARD 1923 Down. A problem is deemed well posed when its solution exists and is unique. A problem is ill posed when no solution exists or, if a solution does exist, it is not unique. In our case, the problem is ill posed since the underlying constraints do not imply a unique solution. Any probe ordering {Pi} and any interprobe spacing vector Y that satisfies the requirements that Yi >= 0; 1 <= i <= n and N - nM - {Sigma}ni=1 Yi >= 0, constitute a valid solution. Hence the problem is formulated as one of determining a probe ordering and the interprobe spacings that maximize the likelihood of the observed hybridization matrix H given predefined probabilities for false positives and false negatives.

Mathematical notation:
The mathematical notation used in the formulation of the maximum-likelihood estimator is given below:

  • N, length of the chromosome;

  • M, length of a clone/probe;

  • n, number of probes;

  • k, number of clones;

  • {rho}, probability of false positive;

  • {eta}, probability of false negative;

H = ((hi,j))1<=i<=k, 1<=j<=n; clone-probe hybridization matrix, where

Hi, ith row of the hybridization matrix;

{Pi} = ({pi}1, ... , {pi}n), permutation of {1, 2, ... , n}, which denotes the probe labels in the ordering when scanned from left to right along the chromosome;

pi = {Sigma}nj=1 hij, number of 1's in Hi;

P = {Sigma}ki=1 pi, total number of 1's in H, Y = (Y1, Y2, ... , Yn), vector of interclone spacings, where Yi is the spacing between the right end of P{pi}i-1 and the left end of P{pi}i (2 <= i <= n), and Y1 is the spacing between the left end of P{pi}1 and the left end of the chromosome; and

{subseteq} n, set of feasible interprobe spacings Y = {Y1, ... , Yn} such that Yi >= 0, 1 <= i <= n, and N - nM - {Sigma}ni=1 Yi >= 0.

The model:
Given a vector of interprobe spacings Y = (Y1, ... , Yn), there are 2n+1 possible cases to consider depending on whether 0 <= Yi <= M or Yi > M, where 0 <= i <= n + 1. Without loss of generality, we present the maximum-likelihood model for n = 3 and illustrate 3 of the 24 = 16 possible cases.

Case 1:
0 <= Y1 <= M, 0 <= Y2 <= M, 0 <= Y3 <= M, and 0 <= Y4 <= M as depicted in Fig 2.



View larger version (7K):
In this window
In a new window
Download PPT slide
 
Figure 2. Interprobe spacings: Case 1.

Case 2:
0 <= Y1 <= M, 0 <= Y2 <= M, 0 <= Y3 <= M, and Y4 > M as depicted in Fig 3.



View larger version (7K):
In this window
In a new window
Download PPT slide
 
Figure 3. Interprobe spacings: Case 2.

Case 3:
Y1 > M, 0 <= Y2 <= M, 0 <= Y3 <= M, and 0 <= Y4 <= M as depicted in Fig 4.



View larger version (6K):
In this window
In a new window
Download PPT slide
 
Figure 4. Interprobe spacings: Case 3.

In Case 1 above, if the left end of a clone falls in regions A, C, or E, then the clone will hybridize only with P{pi}1, P{pi}2, or P{pi}3, respectively (Fig 2). Similarly, if the left end of a clone falls in region B or D, then the clone will hybridize with both P{pi}1 and P{pi}2 or P{pi}2 and P{pi}3, respectively (Fig 2). In Case 2 above, if the left end of a clone falls in region E, then the clone will hybridize with only P{pi}3 and if it falls in region F, then the clone will not hybridize with any of the probes (Fig 3). Similarly, in Case 3 above, if the left end of a clone falls in region A, the clone will not hybridize with any of the probes (Fig 4). Therefore, from the above three cases it is easy to see that, in general, there will be three different types of regions, namely,

  • Type 1: Both region between probe P{pi}j and P{pi}j+1, for j = 1, ... , n - 1. An intervening clone hybridizes to both probes if its left end falls in this region.

  • Type 2: Only region of probe P{pi}j, for j = 1, ... , n. A clone will hybridize to P{pi}j only if its left end falls in this region.

  • Type 3: None region after probe P{pi}j, for j = 0, ... , n. A clone will hybridize to no probe if its left end falls in this region. Here probe P{pi}0 is referred to as the beginning of the chromosome.

It can be shown that for j =1, ... , n - 1, Length of the Both region between probes P{pi}j and P{pi}j+1

(1)

and for j = 1, ... , n, Length of the Only region of probe P{pi}j

(2)

and for j = 0, ... , n, Length of the None region after probe P{pi}j

(3)

We assume that the left ends of the clones are uniformly distributed over the interval [0, N - M]; i.e., the probes are uniformly distributed across the length of the chromosome. Therefore it can be shown that for j = 1, ... , n - 1, the probability PBoth that a randomly chosen clone will fall in the Both region of probes P{pi}j and P{pi}j+1 is given by

(4)

for j = 1, ... , n the probability POnly that a randomly chosen clone will fall in the Only region of probe P{pi}j is given by

(5)

and for j = 0, ... , n the probability PNone that a randomly chosen clone will fall in the None region after probe P{pi}j is given by

(6)

The conditional probability of observing a clonal signature Hi (i.e., the ith row in the hybridization matrix H), given a probe ordering {Pi} and an interprobe spacing vector Y, is given by

(7)

where Oi,j is the event that the clone i will fall in the Only region of probe P{pi}j, Bi,j is the event that the clone i will fall in the Both region of probes P{pi}j and probe P{pi}j+1, and Ni,j is the event that the clone i will fall in the None region after probe P{pi}j.

For a given probe ordering {Pi} and interprobe spacing vector Y, event Oi,j implies that only hi,{pi}j = 1, and all the remaining entries in row Hi should be equal to 0. In other words, hi,{pi}j != 1 implies a false negative and a 1 in any other column position in the row Hi implies a false positive. That is,

(8)

and for k = 1, ... , n, where k != j,

(9)

Assuming that the false positive and false negative errors at different positions along the clonal signature Hi are independent of each other,

(10)

Similarly, for a given probe ordering {Pi} and interprobe spacing vector Y, event Bi,j implies that only hi,{pi}j and hi,{pi}j+1 should be equal to 1 and all the remaining entries of row Hi should be equal to 0. That is,

(11)


(12)

and for k = 1, ... , n, where k != j, j + 1,

(13)

Hence,

(14)

Finally, for a given probe ordering {Pi} and interprobe spacing vector Y, event Ni,j implies that all the elements of row Hi should be equal to 0. That is, for k = 1, ... , n,

(15)

Hence,

(16)

From Equation 4Equation 5Equation 6Equation 7Equation 8Equation 9 HREF="#FD10">Equation 10Equation 11Equation 12Equation 13Equation 14Equation 15 HREF="#FD16">Equation 16,

(17)

We assume that the clones {isin} are independently distributed along the chromosome; i.e., each row of H is independent of the other rows. Hence

(18)

From Equation 17 and Equation 18,

(19)

where

(20)


(21)

and

(22)

The goal, therefore, is to determine {Pi} and Y that maximize P(H | {Pi}, Y) as given in Equation 19, that is, determine (, Y), where

(23)

Alternatively, consider the negative log-likelihood function f({Pi}, Y) given by

(24)

where C is a constant given by

(25)

and {pi}0 = {pi}n+1 = 0. Since ln x is a monotonically increasing function of x for all x > 0, it follows that

(26)

Computation of the ML estimate:
Computing the values of and Y (Equation 26) involves a two-stage procedure.

Stage 1: First determine the optimal spacing Y{Pi} for a given probe ordering {Pi}; i.e., determine Y{Pi} = (Y1, ... , Yn) such that for a given {Pi},

(27)

Here the minimum is taken over all feasible solutions Y that satisfy the constraints Yi >= 0; i = 1, ... , n and {Sigma}ni=1 Yi <= N - nM.

Stage 2: Second determine for which

(28)

Here the minimum is taken over all {Pi}, where {Pi} is a permutation of {1, ... , n}. The resulting values of and Y are termed the ML estimates (MLEs) of the true probe ordering and the interprobe spacings, respectively.

Computation of Y{Pi}: A region {subseteq} n is deemed to be convex if for any pair of points p, q {isin} , all points along the line segment {alpha}p + (1 - {alpha}) q {isin} , where 0 <= {alpha} <= 1. A function h: -> defined on a convex set is deemed convex if for all points {alpha}p + (1 - {alpha}) q {isin} , where 0 <= {alpha} <= 1, h ({alpha}p + (1 - {alpha})q) <= {alpha}h(p) + (1 - {alpha})h(q). Alternatively, if (d2/d{alpha}2)h >= 0 along the line segment {alpha}p + (1 - {alpha})q, then function h can be shown to satisfy the above condition for convexity (SHETE 1998 Down). Furthermore, a region {subseteq} is considered good if for all Y {isin} , Yi != M, 1 <= i <= n + 1. The significance of a good region is that f{Pi} (Y) is differentiable within it.

The objective function f{Pi}(Y) (Equation 27) can be expressed as

(29)

where fi(Y) = Ri - {Sigma}n+1j=1 (ai,{pi}j - 1)(ai,{pi}j-1 - 1)Yj. Consider a good convex region {subseteq} , where Yj != M for 1 <= j <= n. Consider all points Y = P + sV for s > 0, which lie on a ray originating at a given point P {isin} in the direction V. In the derivative of f along the ray is given by

(30)

where

(31)

and I(x) is a unit step function defined as

(32)

Using the fact that ()fi(Y) = 0 along the ray, it can be shown that

(33)

This implies that f{Pi}(Y) is convex in every good convex region and therefore possesses a unique local minimum that is also a global minimum. Consequently this minimum can be reached using continuous local search-based techniques such as gradient descent (i.e., steepest descent) or conjugate gradient descent (DORNY 1980 Down; KINCAID and CHENEY 1991 Down).

Consider the four disjoint subregions +1,+1, +1,-1, -1,+1, and -1,-1 within , where

(34)

Each of these regions is convex since they result from the intersection of half spaces. Also, since the derivative of f{Pi}(Y) is defined in the interior of each subregion, each subregion is good. Note that we can define the derivative on the boundary of each subregion a,b, a, b {isin} {-1, +1}, on the basis of the direction in which the boundary point is approached. Thus by selecting a starting point in each of the subregions (or as many subregions as possible without violating any feasibility constraints), one can compute a local minimum for f{Pi}(Y) in each of the subregions and select the minimum of these local minima to be the global minimum of f{Pi}(Y) (KECECIOGLU et al. 2000 Down).

The local minimum of f{Pi}(Y) in each of the aforementioned four disjoint subregions within can be reached using continuous local search-based methods such as the steepest descent technique or the conjugate gradient descent technique (DORNY 1980 Down; KINCAID and CHENEY 1991 Down). The steepest descent technique is a simple iterative procedure that consists of three steps: (i) Determine the initial value of Y,(ii) compute the downhill gradient at Y, and (iii) update the current value of Y using the computed value of the downhill gradient. Steps (ii) and (iii) are repeated until the gradient vanishes. The point at which the gradient vanishes is considered to be the desired local minimum. In practice, the steepest descent procedure is halted when the magnitude of the gradient is less than a prespecified threshold.

The initial value of Y = (Y1, ... , Yn) can be determined in one of several ways; the simplest solution is to assign (N - nM)/(n + 1) to each of Yi's, i.e., distribute the value of N - nM equally among the (n + 1)Yi's. Having obtained an initial value for Y, the gradient of the negative log-likelihood function is computed. A function decreases most rapidly in the direction of the local negative (i.e., downhill) gradient. The local downhill gradient is given by

(35)

where

(36)

and

(37)

The current value of Y = Yold is updated by moving along the downhill gradient direction U = -{nabla}f({Pi}, Y)|Y=Yold. The new value of Y = Ynew is given by

(38)

The problem, therefore, is to find an optimal value of s, say s*, such that

(39)

Here,

(40)

where Yn+1 = N - nM - {Sigma}ni=1 Yi.

Having obtained the value of s*, then the new interprobe spacings are given by

(41)

To determine an optimal value of s = s*, consider

(42)

where Un+1 = -{Sigma}ni=1 Ui. The convexity of f{Pi}(Y) implies that the local optimum for s is also a global optimum. Also note that we have the following boundary conditions: (i) Yj + sUj >= 0, for j = 1, ... , n; i.e., all the spacings are nonnegative; and (ii) {Sigma}nj=1(Yj + sUj) <= N - nM, which is a constraint imposed by the length of the chromosome and the length of each clone.

The above constraints impose bounds on s given by

(43)

Given the upper and lower bounds on the values of s and the convexity of f({Pi}, Y + sU) with respect to s, the bisection method (KINCAID and CHENEY 1991 Down) can be used to find the optimal value of s. The number of iterations of the bisection method required to localize the minimum within a given tolerance {epsilon} is given by n = log2[].

The gradient computation and the solution update steps of the steepest descent method are continued until the gradient vector attains a magnitude less than a predefined threshold. During this iterative process, it may happen that the current value of Y is on one or more of the boundaries of the feasible region in which the solution is located. These boundaries are defined by the constraints on the Yi's for i = 1, ... , n as discussed earlier. The fact that the current value of Y lies on one or more of the boundaries of the feasible region and downhill gradient vector is determined to point outside the feasible region would normally force the value of s to be equal to zero and hence stop the iterative procedure even though the gradient vector has not vanished.

The above difficulty can be overcome by projecting the downhill gradient vector U onto the feasible region. Note that all the boundary conditions on the Yi's for i = 1, ... , n are hyperplanes. Suppose that U is directed outside the feasible region, and we have k hyperplanes (corresponding to k boundary conditions) that force s to be equal to zero. In this situation one needs to project U onto the intersection of these hyperplanes. Let 1, ... , k be the normal vectors to these hyperplanes. If the boundary conditions are not redundant, then {1, ... , k} constitutes a linearly independent set of vectors. The set {1, ... , k} can be transformed into a mutually orthonormal set of vectors given by {1, ... , 'k} using the Gram-Schmidt orthonormalization procedure (DORNY 1980 Down). The projected vector Uproj is then given by:

(44)

The minimization procedure then proceeds along Uproj instead of U. In the limiting case when k = n, the minimization procedure has reached an extremal vertex of the admissible region and Uproj = 0. In this case, the extremal vertex is the desired minimum within the admissible region. Thus, the minimization procedure is halted when U vanishes or when an extremal vertex is reached (i.e., Uproj vanishes) depending on which situation is encountered first.

Computation of : Determining the optimal clone ordering entails a combinatorial search through the discrete space of all possible permutations of {1, ... , n}. The problem of coming up with such an optimal ordering is isomorphic to the classical nondeterministic polynomial-complete optimal linear arrangement (OLA) problem for which no polynomial-time algorithm for determining the optimal solution is known (GAREY and JOHNSON 1979 Down). Heuristic search algorithms that are capable of arriving at near-optimal solutions in polynomial time on average are therefore desirable. However, deterministic hill-climbing (i.e., "local" or "greedy") search algorithms have a tendency to get trapped in a local optimum that may be far from a desirable global optimum. An attractive alternative is to use a stochastic hill-climbing search algorithm such as simulated annealing (SA; KIRKPATRICK et al. 1983 Down; GEMAN and GEMAN 1984 Down) or microcanonical annealing (MCA; CREUTZ 1983 Down), both of which are known to be robust in the presence of local optima in the solution space.

A single iteration of a stochastic hill-climbing search algorithm consists of three phases: (i) perturb, (ii) evaluate, and (iii) decide. In the perturb phase, the current solution xi to a multivariate objective function E(x), which is to be minimized, is systematically perturbed to yield another candidate solution xj. Here, the probe ordering is systematically perturbed by reversing the ordering within a block of probes where the endpoints of the block are chosen at random. In the evaluate phase, E(xj) is computed. Here, the function f({Pi},Y{Pi}) (Equation 27) is computed. In the decide phase, xj is accepted and replaces xi probabilistically, using a stochastic decision function. The stochastic decision function is annealed in a manner such that the search process resembles a random search in the earlier stages and a greedy local search or a deterministic hill-climbing search in the latter stages. The major difference between simulated annealing and microcanonical annealing arises from the difference in the stochastic decision function used in the decision phase. Their common feature is that, starting from an initial solution, they generate, in the limit, an ergodic Markov chain of solution states that asymptotically converges to a stationary Boltzmann distribution (AARTS and KORST 1989 Down). The Boltzmann distribution asymptotically converges to a globally optimal solution when subjected to the annealing process (GEMAN and GEMAN 1984 Down).

Simulated annealing: In the decide phase of SA, the new candidate solution xj is accepted with probability p, which is computed using the Metropolis function (METROPOLIS et al. 1953 Down)

(45)

or, using the Boltzmann function B(T),

(46)

(AARTS and KORST 1989 Down) at a given value of temperature T, whereas xi is retained with probability (1 - p).

The Metropolis function and the Boltzmann function give SA the capability of climbing out of local minima. Several iterations of SA are carried out for a given value of T and are referred to as an annealing step. The value of T is systematically reduced using an annealing function. As can be seen from Equation 45 and Equation 46, at sufficiently high temperatures SA resembles a completely random search, whereas at lower temperatures it acquires the characteristics of a deterministic hill-climbing (i.e., local or greedy) search.

SA generates an asymptotically ergodic (and hence stationary) Markov chain of solution states at a given temperature using Equation 45 and Equation 46. Logarithmic annealing schedules of the form Tk = k for some value of R > 0 have been shown to be asymptotically good (GEMAN and GEMAN 1984 Down); i.e., they ensure asymptotic convergence to a global minimum with unit probability in the limit k -> {infty} (GEMAN and GEMAN 1984 Down). The convergence criterion used here was the fact that the value of the objective function had not changed for the past k annealing steps. A geometric annealing schedule of the form Tk+1 = {alpha}Tk was used where {alpha} = 0.9. Although the geometric annealing schedule does not strictly ensure asymptotic convergence to a global optimum as does the logarithmic annealing schedule, it is much faster and has been found to yield very good solutions in practice (ROMEO and SANGIOVANNI-VINCENTELLI 1991 Down).

Microcanonical annealing: MCA models a physical system whose total energy, i.e., the sum of kinetic energy and potential energy, is always conserved. The potential energy of the system is the multivariate objective function E(x) to be minimized, whereas the kinetic energy Ek > 0 is represented by a demon or a collection of demons. In the latter case, the total kinetic energy is the sum of all the demon energies. The demon energy (or energies) serve(s) to provide the system with an extra degree (or degrees) of freedom thus enabling MCA to escape from local minima.

In the decide phase of MCA, if E(xj) < E(xi), then xj is accepted as the new solution. If E(xj) >= E(xi), then xj is accepted as the new solution only if Ek >= E(xj) - E(xi). If E(xj) >= E(xi) and Ek < E(xj) - E(xi) then the current solution xi is retained. In the event that xj is accepted as the new solution, the kinetic energy demon is updated to En+1k = Enk + [E(xi) - E(xj)] to ensure the conservation of the total energy. The kinetic energy parameter Ek is annealed in a manner similar to the temperature parameter T in SA. MCA can also be shown to converge to a global minimum with unit probability given a logarithmic annealing schedule (BHANOT et al. 1984 Down).

In the context of probe ordering, a kinetic energy demon is assigned to each distinct pair of probes. A square symmetric matrix Ek of size n x n, where n is the number of probes, is used to store the demon energy for each probe pair. Thus, an entry Ek(i, j) refers to the kinetic energy of the demon associated with the ith and jth probe. As in the case of SA, the convergence criterion used was the fact that the value of the objective function had not changed for the past k annealing steps. A geometric annealing schedule of the form Em+1k(i, j) = {alpha}Emk(i, j) was used where {alpha} = 0.9.

In spite of the robustness of SA and MCA to the presence of local minima in the solution landscape, the annealing schedule needed for asymptotic convergence is computationally intensive. This provides the motivation for the parallel computation of the maximum-likelihood estimator.


*  PARALLEL COMPUTATION OF THE MAXIMUM-LIKELIHOOD ESTIMATOR
*TOP
*ABSTRACT
*The physical mapping protocol
*MATERIALS AND METHODS
*MATHEMATICAL FORMULATION OF THE...
*PARALLEL COMPUTATION OF THE...
*EXPERIMENTAL RESULTS
*CONCLUSIONS AND DISCUSSION
*LITERATURE CITED

Computation of the maximum-likelihood estimator entails two levels of parallelism corresponding to the two stages of optimization discussed in the previous section:

  • Level 1: Parallel computation of the optimal interprobe spacing Y{Pi} for a given probe ordering {Pi} (Equation 27). This entails parallelization of the gradient descent search procedure for constrained optimization in the continuous domain.

  • Level 2: Parallel computation of the optimal probe ordering (Equation 28). This entails parallelization of the stochastic hill-climbing search procedure (SA or MCA) for optimization in the discrete domain.

Both levels of parallel computation were implemented on PVM (SUNDERAM 1990 Down), which is a software environment designed to exploit parallel/distributed computing across a variety of computing platforms. PVM is based on a distributed-memory message-passing paradigm of parallel computing. The interested reader is referred to GEIST et al. 1994 Down for a more detailed description of PVM.

Parallel stochastic hill-climbing search:
Parallelization of annealing algorithms has been attempted by several researchers especially in the area of computer-aided design, image processing, and operations research (GREENING 1990 Down; AZENCOTT 1992 Down). Parallelization strategies for the SA and MCA algorithms can be categorized as follows:

  1. Functional parallelism within a move where the task of evaluating each move is decomposed into subtasks that are performed in parallel by multiple processors (WONG and FIEBRICH 1987 Down).

  2. Control parallelism with multiple active iterations with processors engaged in speculative computation (WITTE et al. 1991 Down).

  3. Control parallelism with parallel Markov chain generation using a systolic algorithm (AARTS et al. 1986 Down; GREENING 1990 Down; KIM and KIM 1990 Down; AZENCOTT 1992 Down).

  4. Control parallelism with multiple searches of the solution space where the searches could be interacting, in which case the processors exchange data periodically, or noninteracting in which case the processors proceed independently of each other (AZENCOTT 1992 Down; LEE 1995 Down).

  5. Data parallelism with the state variables in a multivariate solution vector distributed among the individual processors in the multiprocessor architecture accompanied by either (1) parallel evaluation of multiple moves with acceptance of a single move (CASOTTO et al. 1987 Down), or (2) parallel evaluation of multiple moves with acceptance of multiple noninteracting moves (JAYARAMAN and RUTENBAR 1987 Down), or parallel evaluation and acceptance of multiple moves (BANERJEE et al. 1990 Down).

The authors' earlier work in chromosome reconstruction involved ordering clones, i.e., rows of the hybridization matrix H as shown in Table 1 for physical map generation (CUTICCHIA et al. 1992 Down, CUTICCHIA et al. 1993 Down). The clones were ordered on the basis of minimization of the total pairwise Hamming distance between the binary hybridization signatures of successive clones in a given permutation. This clone ordering problem was also shown to be isomorphic to the optimal linear arrangement problem (GAREY and JOHNSON 1979 Down), and SA and MCA were used to arrive at the optimal clone ordering, which was represented by a global minimum of the total pairwise Hamming distance objective function.The authors' efforts at parallelizing SA and MCA in the context of the aforementioned clone ordering problem showed control parallelism based on multiple searches to be the most promising for implementation on a distributed memory multiprocessor platform such as PVM (BHANDARKAR et al. 1996A Down, BHANDARKAR et al. 1996B Down, BHANDARKAR et al. 1998 Down; BHANDARKAR 1997 Down; BHANDARKAR and MACHAKA 1997 Down). Since a candidate solution in the serial SA or MCA algorithm can be considered to be an element of an asymptotically ergodic first-order Markov chain of solution states, two models of parallel SA (PSA) and parallel MCA (PMCA) algorithms were formulated and implemented based on the distribution of the Markov chain of solution states on a workstation cluster running PVM. These models incorporate the parallelization strategies discussed under item (D) above and are described below:

  • The noninteracting local Markov chain (NILM) PSA and PMCA algorithms.

  • The periodically interacting local Markov chain (PILM) PSA and PMCA algorithms.

In the NILM PSA and NILM PMCA algorithms, each processor within the PVM system runs an independent version of the serial SA or MCA algorithm. In essence, there are as many Markov chains of solution states as there are physical processors within the PVM system. Each Markov chain is local to a given processor and at any instant each processor maintains a candidate solution that is an element of its local Markov chain of solution states. The serial SA and MCA algorithms run asynchronously on each processor; i.e., at each temperature value or kinetic energy value each processor iterates through the perturb-evaluate-accept cycle concurrently (but asynchronously) with all the other processors.

The perturbation function uses a parallel random number generator to generate the Markov chains of solution states. By assigning a distinct seed to each processor at the start of execution, it is ensured that each processor contains a Markov chain of solution states that is independent from those in most or all of the other processors. The evaluation function and the decision function are executed concurrently on the solution state within each processor. On termination of the annealing processes on all the processors, the best solution is selected from among all the solutions available on the individual processors. The NILM model is essentially that of multiple independent (i.e., noninteracting) searches as described in item (D) above.

The PILM PSA and PILM PMCA algorithms are similar to their NILM counterparts except for one major difference. Just before the parameter T or Ek is updated using the annealing function, the best candidate solution from among those in all the processors is selected and duplicated on all the other processors. The goal of this synchronization procedure is to focus the search in the more promising regions of the solution space. This suggests that the PILM PSA or PILM PMCA algorithm should be potentially superior to its NILM counterpart. It should be noted that in the case of all four algorithms, PILM PSA, PILM PMCA, NILM PSA, and NILM PMCA, a single Markov chain of solution states is generated entirely within a single processor. The PILM model is essentially that of multiple periodically interacting searches as described in item (D) above.

In the case of all the four algorithms, NILM PSA, NILM PMCA, PILM PSA, and PILM PMCA, a master process is used as the overall controlling process. The master process runs on one of the processors within the PVM system. The master process spawns child processes on each of the other processors within the PVM system, broadcasts the data subsets needed by each child process, collects the final results from each of the child processes, and terminates the child processes. The master process, in addition to the above-mentioned functions for task initiation, task coordination, and task termination, also runs its own version of the SA or MCA algorithm just as does any of its child processes.

In the case of the PILM PSA and the PILM PMCA algorithms, before the parameter T or Ek is updated, the master process collects the results from each child process along with its own result, broadcasts the best result to all the child processes, and also replaces its own result with the best result. The master process updates its temperature or kinetic energy value using the annealing schedule and proceeds with its local version of the SA or MCA algorithm. On convergence, the master process collects the final results from each of the child processes along with its own, selects the best result as the final solution, and terminates the child processes.

Each of the child processes in the PILM PSA and the PILM PMCA algorithms receives the initial parameters from the master process and runs its local version of the SA or MCA algorithm. At the end of each annealing step, each child process conveys its result to the master process, receives the best result thus far from the master process, and replaces its result with the best result thus far before proceeding with the next annealing step. On convergence, each child process conveys its result to the master process. The master process and child process for the PILM PSA and the PILM PMCA algorithms are depicted in Fig 5 and Fig 6, respectively.



View larger version (36K):
In this window
In a new window
Download PPT slide