Genetic Diversity and Population History of Golden Monkeys (Rhinopithecus roxellana)
Haipeng Li, Shi-Jie Meng, Zheng-Ming Men, Yun-Xin Fu, Ya-Ping Zhang


Golden monkey (Rhinopithecus roxellana), namely the snub-nosed monkey, is a well-known endangered primate, which distributes only in the central part of mainland China. As an effort to understand the current genetic status as well as population history of this species, we collected a sample of 32 individuals from four different regions, which cover the major habitat of this species. Forty-four allozyme loci were surveyed in our study by allozyme electrophoresis, none of which was found to be polymorphic. The void of polymorphism compared with that of other nonhuman primates is surprising particularly considering that the current population size is many times larger than that of some other endangered species. Since many independent loci are surveyed in this study, the most plausible explanation for our observation is that the population has experienced a recent bottleneck. We used a coalescent approach to explore various scenarios of population bottleneck and concluded that the most recent bottleneck could have happened within the last 15,000 years. Moreover, the proposed simulation approach could be useful to researchers who need to analyze the non- or low-polymorphism data.

THE genus Rhinopithecus, which consists of four species, is found only in Asia. One member species, Rhinopithecus roxellana, is widely known as golden monkey or snub-nosed monkey for its shining golden coat and funny snub nose. Golden monkey is found only in the central part of mainland China (Figure 1) and because of its distinct appearance and rarity, the golden monkey is a national icon albeit less famous than the giant panda (Ailuropoda melanoleuca). Similar to the situation of giant panda, fragmented and deteriorating habitat has severely threatened the very existence of golden monkey. R. avunculus, another member species, distributes only in northern Vietnam; R. bieti and R. brelichi, the remaining two member species, are found in southwestern China. Current population census sizes of R. roxellana, R. bieti, R. brelichi, and R. avunculus are ∼10,000–20,000, 1000–1500, 500–600, and 100–200, respectively (Ma and Wang 1988; Canh 1995; Wanget al. 1998b). Because of the small population size, all these four species are formally recognized as vulnerable or endangered species by The World Conservation Union (2000 IUCN Red List of Threatened Species, and have drawn a lot of attention in various conservation efforts.

Genetic diversity of a population conveys valuable information about the health status of the population as well as its demographic history. However, one common problem in genetic studies of endangered species is the scarcity of samples. After 2 years of sample collection, we were able to collect blood samples from 32 individuals. One of the purposes of this article is to report our genetic study of this sample as part of our continuous effort to understand the current genetic status as well as population history of this species. Prior studies based on fossil record and the high prevalence of dental agenesis have suggested that the golden monkey experienced a late Pleistocene bottleneck (Jablonski 1992); however, a more quantitative study of the population history has not been made because of lack of proper data. This study offers an opportunity to examine various issues on the recent history of this species, including when the bottleneck occurred and how small the population size could be at bottleneck.

The bottleneck effect, including the loss of genetic diversity due to the rapid loss of rare alleles; the increased additive genetic variance caused by dominance, epistasis, or both; and variances of evolutionary rates among lineages, has been investigated by many authors (e.g., Maruyama and Fuerst 1985; Livshits and Nei 1990; Araki and Tachida 1997; Wanget al. 1998a). Meanwhile, a number of methods have been suggested for detecting the bottleneck or population expansion and estimating population parameters (Slatkin and Hudson 1991; Saccheriet al. 1999). Typically, polymorphism in a sample is needed to perform those tests and estimate population parameters. Therefore, all these methods are not applicable to the data that have no polymorphism, which we found in an endangered primate, R. roxellana. We use coalescent theory and simulation to explore answers to various questions of the population history, and our analysis as well as simulation approach should also be useful to researchers who need to analyze similar data.

Figure 1.

—The current range of R. roxellana is shown by enclosed vertical lines (revised from Ma and Wang 1988; Wanget al. 1998b). The sampling locations for all samples used in this study are shown by arrowheads. The sample size from each location is shown in parentheses.


Blood samples of 32 R. roxellana were obtained from wild living individuals who originally came from four regions (Figure 1). Those regions cover the major habitat of this species. The collection method followed Su et al. (1994). Common medical procedure was used to draw blood from the animals. To minimize the effect to the blood donors, the process was done in quiet environments that calmed the monkeys. Thirteen individuals coming from Qinling Mountain were collected at the same time, but it is not clear if they belonged to one or different social groups. Sixteen monkeys from Minshan Mountain were obtained in two collections. The sex statuses of most of the individuals in the sample were not recorded. After blood samples were collected, seven buffer systems (revised from Suet al. 1994) were used in starch gel electrophoresis. They were (I) Tris-citrate, pH 7.0; (II) Tris-borate-EDTA, pH 8.6; (III) Tris-borate-Na2EDTA, pH 8.0; (IV) Tris-boratecitrate-LiOH, pH 8.0; (V) borate-NaOH, pH 8.6; (VI) Trisborate-citrate-NaOH, pH 7.8; and (VII) Tris-citrate, pH 8.7. Forty-four genetic loci were investigated here, and most of them were used in Su et al. (1994) for the study of R. bieti. The loci and their buffer systems are listed in Table 1. Staining methods followed Pasture et al. (1990). Moreover, we did not investigate ALB, AKP, CP, ES-1, HAP, LAP, Pα2, and TF loci in 9 individuals because their plasma was not available. Of the 9 individuals, 7 were from Minshan Mountain and 2 from Qinling Mountain. For the sake of quality control and comparison, we also collected 25 blood samples of rhesus macaques (Macaca mulatta) and conducted the same experimental analysis.

The genetic diversity is measured by two parameters, mean heterozygosity (H) and the percentage of polymorphic loci. We note that h=1Σi=1mpi2 , where h is the heterozygosity at one locus and pi is the frequency of the ith allele. Then H is the mean value of h over all surveyed loci (Pastureet al. 1990).

Coalescent theory and simulation is the main approach we used to analyze the data, and it is most convenient to discuss them together with results of the experiment.


In this study, the sample consisted of 32 individuals, most of whom were surveyed over 44 loci. We failed to get a numerically large sample because sample collection for an endangered wild primate is extremely difficult. Meanwhile, the samples could fail to represent the genuine genetic diversity of whole species when all of them come from one social group or single regional population. So we made our best effort to cover the major habitat of R. roxellana during the collection of samples.

To our surprise, none of these loci was polymorphic in the sample. Since many of the same loci were found polymorphic in our controls, samples of rhesus macaques (H. Li, unpublished results), we are confident of the accuracy of our experiment. Therefore, the mean heterozygosity and the percentage of polymorphic loci are zero. This value indicates that R. roxellana has a rather invariant gene pool, similar to the giant panda (A. melanoleuca; Suet al. 1994; Zhanget al. 1997), cheetah (Acinonyx jubatus; O'Brienet al. 1983), and northern elephant seal (Mirounga angustirostris; Bonnell and Selander 1974). Most surveyed loci in this study were used to survey the genetic diversity of giant panda (33/40 = 83%; Suet al. 1994), and the observed genetic diversity of R. roxellana is even a little lower than that of giant panda (H = 0.008). Furthermore, it is shown that R. roxellana's genetic diversity was the poorest by comparing the mean heterozygosity among nonhuman primates (Figure 2). Its genetic diversity is even lower than R. bieti's although it has a relatively large population since the population size of R. roxellana is ∼10–20 times that of R. bieti (Ma and Wang 1988; Wanget al. 1998b). The allozyme frequency data in these nonhuman primates are comparable due to the number of surveyed loci in each study being >20.

View this table:

The list of detected enzyme loci and their buffer systems and abbreviations

One could construct a complex scenario, by combining several different types of selection to explain R. roxellana's genetic purity, for example, the background selection and the selective sweep at linked loci (Maynard-Smith and Haigh 1974; Charlesworthet al. 1995). However, statistical studies and experiments inferred that the majority of allozyme polymorphism data were consistent with the neutral hypothesis (Paulet al. 1977; McCommas and Bryant 1990; Woodwarket al. 1992), and a quantitative test of the neutral theory, which used pooled information covering many species of the animal kingdom, provided further support for the neutral hypothesis (Skibinskiet al. 1993). The neutral hypothesis assumes that the majority of mutations that have contributed significantly to the genetic diversity in natural populations are neutral or nearly neutral (Kimura 1989, 1991). To date, no study has demonstrated that loss of genetic diversity is caused by natural selection in wild primate populations. Importantly, the selective events affect distinct regions of the genome to a varying extent, so it is an unlikely cause for a complete lack of polymorphism in all 44 loci. Moreover, Frankham (1996) suggested the negative relationship between genetic diversity and body size; however, the body size of R. roxellana is much smaller than that of Gorilla gorilla while the genetic diversity of the former is far less than that of the latter (Figure 2), so Frankham's hypothesis could not explain the observed low genetic diversity very well in this case. In principle, the genetic purity of R. roxellana could also arise from a low mutation rate, but the phylogenetic study of langurs suggests that there is no significant rate difference in the mitochondrial genome among Rhinopithecus and other monkeys (Pygathrix, Semnopithecus, and Macaca; Zhang and Shi 1998). Therefore, the assumption of low mutation rate is not supported by other independent data. Furthermore, we argue that the low polymorphism of R. bieti (Figure 2) could be caused mostly by the small effective population size (census size 1000–1500) instead of the low mutation rate.

Figure 2.

—The comparison in mean heterozygosity among 14 nonhuman primates (data are from Nevo 1984; Smithet al. 1978; Suet al. 1994). The number of genetic loci detected is >20.

Given the fact that there is no variation in all of the 44 loci and that the current census size of the species is relatively large (∼15,000), a bottleneck appears to be the most plausible cause of the observation. Since a prior study also indicated the existence of a bottleneck, we focus on and investigate issues related to a bottleneck. We use a coalescent approach to investigate the effects of various bottleneck scenarios on the likelihood of the data. The coalescent approach is highly efficient for simulating population samples and has been widely used for analyzing DNA polymorphism (see Li and Fu 1998; Fu and Li 1999; Kingman 2000 for review). There are a number of models of population history of which the constant population and exponential growth models are considered in this study. The former is a standard model and has been treated extensively (for example, see Fu and Li 1999) and thus is not discussed here in detail. Instead, we discuss the exponential model.

We consider a two-phase model of exponential growth. The first phase assumes that the population size is constant, and the second phase assumes that the size change is exponential. Let N0 be the effective population size for the first phase, and then the effective population size for the second phase is Nt=N0eαt, (1) where α is the growth rate and Nt is the effective population size at time t since the start of exponential growth. Assume that a sample is taken at time T, NT is then the effective population size at the time of sampling. The coalescent theory for simulating coalescent time under exponential growth was developed by Griffiths and Tavaré (1994), and only a minor modification is needed to study our two-phase model. The model can be characterized by three parameters, r = N0/NT, T, and θ= 4NTμ, where μ is the mutation rate per locus per generation, and T is scaled so that one unit represents 4NT generations (Griffiths and Tavaré 1994). Note that when external information is available for μ, NT can be determined.

We are interested in the probability of observing the pattern of polymorphism in our data under various demographic histories; that is, the likelihood of our data given the values of parameters. Let Pi be the probability of no variation at locus i in a sample of 32 individuals (64 chromosomes), and assume that loci are independent (or in linkage equilibrium); then the likelihood of our data is L=i=144Pi. (2) The key to our analysis is to obtain the likelihood L, which requires evaluation of Pi. Analytically, Pi can be expressed as Pi=kelkθip(k), (3) where p(k) is the probability of history k (a genealogy of the sample) and θi = 4NTμi, μi is the mutation rate of the ith locus, and lk is the total length of the genealogy of 64 sequences for the locus (Figure 3). The summation in the above equation is taken over all possible histories of 64 chromosomes. Under the assumption of constant effective population size, Pi can be computed analytically. Since most scenarios we considered are with varying population size, we choose to compute Pi by estimation. We use coalescent simulation to generate a large number of samples for estimating the value of L. This analysis allows us to answer quantitatively a number of questions on the history of the population. Pi can be estimated from simulated samples as follows. Suppose M is the number of histories simulated. Then Pi can be estimated as P^i=1Mk=1Melkθi. (4) The accuracy of such estimate can be controlled by the replicate number M. Then the total likelihood can be estimated by L^=i=144P^i. (5) Since loci are assumed to be independent, i can be estimated separately. One advantage of this approach is that sample sizes for different loci can be different. We checked the estimation against analytical values in the case of constant population size and found that they agreed well, since not every individual was typed at all loci due to incomplete plasma samples, which resulted in slightly different sample sizes over loci. The approach can easily handle such a situation.

Figure 3.

—An illustration of simulation for five sequences. The numbers are the lengths of branches, which are scaled by 4N. The tree is created according to the proper coalescent process. In this case, the total length of the tree (l) is 2.78. The expected number of mutations on the tree is 2.78θ, and the probability that no mutation appears is e–2.78θ.

Figure 4.

—The likelihood values with different θ under two situations, when there is no rate variation among loci and when the mutation rate varies among loci (the gamma distribution). Only one curve is shown because the likelihood values are the same under two situations when the value of θ is fixed.

Mutation rates usually vary among loci (Skibinskiet al. 1993), and the gamma distribution has been widely used to describe the variation of amino acid and DNA substitution rates (e.g., Tajima 1996; Nei and Kumar 2000). Here, we assume that θ follows the gamma distribution f(θ)=βαΓ(α)eβθθα1, (6) where α is the shape parameter of the gamma distribution equal to [E(θ)]2/Var(θ), β=α/E(θ), and E(θ) is the expectation of θ. Zhang and Gu (1998) estimated the variation of mutation rate among sites and α varies from 0.2 to 3.5; therefore, α= 2.0 is used in the simulations in addition to the model that assumes no rate variation among loci.

The coalescent simulation shows that the likelihood L will be at least 5% only when θ (= 4NTμ) is not >0.015 under the constant population size and no rate variation among loci. Moreover, the same conclusion for θ is found when E(θ) = 0–0.015 under the constant population size and when the rate varies among loci according to the gamma distribution (Figure 4). The overall range of likelihood values is very similar between models with or without rate variation (Figure 4). That is, the likelihood value is much more sensitive to θ than to α.

The range of NT could be estimated when mutation rate μ is known. Nei (1977) estimated that the mutation rate is 2.3 × 10–6/locus/generation, which means that in this situation the effective population size is ∼1630. This value is far smaller than the census size (∼15,000). Considering the fact that this species must have had a larger population size sometime before being designated as an endangered species, 1630 as a recent effective population size is likely an underestimate, which suggests that the constant population model is not the most appropriate. Moreover, we can see that even when the rate is one-half of 2.3 × 10–6, the effective population size Nt is only 3260.

Figure 5.

—The contour plot of likelihood that is equal to 0.05. The plot is based on the result from the exponential population model, assuming the mutation rate varies among loci (the gamma distribution). The x-axis is the generation time after the bottleneck (t), the y-axis is the ratio of (r = N0/NT), and the z-axis is the value of E(θ), where NT is the current effective population size and N0 is the effective population size during the bottleneck, and t is scaled by 4NT generations.

Alternatively, the observed genetic purity could be explained by a recent population bottleneck. Several methods for detecting the existence of a bottleneck have been proposed, which are based on the fluctuation of gene frequencies, the heterozygosity excess, the star-like gene tree, the mismatch distribution (Slatkin and Hudson 1991; Cornuet and Luikart 1996; Beaumont 1999; Schneider and Excoffier 1999), or the instantaneous bottleneck assumption (Galtieret al. 2000). However, all these methods require having polymorphism in the sample. Therefore, they are not directly applicable in our situation.

The exponential growth model we consider here assumes an exponential growth after the bottleneck. Figure 5 plots the value of r, T, and E(θ) with the likelihood equal to 0.05. It is obvious that the probability is highest when the population went through a severe bottleneck and then expanded rapidly because the likelihood value with a small r and small T is higher than the likelihood value with a large r and large T. This is because after the recent bottleneck, the population has not had enough time to accumulate enough variations, and even the population size increased to a large number. Note that the exponential growth model becomes the constant size model when the value of r(N0/NT) = 1. Therefore, it includes the latter as a special value. The trend could be seen in Figure 5. The value of E(θ), the likelihood value of which is equal to 0.05, will be close to 0.015 when the value of r and T increases.

Since the current census size of golden monkey is ∼15,000, NT = 5000 is not unreasonable. Assume that the mutation rate is 2.3 × 10–6/locus/generation (Nei 1977), then θ= 0.046. With θ= 0.046, the likelihood of observing the data is exceedingly small (<0.1%) under the constant population size model. Even assuming NT = 2,500, the probability is still <1%. In comparison, the likelihood is ∼12% when r = T = 0.05 (Table 2) under the exponential growth model. Therefore, the bottleneck hypothesis is more reasonable than the constant population size hypothesis, and the conclusion is not sensitive to rate variation among loci.

View this table:

The likelihood values with different r(N0/NT) and T under the exponential growth model, and E(θ) = 0.046

Table 2 gives the range of r and T that can result in at least a 5% probability of observing our data. We can see that when θ= 0.046, r is between 0 and 0.2, and T between 0 and 0.15. Assume that the generation time of golden monkey is ∼5 years (its sex maturation time is 4–6 years). Then the bottleneck would likely have happened within the last 15,000 years during which the effective population size (N0) would not be >1000 and would likely be much smaller.


The previous studies suggested that a bottleneck might have happened to this species during the late Pleistocene (1.8 million to 11,000 years ago; Jablonski and Pan 1988; Jablonski 1992). The conclusion was based mainly on the following reasons. First, climatic deterioration in central and southwestern China was brought about by the elevation of the Tibetan Plateau and by the effects of episodic glaciations, and extinctions of mammalian species, including species of Rhinopithecus in China, were widespread during the late Pleistocene (Jablonski and Pan 1988; Jablonski 1992). Second, a high prevalence of dental agenesis, which has been associated with small population effects and genetic isolation, was observed in golden monkey, and Jablonski (1992) argued that the phenomenon could be explained by the late Pleistocene bottleneck. However, a quantitative analysis of when the bottleneck occurred has not been attempted before this study. This study reveals a very recent bottleneck event in which the species lost much of its genetic diversity. Our conclusion strengthens the previous qualitative one. It should be pointed out that although we conclude that a recent bottleneck has occurred to golden monkey, the conclusion does not exclude the possibility that the golden monkey might have had multiple bottlenecks within the past 1.8 million years.

The genetic diversity in natural populations is considered the raw material of evolution, and loss of genetic diversity might significantly decrease the ability of wild populations to survive climatic extremes (Frankham 1995; Saccheriet al. 1998). The low levels of genetic diversity that have been observed in many endangered animal species were suggested as one of the critical reasons leading to endangered or extinction status, for example, the Yunnan golden monkey (Su and Shi 1995), giant panda (Suet al. 1994; Zhanget al. 1997), and cheetah (O'Brienet al. 1983). Considering the extremely low genetic diversity that we discovered in the golden monkey, we suggest that, despite its relatively large population size, conservation efforts need to be continued, particularly those that encourage the creation of genetic diversity. It is fortunate that the golden monkey has a much lower newborn death rate and is more successful at breeding in captivity than the giant panda (Hu 1990). The higher fecundity is probably among the factors that have led to the recovery of this species after the recent severe bottleneck.


We are grateful to Oliver Ryder for his valuable comments on an earlier version of the article. We thank Bing Su, Wen Wang, Aihua Liu, Ruiqing Liu, Yongcheng Long, Bo Ding, Shikang Gou, Xiufan Shi, Jing Luo, Yuanchun Ding, Jingong Xiangyu, Xiangzhong Zheng, Yunwu Zhang, and Shiying Ling for their help. This work was supported by the Chinese Academy of Sciences (KSCX2-1-05), the State Key Basic Research and Development Plan of China (G2000046806), the National Nature Science Foundation of China, the Science Foundation of Yunnan Province in China, and National Institutes of Health grant R01 GM50426 (Yun-Xin Fu).


  • Communicating editor: W. Stephan

  • Received May 31, 2002.
  • Accepted January 27, 2003.


View Abstract