High Polymorphism at the Human Melanocortin 1 Receptor Locus
Brinda K. Rana, David Hewett-Emmett, Li Jin, Benny H.-J. Chang, Naymkhishing Sambuughin, Marie Lin, Scott Watkins, Michael Bamshad, Lynn B. Jorde, Michele Ramsay, Trefor Jenkins, Wen-Hsiung Li

Abstract

Variation in human skin/hair pigmentation is due to varied amounts of eumelanin (brown/black melanins) and phaeomelanin (red/yellow melanins) produced by the melanocytes. The melanocortin 1 receptor (MC1R) is a regulator of eu- and phaeomelanin production in the melanocytes, and MC1R mutations causing coat color changes are known in many mammals. We have sequenced the MC1R gene in 121 individuals sampled from world populations with an emphasis on Asian populations. We found variation at five nonsynonymous sites (resulting in the variants Arg67Gln, Asp84Glu, Val92Met, Arg151Cys, and Arg163Gln), but at only one synonymous site (A942G). Interestingly, the human consensus protein sequence is observed in all 25 African individuals studied, but at lower frequencies in the other populations examined, especially in East and Southeast Asians. The Arg163Gln variant is absent in the Africans studied, almost absent in Europeans, and at a low frequency (7%) in Indians, but is at an exceptionally high frequency (70%) in East and Southeast Asians. The MC1R gene in common and pygmy chimpanzees, gorilla, orangutan, and baboon was sequenced to study the evolution of MC1R. The ancestral human MC1R sequence is identical to the human consensus protein sequence, while MC1R varies considerably among higher primates. A comparison of the rates of substitution in genes in the melanocortin receptor family indicates that MC1R has evolved the fastest. In addition, the nucleotide diversity at the MC1R locus is shown to be several times higher than the average nucleotide diversity in human populations, possibly due to diversifying selection.

Though human hair and skin pigmentation is a highly visible trait and is the primary protection against the noxious effects of ultraviolet radiation, little is known about the genetic variation responsible for the large array of pigmentation observed in human populations. Two classes of melanin, the red/yellow phaeomelanins and the black/brown eumelanins, are present in the epidermal layer of human skin and hair (Thody et al. 1991). The spectrum of human hair/skin pigmentation observed in different geographic regions of the world is the result of varied production, distribution, and packaging of these two classes of melanin. Which melanin is produced by the melanocyte depends on the activity of the rate-limiting enzyme, tyrosinase, and the availability of sulfhydryl groups, cysteine and glutathione, at the third enzymatic step of the pathway (Prota 1980; Hearing and Jimenez 1987). Melanogenesis follows the phaeomelanin pathway in the presence of sulfhydryl groups, but is otherwise shunted toward eumelanin synthesis in their absence (Hearing and King 1993). The activity of various enzymes in the eumelanin pathway such as tyrosinase-related protein-1 and -2 and the subsequent polymerization of its intermediates dictate the type of eumelanin synthesized. The melanins, packaged in melanosomes, are transported through the dendrites and secreted into the neighboring keratinocytes, which gradually move toward the surface of the skin with their melanin packets, contributing to skin pigmentation. In hair bulbs, the melanosomes are deposited in the growing hair shaft. Thus, pigmentation is a result of the size, shape, and chemical makeup of the melanin in the keratinocytes and hair.

Although the environment, primarily in the form of sun exposure, plays an obvious role in human skin pigmentation (cf., Barnicot 1977), a genetic component plays a major role in skin color differences as indicated by the significant variation in pigmentation of unexposed skin and the skin of newborns from individuals of different regions of the world. Detailed anthropological studies of skin color from European descendants, East Asians, and those of African descent revealed that color differences are attributable to differences in the number, size, composition, and distribution of the melanosomes (Szabo 1967; Szabo et al. 1969). To understand the genetic component of skin color variation in individuals from different regions of the world, population genetic models have been employed. Harrison and Owen (1964) and Stern (1970) estimated that there are at least three loci involved in the pigmentation variation observed in descendants of mixed African and European ancestry alone.

In rodents, the amounts of eumelanin and phaeomelanin synthesized are controlled primarily by two loci, extension and agouti. The extension gene is expressed in melanocytes, producing the melanocyte stimulating hormone receptor (MSHR) or melanocortin-1 receptor (MC1R). MC1R is a member of the melanocortin receptor subfamily of G-protein-coupled receptors whose action on melanogenesis is mediated through the activation of adenylyl cyclase to elevate cAMP levels in melanocytes upon binding of the proopiomelanocortin-derived peptides, α-MSH and ACTH (Mountjoy et al. 1992). On the other hand, the agouti protein functions in a paracrine fashion in the neighborhood of the melanocyte, acting as an antagonist of MC1R (Lu et al. 1994), and is the primary physiological switch from eumelanin to phaeomelanin synthesis in the mouse (Kobayashi et al. 1995). Though the human homologue of murine agouti has been cloned (Kwon et al. 1994; Wilson et al. 1995), its role in human pigmentation is unclear. The human homologue of extension, MC1R, has been cloned (Chhajlani and Wikberg 1992; Mountjoy et al. 1992; Gantz et al. 1993) and mapped to chromosome 16q24 (Magenis et al. 1994). MC1R is expressed in human melanocytes (Mountjoy et al. 1992) and has high binding affinity for α-MSH and ACTH (Mountjoy 1994), both of which induce darkening of the skin through melanogenesis. Studies have shown that binding of melanocortin peptides to MC1R on human melanocytes stimulates melanogenesis (Abdel-Malek et al. 1995; Suzuki et al. 1996).

The MC1R locus might contribute to human pigmentation variation because there is evidence that MC1R variants are associated with pigmentation variation in other mammalian species. Three dominant darkening phenotypes in mouse are due to point mutations in the coding region of extension that cause the receptor to be constitutively active in two cases, and to be hypersensitive to α-MSH in the third case. A recessive yellow phenotype of the mouse is due to a frameshift mutation in extension rendering the receptor functionless (Robbins et al. 1993). MC1R variants have also been associated with the coat colors of cattle (Klungland et al. 1995; Joerg et al. 1996), fox (Vage et al. 1997), and horse (Johansson et al. 1994).

While our study was underway, Valverde et al. (1995) reported nine variants of MC1R in red hair and fair skin humans of British ancestry. Subsequently, Cone et al. (1996) and Koppula et al. (1997) also reported two of the variants found in Valverde et al.'s (1995) study of individuals with fair skin (skin types I and II in the Fitzpatrick Clinic Scale; Jimbow et al. 1993). Recently, Box et al. (1997) reported nine additional variants in red hair/fair skin individuals. Association of genetic variation with phenotype in individuals from an isolated region can be skewed by population genetic phenomena such as founder effects. In contrast, our study was designed to randomly sample individuals from various regions of the world.

In our preliminary study we sampled six Africans, four Chinese, two Indians, and three European descendants (Rana et al. 1996). The Africans showed a surprising lack of variation, while new variants were found in the Chinese studied. For these reasons and for the reason that little data on MC1R variation existed in populations outside of Europe, we examined the African and Chinese populations further. The distribution of alleles common to East Asia was also investigated by studying Indians and individuals from several populations around China that include Southeast Asian, Japanese, Mongolian, and Yakut. In addition, we examined an American Indian population to determine if the alleles associating with the Asian population existed before the split between the American Indian and Asian populations.

In addition to human MC1R, we have cloned and sequenced the MC1R gene in gorilla, pygmy chimpanzee, common chimpanzee, orangutan, and baboon to study the evolutionary history of MC1R and shed light on the role of selection on genetic variation in human pigmentation.

SAMPLES AND METHODS

Human samples: A total of 121 unrelated individuals from different world populations were screened for variation at the human MC1R locus. For samples that were collected in our lab, at least 10 ml of whole blood was collected with informed consent. Genomic DNA was prepared by proteinase K digestion of the buffy coat followed by NaCl extraction of proteins and ethanol precipitation of DNA according to a published protocol (Miller et al. 1988).

  • European ancestry samples: Genomic DNA was extracted from whole blood of two American individuals with red hair/fair skin, one British individual with light brown hair/skin type III, one American individual with light brown hair/skin type III, one Russian with light brown hair/skin type III, and one American individual with brown hair/skin type IV. Skin was typed according to the Fitzpatrick Clinic Scale.

  • African samples: Two Mbuti Pygmy from northeast Zaire (GM10493, GM10494A), three Biaka Pygmy from Central African Republic (GM10471, GM10472A, GM10473A), two Beninese (gift from Dr. R. Deka), four South Africans (Bantu-speaking individuals), one !Kung San from Tsumkwe region (gift from Dr. M. Ruvolo, Harvard University), two Nigerians (gift from Drs. Ralf Krahe and Michael Siciliano, M. D. Anderson Cancer Center, Houston), one Malian, two Kenyans, four Gambians (gift from Dr. John Clegg, Oxford University), and four Alurs.

  • Asian samples: A total of 50 Han Chinese individuals from Taiwan, north China, and south China; 20 individuals from seven regions of India (Andhra Pradesh, Bengal, Gujarat, Maharashtra, Punjab, Tamil Nadu, and Uttar Pradesh); four Japanese; four Mongolians; two Southeast Asian individuals (one from Cambodia and one from Vietnam); and five Yakut individuals from four regions of Iakutia (or the Saha Republic of the Russian Federation).

  • American Indian: Five Karitiana individuals from Brazil.

Primate samples: Genomic DNA from baboon (Papio cynocephalus; gift from Southwest Foundation) and orangutan (Pongo pygmaeus; tissue purchased from Emory Primate Center) was extracted from whole liver tissue. Gorilla (Gorilla gorilla), common chimpanzee (Pan troglodytes), and pygmy chimpanzee (P. paniscus) genomic DNA samples were supplied for our study by Dr. Jerry Slightom of the Upjohn Co., Kalamazoo, MI.

PCR amplification and sequencing: The entire coding regions of the human and nonhuman primate MC1R loci were obtained through PCR amplification using 200 ng genomic DNA, N-terminal primer (5′-ggaagaactgtggggacctggag-3′) and C-terminal primer (5′-taaggaactgcccagggtcacac-3′) and standard concentrations of Taq DNA polymerase, MgCl2, and buffer in a total volume of 50 μl. DNA was amplified for 35 cycles (1 min at 94°, 1 min at 61°, 1 min at 72° with a 2-sec extension at each cycle) in an automated DNA thermal cycler (Perkin Elmer-Cetus, Norwalk, CT). Two microliters of the first reaction was reamplified using a set of nested primers, the N-terminal (5′-ggaggcctccaacgactccttc-3′) and C-terminal (5′-cagcacacttaaagcgcgt-3′), using the above conditions to yield a 1024-nucleotide product containing 5′ and 3′ flanking sequences.

PCR products were electrophoresed through a 1% agarose gel. An appropriate size band was cut and purified using the Prep-A-Gene DNA purification kit. Internal primers were end labeled with [γ-32P]ATP and both strands of the template were sequenced according to the Promega (Madison, WI) fmol DNA sequencing system protocol. Indian samples were sequenced using Perkin-Elmer Applied Biosystems 377 DNA sequencer with the BigDye terminator cycle sequencing protocol.

Haplotype determination: PCR products from subjects heterozygous at more than one variant site in the coding sequence of their MC1R gene were cloned into pBluescript and transformed into competent cells. At least five clones from each subject were analyzed by sequencing to determine the two haplotypes (alleles).

Sequence analysis: The Wisconsin Sequence Analysis Package Version 8 was used to estimate the number of synonymous (Ks) and nonsynonymous (Ka) substitutions per site between nucleic acid coding sequences. The program uses a variant of the method by Li (1993).

A parsimony tree indicating amino acid substitutions along the primate lineages was constructed by selecting the tree requiring the least number of substitutions as described in Li (1997) using primate sequences from this study as well as the MC1R sequences of mouse from Cone et al. (1996) and of cow, fox, and chicken obtained from GenBank (accession numbers U39469, X90844, and D78272).

RESULTS

MC1R variants identified: The entire coding sequence of the MC1R gene was determined in 121 individuals from various regions of the world and compared to published sequences (Chhajlani and Wikberg 1992; Mountjoy et al. 1992; Gantz et al. 1993; Cone et al. 1996). The human consensus sequence (H) was determined from all available sequences and is different from all four published sequences (Figure 1). Chhajlani and Wikberg (1992) reported an arginine at codon 164, whereas all samples sequenced in this study have an alanine at this position. Gantz et al. (1993) and Mountjoy et al. (1992) reported a threonine at codon 90, whereas this position shows a serine in all of our samples. Our human consensus sequence differs from that reported by Cone et al. (1996) at codon 163: arginine vs. glutamine. However, we did find a glutamine variant at codon 163, which will be discussed later. Thus, as with this codon 163 variant, differences from the consensus in the other reported sequences may represent additional alleles of human MC1R. However, because neither we nor others (Valverde et al. 1995; Box et al. 1997) have found these variants except that at codon 163, it is likely that they are either rare variants or sequencing errors.

Six variants of MC1R were found in our samples (Figure 2). First, Asp84Glu, caused by a C to A mutation at nucleotide 252, is a conservative substitution in the second transmembrane region of the receptor. Second, Arg151Cys, caused by a C to T mutation at nucleotide 451, is a change from a positively charged residue to an uncharged residue in the second intracellular loop domain. Third, Arg163Gln is a G to A mutation at nucleotide 488, creating a charged to polar residue change in the fourth transmembrane domain of the receptor. Fourth, the Arg67Gln/Arg163Gln variant is a combination of the Arg163Gln variant described above and a G to A mutation at nucleotide 200 resulting in a glutamine residue in the first intracellular loop domain instead of an arginine. Fifth, A942G results from an A to G synonymous substitution at nucleotide 942. Finally, the Val92Met allele is caused by a G to A mutation at nucleotide 274, which creates a conservative amino acid change in the second transmembrane region of the receptor. In our samples, Val92Met was always observed with the 942G variant.

Figure 1. Consensus sequence (H) of human MC1R. Reported as (*)Thr in Mountjoy et al. (1992) and Gantz et al. (1993), (+)Pro in Mountjoy et al. (1992), (#)Arg in Chhajlani and Wikberg (1992), and (o)Gln in Cone et al. (1996).

Allele frequencies: There are two prominent features of the frequencies of variants within populations as shown in Table 1: (1) The association of the Arg163Gln variant with the East and Southeast Asian populations (Chinese, Japanese, Mongolian, Cambodian, Vietnamese, and Yakut) with an average frequency of 70%, and (2) the lack of variation in the African samples. Similar to African populations, the human consensus sequence was the most frequent in the Indian population. In addition, Arg163Gln is found in the homozygous state in all five American Indians studied. This is not surprising as it is generally accepted that the Americas were populated by nomadic Siberian populations from Northeast Asia. In contrast, this variant has a low frequency (7%) in the Indian population and was neither observed in any of the samples outside of Asia in this study, nor reported by Valverde et al. (1995) in a study of 135 British or Irish Caucasians, of which 60 samples were fully sequenced for their coding region. However, this variant was observed in the heterozygous state in one individual of fair skin/blonde hair and another of red hair by Box et al. (1997) and may be partly responsible for a phaeomelanin-rich skin phenotype. The Arg67Gln/Arg163Gln variant was found in only two individuals, both in the heterozygous state, in the Chinese population, but not observed in other populations in this study or previous reports. Because this variant is at low frequency and has only been identified on the Arg163Gln background, the Arg67Gln mutation may have arisen recently and may be unique to East Asians.

In their study of MC1R variants, Valverde et al. (1995) reported nine alleles associating with red hair/fair skin individuals. Val92Met and Asp84Glu were reported to be frequent in their red hair/fair skin samples. In this study, Asp84Glu was also found in a red hair/fair skin individual, but in no other individuals (Table 1), whereas the Val92Met variant was found in 23 Chinese individuals as well as in a British fair skin/light brown hair individual. Another variant, Arg151Cys, was observed in only the red hair/fair skin individuals in this study and, like Asp84Glu, was absent from other populations. This variant was also recently reported by Box et al. (1997) in individuals with red hair.

Figure 2. Variants of human MC1R. (a) The numbers indicate the nucleotide positions in the coding region. (b) Residues changed in variants found in this study are indicated by letters. Checkered residues are the locations of dominant darkening mouse mutants. Striped residue indicates the location of a mutant associated with dominant darkening of bovine coat color.

Primate MC1R sequences: To study the evolutionary history of MC1R and evaluate the significance of these variants, the MC1R homologues in pygmy and common chimpanzees, gorilla, orangutan, and baboon were sequenced (Figure 3). According to the neutral theory of molecular evolution, functionally less important parts of a gene evolve faster than functionally more important ones. Hence, these sequences as well as the sequences for horse, fox, cow, and mouse obtained from GenBank were aligned and compared. Surprisingly, all amino acids at the five polymorphic nonsynonymous sites are conserved among these different species of mammals, except that baboon has a methionine at codon 92 instead of valine, the same as human individuals with the Val92Met variant.

TABLE 1

Frequencies of variants within populations

A parsimony tree indicating the amino acid changes along the primate lineages is shown in Figure 4. From this tree, the ancestral human MC1R protein sequence can be inferred to be identical to that of the human consensus sequence. However, at synonymous sites the ancestral nucleotide MC1R sequence is most likely the 942G variant because both chimpanzees and gorilla have a G at this site. The parsimony tree also indicates that the MC1Rs of gorilla, chimpanzees, and human have evolved at a faster rate than those of baboon and orangutan.

The primate data were also used to examine the possible effects of selection acting on MC1R. Synonymous sites usually evolve faster than nonsynonymous sites because the former have a lower chance of causing deleterious effects. Thus, if Ks is the number of substitutions per synonymous site between two coding sequences and Ka is the number of substitutions per nonsynonymous site, then the ratio Ka:Ks is usually <1. Indeed, the average ratio for 47 mammalian genes is only ∼0.20 (Li 1997). The Ka:Ks ratios for the MC1R gene were calculated for human, pygmy chimpanzee, common chimpanzee, gorilla, and orangutan and most comparisons were found to be at least two times higher than the average value for mammalian genes (Table 2). Because it is generally believed that the Ka:Ks ratio is higher in the primates, a better comparison would be to use a ratio derived from the primate lineage. Ohta (1995) computed the number of synonymous and nonsynonymous substitutions per site in the primate lineage using a total of 49 gene loci. The Ka:Ks ratio from these results is 0.27 and is lower than observed for the MC1R locus. These results indicate that MC1R is not subject to strong selective constraints. However, the high ratios might be partly due to advantageous substitutions (i.e., positive Darwinian selection), especially in the pygmy chimpanzee lineage, where Ka/Ks > 1. This possibility is strengthened by the polymorphism data considered below.

Figure 3. Protein sequence alignment of MC1R. (·) Sites identical to human consensus sequence; (–) gaps/deletions; (*) positions of human variants. Mouse sequence is from Cone et al. (1996), fox sequence is from Vage et al. (1997), cow sequence is from SWISSPROT (P47798), and sheep sequence is from EMBL (Y13965).

Figure 4. Parsimony tree indicating amino acid changes along primate lineages. Mouse, cow, and fox sequences were used as outgroups. The T-V change at codon 186 in the common ancestor of humans represents a change of two nucleotides from ACN to GTG. This with the A-T change in the common ancestor of human, chimpanzee, gorilla, and orangutan may represent three changes. An equally parsimonious alternate is a T-A change in both the baboon and orangutan lineages and a T-V change in the human lineage.

Nucleotide diversity within populations: In most genes, the observed frequency of synonymous changes is higher than that of nonsynonymous changes within a species. Among the 237 synonymous sites in the human MC1R gene, only 1 synonymous polymorphism (1/237 = 0.004) was observed. In contrast, 5 nonsynonymous polymorphisms were observed out of the 714 nonsynonymous sites (5/714 = 0.007), indicating a twofold higher degree of nonsynonymous polymorphism in this gene. Eight additional nonsynonymous polymorphisms were reported by Box et al. (1997), who had carefully examined synonymous substitutions as well and reported only one synonymous change, A to G at nucleotide 942 (identical to ours). In addition, Valverde et al. (1995) reported 6 other nonsynonymous polymorphisms that were not observed in our study or reported by Box et al. (1997). Hence, there are a total of 19 nonsynonymous polymorphisms reported here and by others, but only 1 synonymous polymorphism has been reported so far. This imbalance of nonsynonymous polymorphisms is significant at the 2.5% level. However, it is not clear whether Valverde et al. (1995) identified synonymous variants in addition to ours, so this particular analysis may be subject to sampling bias slightly lowering significance.
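The arithmetic behind this comparison can be sketched in a few lines. The example below assumes a one-tailed binomial test in which, under equal per-site rates, each polymorphism falls on a synonymous site with probability 237/(237 + 714); the text does not state which test was actually used, and the function and variable names are illustrative only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Site counts and polymorphism counts quoted in the text.
syn_sites, nonsyn_sites = 237, 714
syn_poly_this_study, nonsyn_poly_this_study = 1, 5
syn_poly_all, nonsyn_poly_all = 1, 19

# Polymorphisms per site in this study (1/237 and 5/714).
syn_density = syn_poly_this_study / syn_sites
nonsyn_density = nonsyn_poly_this_study / nonsyn_sites

# Assumed null model: a polymorphism is synonymous with probability
# proportional to the share of synonymous sites.
p_syn = syn_sites / (syn_sites + nonsyn_sites)

# One-tailed probability of seeing at most 1 synonymous polymorphism
# among the 20 polymorphisms reported across studies.
p_value = binom_cdf(syn_poly_all, syn_poly_all + nonsyn_poly_all, p_syn)

print(f"synonymous density    = {syn_density:.4f}")
print(f"nonsynonymous density = {nonsyn_density:.4f}")
print(f"one-tailed binomial P = {p_value:.4f}")
```

Under these assumptions the tail probability comes out just under 0.025, in line with the significance level quoted above.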

TABLE 2

Ka:Ks ratios (above diagonal) and Ka and Ks values (below diagonal) for higher primate MC1R genes

This high level of nonsynonymous polymorphism is in sharp contrast to the low level of polymorphism generally observed in humans. In a study by Li and Sadler (1991), the nucleotide diversities of the coding regions of 49 loci encompassing 54,193 bp were compared and the average level of nucleotide diversity per site (π) was computed from the pooled data. The π values ranged from a maximum of 0.11% for fourfold degenerate sites to a minimum of 0.03% for the nondegenerate sites. This indicated a very low nucleotide diversity in humans. We now compute the nucleotide diversity at the MC1R region, according to the equation $\pi = \frac{n}{n-1}\sum_{i,j} x_i x_j d_{ij}$, where $x_i$ is the frequency of the ith allele, $n$ is the number of sequences in the sample, and $d_{ij}$ is the number of differences between the ith and jth alleles per nucleotide site (see Li 1997). We consider only our Chinese sample, because the other samples in our study are small and samples in other studies were not selected randomly. For the fourfold degenerate sites, π = 0.21%, and for the nondegenerate sites, π = 0.14%. As these are, respectively, two and five times higher than the corresponding average values for 49 genes studied by Li and Sadler (1991), the MC1R locus is indeed unusually polymorphic.
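The nucleotide diversity estimator described above can be written as a short function. This is a minimal sketch that assumes the sum runs over all ordered pairs of alleles (so each unordered pair is counted twice and the diagonal contributes zero); the allele frequencies and pairwise distances in the example are hypothetical toy values, not the Chinese sample data, which are given in Table 1.

```python
def nucleotide_diversity(freqs, diffs, n):
    """Estimate pi = n/(n-1) * sum over i,j of x_i * x_j * d_ij.

    freqs : allele frequencies x_i (should sum to 1)
    diffs : square matrix of per-site differences d_ij between alleles
    n     : number of sequences sampled
    """
    k = len(freqs)
    total = sum(
        freqs[i] * freqs[j] * diffs[i][j]
        for i in range(k)
        for j in range(k)
    )
    return n / (n - 1) * total

# Hypothetical example: three alleles with made-up frequencies and
# made-up per-site pairwise differences.
toy_freqs = [0.5, 0.3, 0.2]
toy_diffs = [
    [0.000, 0.001, 0.002],
    [0.001, 0.000, 0.001],
    [0.002, 0.001, 0.000],
]
toy_pi = nucleotide_diversity(toy_freqs, toy_diffs, n=100)
print(f"pi = {toy_pi:.6f}")
```

As a sanity check on the convention, two sequences that differ at every compared site (two alleles at frequency 0.5, d = 1, n = 2) give π = 1 with this function.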

DISCUSSION

Rates of evolution of genes of the MC1R: To compare the rate at which MC1R has evolved with other genes, the Ka and Ks values were computed for known melanocortin receptors using human and rodent (mouse or rat) sequences from GenBank (Table 3). In each of the five receptors studied, Ka:Ks < 1. However, while the synonymous rate is within the same range for all the receptors studied, the differences in nonsynonymous rates between the receptors are greater; e.g., Ka is more than three times greater in MC1R than in MC4R. These results indicate that MC1R has evolved more rapidly than the other members of this receptor family. The greater constraints on the other receptors are likely due to their important physiological role as indicated by their expression in the adrenal cortex and central nervous system (Cone et al. 1996). MC1R would have more leeway as it is expressed only in the melanocytes and affects the type of melanins produced and, as far as it is known, has no other physiological role in which variation would be detrimental to the individual.

Biochemical significance of variants: The effect on pigmentation of the five MC1R variants observed in this study cannot be established without further association studies and functional assays. In addition, the end pigmentation pattern is most likely the result of a combination of effects of variation in and varied expression of many loci. However, their potential significance can be predicted from previous studies. The second transmembrane region is potentially important in ligand binding because two point mutations in this region of the mouse MC1R have given rise to two constitutively active receptors (Robbins et al. 1993), resulting in a dominant darkening of the coat. Further, Yang et al. (1997) have shown that mutagenesis of an acidic residue in the second transmembrane domain, specifically Glu94, significantly alters binding affinity and potency of α-MSH. Two of the variants reported here, Asp84Glu and Val92Met, are located in this region. Not only has Asp at position 84 been conserved among all the mammalian MC1R genes sequenced to date, it has also been conserved in the other four members of the melanocortin receptor family as well as in other G-protein-coupled receptors, such as the β2-adrenergic receptor. The Asp84Glu variant, here and in two more extensive studies of fair skin/red hair individuals (Valverde et al. 1995, 1996), is associated with fair skin/red hair individuals. The Val92Met allele does not seem to be associated with any particular pigmentation type or geographical region in this study or others (Valverde et al. 1995, 1996; Cone et al. 1996; Box et al. 1997) and has not been conserved among all species or between the melanocortin family of receptors. However, α-MSH binding assays on the Val92Met receptor expressed in COS-1 cells showed this variant to have an approximately five times lower potency in displacing the radiolabeled analogue of α-MSH as compared to the wild-type receptor (Xu et al. 1996).
Further studies might show that Val92Met is associated with individuals with phaeomelanin-rich skin because this variant, as frequent as it is, has not yet been reported in individuals with highly eumelanin-rich skins as in those of African ancestry.

Box et al. (1997) were first to identify the Arg151Cys variant in the red-hair/fair skin population. It was identified in both the red-hair/fair skin individuals in our study, but not in the other populations. Our preliminary studies on the function of this receptor variant showed lower response to NDP-MSH, a potent synthetic analogue of α-MSH, which is consistent with this variant associating with phaeomelanin-rich skin.

The Arg163Gln variant may also be associated with phaeomelanin-rich skin because it has been identified only in such individuals thus far. If we consider populations in this study with phaeomelanin-rich skins to be the East and Southeast Asians, Yakut, and American Indians and those with eumelanin-rich skins to be the Indians and Africans, the association of the Arg163Gln variant with the phaeomelanin populations is significant at the 0.1% level. The arginine at position 163 is conserved in the MC1R of higher primates, cow, fox, horse, and mouse, although an arginine at this position is not essential for the function of other members of the melanocortin receptor family. No mouse mutations have been reported in this region, and site-directed mutagenesis studies on the transmembrane regions of MC1R have only targeted residues close to the extracellular segment of transmembrane region four (Frandberg et al. 1994; Yang et al. 1997), while amino acid 163 lies closer to the intracellular loop region of transmembrane region four. So it is difficult to speculate on the importance of a substitution of an uncharged glutamine for a charged arginine at position 163.

TABLE 3

Substitution rates in melanocortin receptor genes

Age of the Arg163Gln variant: Because the Arg163Gln variant is found in American Indians as well as in East and Southeast Asian populations, it probably arose considerably earlier than the split between the Asian and American Indian populations, which is estimated to be between 15,000 and 35,000 years ago (Cavalli-Sforza et al. 1994a). However, because the allele appears to be absent or at a very low frequency in both Europeans and Africans, it is probably not very old, perhaps younger than the divergence among the Asians, Europeans, and Africans. It is therefore intriguing that this allele is present at unusually high frequencies in both East Asians and American Indians. To see whether an allele can reach such a high frequency in a short time without selective advantage, we calculate the mean first arrival time of the Arg163Gln variant using frequency data from the Chinese population. The mean first arrival time is the mean number of generations until the frequency of the allele reaches a specific value for the first time starting from a low initial frequency. For a neutral allele, it is given by the following equation (Li 1975): $t(p,y) = 4N_e\left\{\frac{1-y}{y}\ln(1-y) - \frac{1-p}{p}\ln(1-p)\right\}$. We use y = 0.66 (the frequency of Arg163Gln in the Chinese population) as the current frequency of Arg163Gln. The last term in the above formula approaches −1 as p (the initial frequency of the variant) approaches zero. For an effective population size, Ne, in the range of 2000 to 5000 and for 20 yr per generation, the mean first arrival time is in the range of 71,100 to 175,200 yr. Note that the latter value is already much larger than the age of modern humans (∼100,000 yr), though the Ne value of 5000 is small compared to recent Chinese population sizes. Increasing Ne proportionally increases the mean first arrival time.
As the allele is unlikely to be older than the origin of modern humans, it is quite possible that the allele has increased rapidly in frequency in East Asians by positive Darwinian selection.
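The mean first arrival time calculation can be reproduced numerically. The sketch below implements the neutral-allele expression 4Ne{[(1 − y)/y] ln(1 − y) − [(1 − p)/p] ln(1 − p)}, using the limiting value of −1 for the second term when the initial frequency p is taken as zero; the function name and the choice to treat p = 0 as the limit are assumptions made for illustration.

```python
import math

def mean_first_arrival_generations(y, ne, p=0.0):
    """Mean generations for a neutral allele to first reach frequency y,
    starting from initial frequency p, with effective population size ne.

    Requires 0 <= p < y < 1.
    """
    def term(q):
        # [(1 - q)/q] * ln(1 - q); tends to -1 as q approaches zero.
        if q == 0.0:
            return -1.0
        return (1.0 - q) / q * math.log1p(-q)

    return 4.0 * ne * (term(y) - term(p))

YEARS_PER_GENERATION = 20   # value used in the text
TARGET_FREQUENCY = 0.66     # Arg163Gln frequency in the Chinese sample

for ne in (2000, 5000):
    generations = mean_first_arrival_generations(TARGET_FREQUENCY, ne)
    years = generations * YEARS_PER_GENERATION
    print(f"Ne = {ne}: about {years:,.0f} yr")
```

With Ne = 2000 this gives roughly 71,000 yr, matching the lower bound quoted above, and the result scales linearly with Ne, so larger effective population sizes push the neutral expectation further beyond the age of modern humans.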

Low replacement variation in African samples: We anticipated finding replacement variation in MC1R among black Africans for two reasons. First, because African individuals are at one extreme of skin color, there might be differences in the MC1R sequence between Africans and other populations. The results of this study show the opposite, in which African individuals possess the human consensus MC1R sequence, which is shared by most populations examined here.

Second, as African populations have been found to be generally more polymorphic than other populations (e.g., Shriver et al. 1997) and are commonly thought to be most ancient, a higher number of alleles of MC1R would be expected to have accumulated. However, the 25 African samples from various geographic regions revealed only one synonymous variant but no nonsynonymous variant. Thus, the African population has the lowest replacement variation. In contrast, individuals from other populations such as European ancestry and Chinese show replacement variation. This may support the vitamin D hypothesis (Cavalli-Sforza et al. 1994b), which offers an explanation of how selection drove pigmentation to the variation observed today. Vitamin D is produced through photoconversion of the dietary precursor, 7-dehydrocholesterol, in the skin capillaries. Dark skin reduces the amount of UVR reaching the capillaries, which in turn reduces the amount of vitamin D formed. In regions of low sunlight, dark skin can be a disadvantage, leading to vitamin D deficiency and the childhood disease of rickets, which in more severe forms can reduce survival and reproduction. For those living in the tropics, dark skin can be an advantage as excess vitamin D may be toxic. It is possible that African populations retained their ancestral sequence to maintain a darker pigmentation for protection against both UVR and the toxic effects of vitamin D. As populations migrated to regions of lower sun exposure, the selection pressure for dark pigmentation was relaxed and mutations at MC1R that result in lighter pigmentation might be advantageous and increase in frequency. Local selection for specific alleles may explain why there is a higher variation of MC1R in non-African populations. It may also explain why there are many nonsynonymous variants but only one synonymous variant observed in this and previous studies.
It is difficult to infer selection independently through a single line of study presented here. Hence, we have taken the approach of integrating current knowledge on the physiology of skin pigmentation and the biochemistry of MC1R receptors, with our and existing population studies at MC1R. This and the contribution of the results from our evolutionary analyses, which include calculation of nucleotide diversity, a comparison of rates of substitution at MC1R with the other melanocortin receptors, and the analysis of the synonymous and nonsynonymous substitutions, all taken together indicate a role of selection in maintaining the large number of variants observed at MC1R.

Acknowledgments

We are grateful to Dr. Richard B. Clark for his invaluable discussion on G-protein-coupled receptors and to Drs. Craig Hanis and SongKun Shyue for their frequent assistance in collecting blood samples from which much of this data was obtained. We thank Will Clark for his assistance in cloning MC1R from P. troglodytes and Hongmin Sun, Jay Vivian, and Dr. Claudia Miller for their technical assistance. We thank Drs. Ira Gantz and Ying-Kui Yang for sharing with us their expertise in the melanocortin receptors without hesitation. This study was supported by National Institutes of Health grants (GM55759 and GM30998) to W.-H. Li and by the Betty Wheless Trotter Professorship.

Footnotes

  • Communicating editor: N. Takahata

  • Received July 6, 1998.
  • Accepted December 7, 1998.

LITERATURE CITED
