In geographic areas where pied and collared flycatchers (Ficedula hypoleuca and F. albicollis) breed in sympatry, hybridization occurs, leading to gene flow (introgression) between the two recently diverged species. Notably, while such introgression is observable at autosomal loci it is apparently absent at the Z chromosome, suggesting an important role for genes on the Z chromosome in creating reproductive isolation during speciation. To further understand the role of Z-linked loci in the formation of new species, we studied genetic variation of the two species from regions where they live in allopatry. We analyzed patterns of polymorphism and divergence in introns from 9 Z-linked and 23 autosomal genes in pied and collared flycatcher males. Average variation on the Z chromosome is greatly reduced compared to neutral expectations based on autosomal diversity in both species. We also observe significant heterogeneity between patterns of polymorphism and divergence at Z-linked loci and a relative absence of polymorphisms that are shared by the two species on the Z chromosome compared to the autosomes. We suggest that these observations may indicate the action of recurrent selective sweeps on the Z chromosome during the evolution of the two species, which may be caused by sexual selection acting on Z-linked genes. Alternatively, reduced variation on the Z chromosome could result from substantially higher levels of introgression at autosomal than at Z-linked loci or from a complex demographic history, such as a population bottleneck.
As incipient species diverge from each other under neutrality, a gradual loss of shared polymorphisms and accumulation of fixed differences are expected due to random genetic drift within each species (Wakeley and Hey 1997). Furthermore, levels of polymorphism and divergence across loci are expected to be correlated (Hudson et al. 1987). However, different parts of the genome may diverge at different rates. For example, if some loci are involved in adaptive evolution, or are linked to regions under selection, then these loci will show different patterns of variation from those that are evolving neutrally (Nachman 1997; Wang et al. 1997; Nachman et al. 1998; Fay and Wu 2000; Andolfatto 2001; Hey and Kliman 2002; Broughton and Harrison 2003; Schlötterer 2003). A useful approach to separate effects of selection from effects of demographic processes is to analyze patterns of polymorphism and divergence at several unlinked loci. Typically, selection will affect patterns of variation locally, near the loci under selection, whereas demographic processes affect the evolution of the entire genome.
Such multilocus DNA sequence data analysis is now a well-established tool to study selective forces acting in closely related species with a complex demographic history (Hey and Machado 2003). This approach has been effectively implemented to analyze regions associated with reproductive isolation in closely related Drosophila species (Begun 1996; Wang and Hey 1996; Kliman et al. 2000; Noor et al. 2001; Betancourt et al. 2002; Machado et al. 2002; Yi et al. 2003). These studies have demonstrated the existence of barriers to gene flow in regions of the Drosophila genome associated with regions important in speciation.
The pied flycatcher (Ficedula hypoleuca) and the collared flycatcher (F. albicollis) are two closely related, sexually dimorphic old-world flycatcher species. The role of selection in speciation has been investigated through analyzing interactions between the two species in sympatry. In regions where the pied (F. hypoleuca) and collared (F. albicollis) flycatcher have overlapping breeding areas (sympatric areas), a moderate level of gene flow has been observed between the two species at autosomal loci, whereas introgression of the Z chromosome is apparently absent (Saetre et al. 2003). A likely explanation for this pattern is that the Z chromosome is enriched with genes involved in species-specific adaptations. Accordingly, individuals with Z-linked genes introgressed from the other species would be removed by selection. In addition to the large role of the Z chromosome in reproductive isolation, it is also likely that Z-linked genes have diverged faster than the autosomes in these species because of sexual selection on Z-linked genes (Saetre et al. 2003). In the sympatric areas in Central Europe and on two islands in the Baltic Sea, there is evidence for a sexually selected character displacement on male plumage characteristics that increases assortative mating (Saetre et al. 1997). These male plumage characters are shown to be affected by Z-linked genes (Saetre et al. 2003) and have also been shown to be targets of sexual selection in allopatric populations (see, e.g., Saetre et al. 1997 and references therein). After two species have begun to diverge, traits involved in mate selection can contribute to reproductive isolation. It therefore seems likely that the emergence of reproductive isolation between the two flycatcher species has been accelerated by sexual selection.
To investigate this hypothesis we studied the evolutionary divergence of Z-linked and autosomal loci in pied and collared flycatchers that have been living in allopatry. We used pied flycatchers from Spain and collared flycatchers from Italy, which according to mtDNA divergence and a standard clock separated ∼2 million years ago (Saetre et al. 2001). These populations are geographically isolated from each other and are from the areas suggested to have housed the original populations isolated during the Pleistocene glaciations from where the pied and collared flycatchers recolonized northern Europe after the last glaciation period ∼10,000 years ago (Saetre et al. 2001). The birds from these allopatric populations are morphologically distinct from their conspecific sympatric populations, both in song and in male plumage characteristics (Saetre et al. 2003; Haavie et al. 2004). The geographical isolation of the two species means that these traits, which are involved in prezygotic isolation between the two species in sympatry, are likely to have evolved independently in the two populations, and that the patterns of genetic variation in each species have not been confounded by recent introgression. Relative rates of divergence at mtDNA and autosomal loci between these and other flycatcher species are in agreement with the geographic isolation hypothesis (Saetre et al. 2001, 2003). Current introgression is apparently restricted to narrow zones of secondary contact (steep clines) in central and eastern Europe and in the Baltic Isles (Saetre et al. 2003).
Here we present analyses of polymorphism and divergence from multiple introns from both Z-linked and autosomal loci in the pied and collared genomes. In total 20 introns located within 9 Z-linked genes (∼10 kb in total) and 25 introns from 23 autosomal genes (∼11 kb in total) in pied (n = 9) and collared (n = 9) flycatcher males were sequenced. In addition, one male red-breasted flycatcher (F. parva) is included as an outgroup species (see Saetre et al. 2001 for a justification). The evolutionary divergence of Z-linked genes is likely to have had significant effects on the development of barriers to gene flow between the pied and collared flycatchers (Saetre et al. 2003) and the results may shed light on the selective forces that result in the development of reproductive isolation.
MATERIALS AND METHODS
Male collared flycatchers (n = 9) and pied flycatchers (n = 9) were trapped at breeding grounds in Abruzzo National Park, Italy and near Madrid, Spain, respectively. Additionally, one male red-breasted flycatcher caught in Northern Moravia, Czech Republic, is included in the analysis and referred to as the outgroup species.
Twenty-five microliters of blood from each male were collected by brachial vein puncture and suspended in 1 ml Queen's lysis buffer (Seutin et al. 1991). DNA was extracted from the blood samples by overnight incubation of 400 μl blood/buffer mix at 50° with 500 μl of 0.1 M NaCl, 10 mM Tris-Cl (pH 8.0), 5% SDS, and 50 μl of proteinase K solution (10 mg/ml). Each sample was extracted with two rounds of phenol/chloroform treatment before DNA was recovered by ethanol precipitation, dried, and redissolved in TE buffer (10 mM Tris, 1 mM EDTA, pH 7.5).
Introns from Z- and A-linked genes:
On the basis of chicken (Gallus gallus) sequences in GenBank, primers were designed as described in Primmer et al. (2002), to amplify introns from different genes widespread on the chicken Z chromosome and from autosomal genes. A large number of primer pairs were tested but many were later discarded due to amplification failures or because insertion-deletion polymorphisms (INDELs) or repetitive sequence made sequence alignment impossible. Finally, a total of 20 introns from 9 Z-linked genes and 25 introns from 23 autosomal genes that could be reliably aligned in both directions (see below) were chosen for further analysis. Even though there is extensive conservation of synteny between chicken and mammalian genomes for both chicken macrochromosomes and microchromosomes (Burt et al. 1999; Groenen et al. 2000) and between chicken and emu (Shetty et al. 1999), the intrachromosomal structure on the Z chromosome does not seem to be stable (Nanda and Schmid 2002). Hence, the exact intrachromosomal positions of the flycatcher genes are not necessarily the same as those in the model organism (supplemental Tables S1 and S2 at http://www.genetics.org/supplemental/). However, the chromosomal location of each gene (autosomal or Z-linked) in the flycatchers was confirmed by typing female F1-hybrids, where fixed differences between species would be identifiable from heterozygous sites at autosomal loci but not at Z-linked loci. As some of the Z-linked genes have a W-linked gametologue we have used only males (the homogametic sex; ZZ) in the comparative sequence analyses. Details of all loci, primers, and PCR conditions are available as supplemental information (Z-linked, S1; and autosomal, S2 at http://www.genetics.org/supplemental/).
The introns were amplified in all 19 individuals in 10-μl reactions containing 2.5 mM MgCl2, 0.2 mM of each dNTP, 0.32 μM of each primer, 1 μg of bovine serum albumin (BSA), 0.3 units of HotStar DNA polymerase (QIAGEN, Valencia, CA), 1× PCR buffer (QIAGEN), and 20 ng DNA. On a PTC 225 (MJ Research, Watertown, MA) 35–40 cycles of amplification with 94° for 30 sec, 52–67° for 30 sec, and 72° for 1 min were preceded by 15 min predenaturation at 95° and followed by a prolonged 10-min extension step at 72°.
PCR products were purified with ExoSap-IT (United States Biochemical, Cleveland) and cycle sequenced with either BigDye terminator chemistry or the DYEnamic ET Terminator cycle sequencing kit (Amersham Biosciences, Arlington Heights, IL), depending on whether they were analyzed on an ABI 377 automated sequencer (Perkin-Elmer, Norwalk, CT) or on a MegaBACE 1000 (Amersham Biosciences), respectively. The sequences were aligned and edited in the programs AutoAssembler 2.1 (Applied Biosystems) and Sequence Navigator 1.0 (Applied Biosystems) or with Sequencher 4.1 (Gene Codes, Ann Arbor, MI) and modified manually. Each base in the study was called, using at least single-fold coverage sequencing reads for each strand. The only exceptions were for a few short regions where sequence repeats or INDELs made the reads in one direction of poor quality, and bases were called using information primarily from multiple sequences of one strand.
Diploid sequences from each individual in all three species at each locus were first separated into two “pseudo-haplotypes” with a Perl script, on the basis of the ambiguity codes produced by Sequencher. This was done by randomly assigning the two alleles at each polymorphic site to one of two sequences to produce two haplotypes of unknown phase. The sequences derived from all three species at each locus were then aligned using ClustalW (Thompson et al. 1994) and improved by manual adjustment. A number of additional Perl programs were used during the analysis to extract polymorphism information from the sequences and to format them for use in data analysis programs.
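The random assignment of alleles described above can be sketched as follows. This is a minimal Python illustration, not the Perl script used in the study; the function name `split_pseudo_haplotypes` and the restriction to two-base IUPAC ambiguity codes are assumptions made for the example.

```python
import random

# Two-base IUPAC ambiguity codes, as emitted for heterozygous sites.
IUPAC_HET = {
    "R": ("A", "G"), "Y": ("C", "T"), "S": ("C", "G"),
    "W": ("A", "T"), "K": ("G", "T"), "M": ("A", "C"),
}

def split_pseudo_haplotypes(diploid_seq, rng=None):
    """Split one diploid consensus sequence into two pseudo-haplotypes.

    At every heterozygous site the two alleles are assigned at random to
    the two output sequences, so the phase between sites is arbitrary.
    Hypothetical helper; only two-base ambiguity codes are resolved.
    """
    rng = rng or random.Random()
    hap_a, hap_b = [], []
    for base in diploid_seq.upper():
        if base in IUPAC_HET:
            alleles = list(IUPAC_HET[base])
            rng.shuffle(alleles)          # random phase at this site
            hap_a.append(alleles[0])
            hap_b.append(alleles[1])
        else:
            hap_a.append(base)            # homozygous site: copy to both
            hap_b.append(base)
    return "".join(hap_a), "".join(hap_b)
```

Because phase is assigned at random, sequences produced in this way are suitable for site-based summaries such as the number of segregating sites, but not for analyses that depend on linkage between sites.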
Most polymorphism and divergence analyses, including tests of neutrality based on the allele frequency spectrum, were performed using the DnaSP program (Rozas and Rozas 1999). The H-test (Fay and Wu 2000) was performed using the htest program (http://crimp.lbl.gov/htest.html). Haplotype frequency estimation was performed using the expectation-maximization (EM) algorithm implemented in Arlequin (Schneider et al. 2000). DnaSP was used to calculate RM, the minimum number of recombination events in each sample. The program MEGA (Kumar et al. 2001) was used to construct neighbor-joining trees using inferred haplotypes. Multilocus Hudson-Kreitman-Aguadé (HKA) tests (Hudson et al. 1987) were performed using the HKA program (http://lifesci.rutgers.edu/~heylab/). Fitting of the isolation model of speciation to the data was performed using the method described in Wakeley and Hey (1997) and Wang et al. (1997), using the program WH (http://lifesci.rutgers.edu/~heylab/).
RESULTS
The data set:
Details of all loci in the study are shown in Table 1. These include 20 loci from 9 Z-linked genes (9657 bp in total) and 25 loci from 23 autosomal genes (11,539 bp in total). The majority of sequence is noncoding although some exon sequence is included in the analysis. Two collared flycatcher-exclusive nonsynonymous polymorphisms were found: one in the Z-linked locus GHR-exon 9 and one in the autosomal locus MPP-4 (see supplemental Tables S3 and S4 at http://www.genetics.org/supplemental/ for more details).
Description of polymorphism and divergence:
Summaries of polymorphism on Z-linked and autosomal loci are shown in Figure 1 and detailed in Tables 2 and 3, respectively (loci from the same gene are combined). A number of estimates of the population mutation rate, θ, are presented (θ = 4Neμ for autosomal loci and 3Neμ for Z-linked loci, where Ne is the effective population size and μ is the neutral mutation rate). These are based on the number of segregating sites, average number of pairwise differences, and allele frequency distribution weighted by derived alleles (Watterson 1975; Nei 1987; Fay and Wu 2000). Levels of variation in the pied flycatcher are slightly lower than those in the collared flycatcher at both Z-linked and autosomal loci (the variation in the pied is 93% of collared flycatcher levels at Z-linked loci and 76% of collared flycatcher levels at autosomal loci, on the basis of θW calculated from silent sites: 0.0015 and 0.0014 on the Z chromosome and 0.0040 and 0.0030 on autosomes in the collared flycatcher and the pied flycatcher, respectively). As divergence from the outgroup is about the same in both species, indicating that mutation rate is unlikely to vary between lineages, the lower variation probably reflects a lower effective population size in the pied flycatcher.
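For illustration, the two estimators of θ that rest on segregating sites and on pairwise differences can be computed as in the sketch below. It is a minimal Python example under simplifying assumptions (an ungapped alignment of equal-length haplotypes, values reported per locus rather than per site) and is not the DnaSP implementation used for the analyses; all function names are hypothetical.

```python
from itertools import combinations

def segregating_sites(sequences):
    """Number of alignment columns with more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*sequences))

def watterson_theta(num_segregating, num_sequences):
    """Watterson's (1975) estimator: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    a_n = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating / a_n

def pairwise_diversity(sequences):
    """Average number of pairwise differences between sequences (Nei 1987)."""
    pairs = list(combinations(sequences, 2))
    total = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return total / len(pairs)

# Toy alignment of four haplotypes (not data from this study).
toy = ["AAT", "AAC", "ATC", "ATC"]
s = segregating_sites(toy)
theta_w = watterson_theta(s, len(toy))
pi = pairwise_diversity(toy)
```

Dividing either estimate by the number of sites analyzed gives the per-site values of the kind quoted in the text. With nine males per species, the sample size n for a fully sequenced locus would be 18 chromosomes.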
Genetic divergence from the outgroup on the Z chromosome is slightly higher than that on the autosomes (0.0176 and 0.0178 on the Z chromosome and 0.0157 and 0.0160 on the autosomes for the collared and pied flycatchers, respectively). A likely explanation for this is a higher mutation rate in males, because the male germline goes through many more rounds of cell divisions per generation than does the female germline (Haldane 1935). This is supported by a number of studies of evolutionary rates on the avian Z chromosome indicating a significant male bias in mutation rate (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000) although the likely effect of different levels of ancestral polymorphism at Z-linked and autosomal loci makes an accurate estimate of the male bias difficult (Makova and Li 2002).
In contrast to patterns of divergence, levels of polymorphism on the Z chromosome are markedly lower than those on autosomes (Tables 2 and 3). When only silent sites are considered, θW is 0.0015 and 0.0014 on the Z chromosome and 0.0040 and 0.0030 on autosomes in the collared flycatcher and the pied flycatcher, respectively. Hence, levels of Z-linked variation are 38% of autosomal levels in the collared and 47% of those in the pied flycatcher. Under a standard neutral model assuming constant population size, random mating, and no migration (Watterson 1975) the higher mutation rate on Z would be expected to increase variation relative to the autosomes whereas the mode of inheritance of Z chromosomes would be expected to reduce it by a factor of 3/4. We calculated the expected variation at each Z-linked locus on the basis of the effective population size estimated from average levels of polymorphism and divergence from the outgroup at autosomal loci and the mutation rate estimated from levels of divergence from the outgroup at each individual Z-linked locus, using the formula

$$S_{Z(\mathrm{exp})} = \frac{3}{4}\, D_Z\, \frac{S_A}{D_A}, \tag{1}$$

where SZ(exp) is the expected number of segregating sites at a specific Z-linked locus, DZ is the divergence between the species (collared or pied flycatcher) and the outgroup (red-breasted flycatcher) at this locus, SA is the total number of segregating sites at autosomal loci, and DA is the total divergence between the species and the outgroup at autosomal loci. The corresponding formula for estimating the expected number of segregating sites at an autosomal locus from average polymorphism and divergence on Z is

$$S_{A(\mathrm{exp})} = \frac{4}{3}\, D_A\, \frac{S_Z}{D_Z}, \tag{2}$$

where DA now refers to the individual autosomal locus and SZ and DZ are totals over Z-linked loci. We used these equations to calculate the expected number of segregating sites at each locus in both species in both Z-linked and autosomal loci. As shown in Table 2, levels of polymorphism on the Z chromosome in both flycatcher species are about half of those predicted from levels of polymorphism and divergence on autosomal loci.
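The expectation described above can be written as a short calculation. The sketch below assumes that the expected number of segregating sites is the polymorphism-to-divergence ratio of the other chromosome class, scaled by the divergence at the focal locus and by the inheritance factor (3/4 for a Z-linked locus, 4/3 in the reverse direction); the function names and the input values are hypothetical and are not taken from Tables 2 and 3.

```python
def expected_s_z(d_z_locus, s_a_total, d_a_total):
    """Expected segregating sites at one Z-linked locus.

    d_z_locus: divergence from the outgroup at the Z-linked locus
    s_a_total: total segregating sites over all autosomal loci
    d_a_total: total divergence from the outgroup over all autosomal loci
    """
    return 0.75 * d_z_locus * s_a_total / d_a_total

def expected_s_a(d_a_locus, s_z_total, d_z_total):
    """Expected segregating sites at one autosomal locus, from Z-linked totals."""
    return (4.0 / 3.0) * d_a_locus * s_z_total / d_z_total

# Illustrative numbers only.
example_z = expected_s_z(d_z_locus=20, s_a_total=100, d_a_total=200)
example_a = expected_s_a(d_a_locus=30, s_z_total=40, d_z_total=160)
```

Using locus-specific divergence in this way allows the mutation rate to differ between loci, so that a locus is judged against its own rate of evolution rather than against a chromosome-wide average.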
In 14 of 18 possible cases, variation on Z is lower than expected on the basis of autosomal data (sign test, P = 0.0154). Correspondingly, variation at autosomal loci (Table 3) is higher than expected on the basis of Z-linked data in 40 out of 46 comparisons (P < 10⁻⁶). Hence, variation at Z-linked loci is significantly reduced compared to that at the autosomes under a standard neutral model (see also Figure 1).
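The sign test reported above can be reproduced with an exact binomial calculation. The sketch below assumes a one-tailed test against a null probability of 0.5, which is the assumption under which the quoted value of P = 0.0154 is recovered; the function name is hypothetical.

```python
from math import comb

def sign_test_one_tailed(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    upper_tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return upper_tail / 2 ** trials

p_z = sign_test_one_tailed(14, 18)   # Z-linked loci below expectation
p_a = sign_test_one_tailed(40, 46)   # autosomal loci above expectation
```

Note that the test treats the comparisons as independent, whereas the two species share ancestry and the expectations for all loci are derived from the same chromosome-wide totals.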
Allele frequency spectra:
We performed four tests of neutrality based on allele frequency distributions: Tajima's (1989) D-, Fu and Li's (1993) D- and F-, and Fay and Wu's (2000) H-tests (Tables 2 and 3). For Fu and Li's statistics and Fay and Wu's test, the outgroup was used to determine the ancestral state of diallelic polymorphisms. In the collared flycatcher, both Z-linked and autosomal loci exhibit slightly negative values of Tajima's D and Fu and Li's statistics, indicating that the allele frequency spectrum closely matches the neutral expectations with a slight skew toward rare alleles. In the pied flycatcher, autosomal loci exhibit the same pattern. However, at Z-linked loci, these statistics are all positive and Fu and Li's D and F for all loci combined both display significant deviations from the neutral model, indicating a deficit of rare derived alleles. Fu and Li's statistics are strongly dependent on the number of singleton polymorphisms at a locus and this significant result reflects the lack of singletons at Z-linked loci in the pied flycatcher.
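As an illustration of how the first of these statistics is obtained, the sketch below computes Tajima's (1989) D from the average number of pairwise differences, the number of segregating sites, and the sample size. It is a minimal Python example, not the DnaSP implementation used here, and the function name is hypothetical. Both diversity measures must be expressed per locus, not per site.

```python
from math import sqrt

def tajimas_d(pi, num_segregating, n):
    """Tajima's D for n sequences with S segregating sites and diversity pi."""
    if num_segregating == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = num_segregating
    # Numerator: difference between the two estimators of theta.
    # Denominator: estimated standard deviation of that difference.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

D is negative when rare alleles are in excess, because rare alleles add to the number of segregating sites but contribute little to pairwise differences, and positive when intermediate-frequency alleles are in excess.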
Significant H-test results were obtained from a small number of Z-linked and autosomal loci, indicating an excess of high-frequency derived alleles. However, it is unclear whether these results have any biological relevance, considering the large number of tests performed, and they are likely to represent random evolutionary variance under neutrality. The same is likely to be true for individual loci with significant Tajima's and Fu and Li's tests.
We used multilocus HKA tests to determine whether levels of polymorphism and divergence are correlated between loci and species, as predicted under neutrality. All tests used only the polymorphism and divergence data from the collared and the pied flycatcher, and introns from the same gene were combined. To determine significance, the test statistic was compared to a distribution generated from 10,000 coalescent simulations. When only Z-linked loci were compared, significant heterogeneity was observed between loci (χ² = 24.56, P = 0.023). Significant deviations were not observed between autosomal loci (χ² = 16.44, P = 0.975). When combined autosomal and combined Z-linked loci from the same species were used, the test statistic was also not significant (χ² = 1.27, P = 0.441).
The expected numbers of segregating sites and the expected divergence between the species for the Z-linked loci from the significant HKA test are shown in Table 4. The VLDLR locus shows the greatest deviation from the neutral model: in the collared flycatcher the observed number of segregating sites is twice that expected whereas in the pied flycatcher it is less than half. When this locus is removed, the test is no longer significant (χ² = 16.23, P = 0.143). In addition, a number of loci show smaller deviations from neutrality with respect to the observed number of segregating sites. In the collared flycatcher, ALDOB and CHDZ have lower than expected values, whereas in the pied flycatcher GHR has a low value and the values for PTCH and SPINZ are higher than expected.
Shared and fixed polymorphisms:
Variable sites within the two flycatcher species can be divided into four categories, those that are polymorphic only in the collared flycatcher, those that are polymorphic only in the pied flycatcher, those that are polymorphic in both (shared polymorphisms), and fixed differences. The numbers of sites in these different classes are shown in Table 5. This reveals a striking pattern: Z-linked loci have a large number of fixed differences but very few shared polymorphisms (Figure 1, Table 5), whereas the reverse is true for autosomal loci. Some shared polymorphisms are expected because of mutations on the same site in both species. However, closely related species are expected to show higher levels of shared polymorphisms because they have persisted since the time of divergence. The proportion of shared polymorphisms generated by multiple parallel mutations is expected to be very low (Clark 1997).
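The four categories of variable sites can be counted from two aligned samples as in the following sketch. It is a minimal Python illustration that assumes ungapped alignments of equal length and at most two alleles per site; the function name and the toy sequences are hypothetical.

```python
def classify_sites(sample_1, sample_2):
    """Count exclusive, shared, and fixed sites between two samples.

    Assumes biallelic sites: a site polymorphic in both samples is
    counted as a shared polymorphism.
    """
    counts = {"exclusive_1": 0, "exclusive_2": 0, "shared": 0, "fixed": 0}
    for col_1, col_2 in zip(zip(*sample_1), zip(*sample_2)):
        alleles_1, alleles_2 = set(col_1), set(col_2)
        poly_1, poly_2 = len(alleles_1) > 1, len(alleles_2) > 1
        if poly_1 and poly_2:
            counts["shared"] += 1
        elif poly_1:
            counts["exclusive_1"] += 1
        elif poly_2:
            counts["exclusive_2"] += 1
        elif alleles_1 != alleles_2:
            counts["fixed"] += 1      # monomorphic in both, different alleles
    return counts

# Toy data: one site of each class plus one invariant site.
collared = ["AAGTC", "ATGTG", "ATGTC"]
pied = ["CAGTC", "CTGAC", "CTGAC"]
site_counts = classify_sites(collared, pied)
```

These four counts are the summary of the data to which the isolation model of Wakeley and Hey (1997) is fitted.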
Under the isolation model of speciation (Wakeley and Hey 1997), as two populations diverge, shared polymorphisms are gradually lost and become fixed differences due to random drift. The isolation model assumes that an ancestral population has split into two separate populations at a certain time in the past, which then evolve independently according to the assumptions of a standard neutral model. The model invokes three separate values of the population mutation parameter, θ: one for the ancestral population, θA, and one for each of the descendant populations, θ1 and θ2, and a time, T, since the speciation event, measured in units of 2N1 generations (Table 6; Wakeley and Hey 1997; Wang et al. 1997).
We performed isolation model fitting on the data, considering both all Z-linked loci and all autosomal loci as single loci. The loci were grouped in this way to test whether autosomal and Z-linked loci had significantly different patterns of polymorphism and divergence. Deviations from the isolation model were tested by comparing the true value of the WH statistic (Wakeley and Hey 1997) with values obtained from 10,000 coalescent simulations. This indicated that levels of divergence and polymorphism in the two species do not lead to a rejection of the isolation model of speciation (WH = 64.0, P = 0.43). Considering the collared flycatcher as species 1 and the pied flycatcher as species 2, the parameter estimates were θA = 152.1 (95% C.I.'s: 7.5–321.3), θ1 = 15.9 (0–549.5), θ2 = 12.2 (0–211.8), and T = 0.4 (0–1.3). Hence the data do not cause us to reject a model where θ can vary over time but that does not include selection or gene flow.
As sequencing was performed using a diploid template from male birds, the phase of heterozygous sites was unknown. We used the EM algorithm (Excoffier and Slatkin 1995) to estimate the frequencies of haplotypes found at each locus. Tables showing all polymorphic sites and estimated haplotypes at each locus are available as supplemental information (Z-linked, supplemental Table S3; and autosomal, supplemental Table S4; http://www.genetics.org/supplemental/). The majority of loci are shorter than 1 kb and do not contain enough segregating sites to reliably estimate the rate of recombination using the γ-statistic (Hey and Wakeley 1997). Using the four-gamete test (Hudson and Kaplan 1985), patterns of polymorphism at 5 Z-linked and 15 autosomal loci showed evidence for recombination.
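The four-gamete test can be illustrated with the sketch below, which reports every pair of biallelic sites at which all four two-site haplotypes are present. This is a minimal Python example and not the RM calculation of DnaSP, which additionally derives the minimum number of recombination events from such pairs; the function name is hypothetical. The input must consist of phased haplotypes, which in this study were inferred statistically.

```python
from itertools import combinations

def four_gamete_pairs(haplotypes):
    """Return pairs of site indices at which all four gametes are observed."""
    columns = list(zip(*haplotypes))
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    incompatible = []
    for i, j in combinations(biallelic, 2):
        gametes = set(zip(columns[i], columns[j]))
        if len(gametes) == 4:         # requires recombination or recurrent mutation
            incompatible.append((i, j))
    return incompatible
```

Under an infinite-sites model, in which each site mutates at most once, the presence of all four gametes at a pair of sites implies at least one recombination event between them.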
Haplotype estimation allowed us to construct neighbor-joining (NJ) trees for all loci. Note, however, that as there is evidence for recombination at many loci, such diagrams cannot be regarded as accurate genealogies and are meant to serve as an illustration of the variability in patterns of variation between loci. Figure 2 shows the trees for Z-linked loci, which exhibit significantly heterogeneous patterns of evolution as demonstrated by the HKA test. In particular, the VLDLR, GHR, and SPINZ loci stand out as having different genealogies in the two species. NJ trees for all autosomal loci are provided as supplemental information (S5 at http://www.genetics.org/supplemental/).
DISCUSSION
We analyzed patterns of evolution at Z-linked and autosomal loci in two flycatcher species. By using multiple unlinked loci from various genomic locations we attempt to distinguish between certain demographic factors that affect the evolution of the entire genome and other factors that may act on a subset of loci, such as gene flow (introgression) and natural selection. Our main findings are (a) significantly reduced levels of genetic variation at Z-linked loci and (b) significant heterogeneity in patterns of evolution between Z-linked loci compared to the expectations of a standard neutral model. Below we evaluate the factors that may have influenced patterns of variation in the two species.
Reduced variation at Z-linked loci:
Random genetic drift is expected to reduce variation and fix shared polymorphisms between two diverging species faster in a small population than in a large population (Wakeley and Hey 1997). Hence, as Ne at Z-linked loci is three-fourths that of autosomal loci due to the mode of inheritance of the Z chromosome, both lower levels of variation and a faster rate of fixation of shared polymorphisms are expected to some extent. In this study we have shown that a significant number of Z-linked loci have lower levels of variation than predicted by this effect under the assumptions of a standard neutral model.
A smaller male population size will reduce the value of Ne at Z-linked loci even further. A female-biased operational sex ratio has been observed among breeding flycatchers (i.e., some males mate with more than one female). Some males are polygynous, having one primary and one secondary female in different territories, whereas others may not mate at all. In a study of the collared flycatchers ∼4% of the females were classified as secondary (Qvarnström et al. 2003), while in the pied flycatcher, on average 10–15% of the females have been classified as secondary (Lundberg and Alatalo 1992). Another way to study the operational sex ratio is by measuring extra pair paternity (EPP). The mean rate of EPP was found to be 14.5% in collared flycatchers and slightly less in pied flycatchers (Veen et al. 2001). Both polygyny and EPP have been suggested to represent a potentially important source of sexual selection on male secondary sexual characters in the collared flycatchers (Gustafsson et al. 1995; Sheldon and Ellegren 1999).
If we assume that there is no operational sex ratio bias and the mutation rate at Z-linked and autosomal loci is the same, then under a neutral model assuming random mating, constant population size, and no migration, the ratio of neutral variation on Z:A is predicted to be 0.75. In the most extreme scenario, where only one male fertilizes all the females in the population, this ratio approaches 0.5. Moreover, it is likely that the mutation rate on the Z chromosome is higher than that at autosomal loci due to a male-biased mutation rate, which would cause these proportions predicted under neutrality to be higher. In this study we find the Z:A to be 0.37 in the collared flycatcher and 0.46 in the pied flycatcher (θW based on silent sites only). This is lower than what even the most extreme female-biased operational sex ratio could produce (Z:A = 0.5). It is therefore likely that either selection or biased gene flow has contributed significantly to the difference in levels of variation between the Z chromosome and autosomes.
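The comparison made above can be checked directly from the silent-site θW values quoted in the text. In the sketch below the two reference ratios (0.75 and 0.5) are the values stated in this section, and the variable and function names are hypothetical. Because the θW inputs are rounded to four decimal places, the computed ratios may differ in the last digit from those reported.

```python
# Silent-site theta_W values quoted in the text (rounded).
THETA_W = {
    "collared": {"Z": 0.0015, "A": 0.0040},
    "pied": {"Z": 0.0014, "A": 0.0030},
}
NEUTRAL_RATIO = 0.75   # equal numbers of breeding males and females
EXTREME_RATIO = 0.5    # limit stated in the text for extreme polygyny

def z_to_a_ratios(theta_w):
    """Ratio of Z-linked to autosomal theta_W for each species."""
    return {species: v["Z"] / v["A"] for species, v in theta_w.items()}

ratios = z_to_a_ratios(THETA_W)
for species, ratio in ratios.items():
    print(f"{species}: Z:A = {ratio:.3f}, "
          f"below extreme bound: {ratio < EXTREME_RATIO}")
```

A male-biased mutation rate would raise both reference ratios, so treating them as fixed makes the comparison conservative.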
A study of levels of variation at Z-linked and autosomal loci in chicken (Sundström et al. 2004) found the ratio of neutral variation on Z:A to be 0.24 and concluded that this was strong evidence for selective sweeps operating on the Z chromosome. This ratio is lower than the values for both pied and collared flycatchers presented here. It is likely that a large part of this difference can be explained by the effects of poultry breeding, which entails both a smaller male effective population size and an artificial selection regime. Here we have demonstrated that levels of variation at Z-linked loci are also significantly reduced in populations of two wild bird species.
Isolation model fitting:
Under the isolation model of speciation, as two populations diverge, shared polymorphisms are gradually lost and become fixed differences due to random drift. This process occurs faster when effective population size is low. Hence, under this model, Z-linked loci are expected to exhibit fewer shared polymorphisms, as their effective population size is three-fourths of that of autosomal loci due to the mode of inheritance of the Z chromosome. However, a number of factors may cause populations to violate the assumptions of the isolation model. Such factors include unequal variance in mating success between males and females, gene flow between species, and selective sweeps.
The pied and collared flycatcher populations share polymorphisms at a number of loci analyzed in this study. However, shared polymorphisms are almost exclusively located in autosomal loci (51 shared polymorphisms compared with 6 fixed differences; Figure 1). In contrast, Z-linked loci have very few shared polymorphisms but a large number of fixed differences between species (2 shared polymorphisms compared with 21 fixed differences). Despite these differences, isolation model fitting indicates that these values are compatible with the isolation model of speciation. This suggests that patterns of polymorphism and divergence could result from a scenario without gene flow or selection but where historical changes in θ (possibly due to changes in Ne) generate a greater degree of stochastic variance in these patterns. This contrasts with our previous finding that under the stricter assumption of constant population size, variation at Z-linked loci is significantly reduced. We now consider the potential influence of demography and selection in generating the low levels of variation observed at Z-linked loci.
It is believed that the pied and collared flycatchers and two other flycatchers in the Ficedula flycatcher species complex—the Atlas flycatcher (F. speculigera), located in North Africa, and the semicollared flycatcher (F. semitorquata), which breeds in the areas around the Black Sea—arose from a single ancestral population (Saetre et al. 2001). This ancestral population probably had a wide preglacial breeding range in the Old World, which became fragmented, followed by the expansion of the Ficedula species complex from glacial refugia. The results from the isolation model of speciation are in qualitative agreement with this population history scenario, suggesting that the two flycatcher species analyzed here are descendants of an ancestral, much larger population. However, as selection and gene flow may also have affected patterns of variation at a subset of loci, interpretation of ancestral population parameters is problematic.
Values of Tajima's D and Fu and Li's D and F in the collared flycatcher population are slightly negative, indicating an excess of rare variants, which could potentially reflect a bottlenecked expansion of this population from a preglacial ancestral population. However, in the pied flycatcher, whereas autosomal loci show slightly negative values of these statistics (in concordance with the collared sample), Z-linked loci generally have positive Tajima's D-values, indicating an excess of common variants. It is possible that a change in the population size affected patterns of variation at autosomal and Z-linked loci differently because the effective population size of the latter is smaller than that of autosomal loci. For example, a recent reduction in population size is expected to cause a larger initial increase in Tajima's D at loci with smaller Ne, and such effects may explain the differences in patterns of variation observed between human mtDNA and nuclear loci (Fay and Wu 1999).
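To illustrate how the sign of Tajima's D tracks the frequency spectrum, the following sketch computes D from the sample size and the allele count at each segregating site, using the standard formulas of Tajima (1989). The data are hypothetical toy examples (six chromosomes, four biallelic sites), not values from this study, and the function name is an assumption made for illustration.

```python
from math import sqrt

def tajimas_d(n, allele_counts):
    """Tajima's D from sample size n and the count of one allele at each
    biallelic segregating site (each count k must satisfy 0 < k < n)."""
    s = len(allele_counts)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Mean number of pairwise differences, summed over sites
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in allele_counts)
    # Watterson's estimator based on the number of segregating sites
    theta_w = s / a1
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical samples: all singletons vs. all intermediate-frequency variants
d_rare = tajimas_d(6, [1, 1, 1, 1])
d_intermediate = tajimas_d(6, [3, 3, 3, 3])
print(d_rare, d_intermediate)
```

In this toy comparison the all-singleton sample yields a negative D and the sample with variants at intermediate frequency yields a positive D, which is the qualitative contrast described above for autosomal and Z-linked loci in the pied flycatcher.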
Historical changes in population size violate the assumptions of the standard neutral model. Our comparison of levels of variation at Z-linked and autosomal loci assumes constant population size. Furthermore, certain demographic scenarios such as bottlenecks may increase the variance expected in patterns of polymorphism and divergence and thus reduce the power of the HKA test to reject neutrality (Hammer et al. 2004). Our findings of significant differences both between levels of variation at Z-linked and autosomal loci and in patterns of evolution between different Z-linked loci may therefore be caused at least in part by the demographic history of the two flycatcher species.
A further potential explanation for low levels of genetic variation at Z-linked loci is the presence of a greater degree of gene flow at autosomal loci compared with genes on the Z chromosome. Biased gene flow could occur between two incipient species in an ancestral population or could be ongoing. When polymorphisms are shared between two species because they have persisted in both populations since the time of divergence, we expect levels of linkage disequilibrium (LD) between them to have decayed, since they are relatively old. However, when shared polymorphisms arise by recent gene flow between populations, we expect entire haplotypes to be transferred and thus greater LD. As the regions involved in this study are generally too short to allow reliable comparisons of LD in shared and nonshared polymorphisms, it was not possible to perform the LD test of gene flow as described by Machado and Hey (2003). However, some insight can be gained by examination of the NJ trees (Figure 1 for Z-linked loci; autosomal loci are available as supplemental information, S5 at http://www.genetics.org/supplemental/) and of patterns of variation in the inferred haplotypes (haplotype tables are available as supplemental information, supplemental Table 3 and supplemental Table 4 at http://www.genetics.org/supplemental/). Z-linked loci have only two shared polymorphisms in total. It is thus highly unlikely that recent gene flow has occurred at these loci. The majority of polymorphisms at autosomal loci are shared between populations and at 17 of 23 loci there is evidence for haplotype sharing. In general, however, it is the ancestral haplotypes (or their close derivatives) that are shared, and there is no evidence for sharing of haplotypes that are more recently derived, by either mutation or recombination. Hence, we have no evidence to suggest that recent gene flow has contributed to patterns of extant variation.
Data from present-day hybrid zones in northern Europe show that the rate of introgression is higher on autosomal markers than on Z-linked ones (Saetre et al. 2003). Previous analyses of genotype data suggest that it is very unlikely that the Italian and Spanish populations investigated here are affected by introgression that currently occurs in flycatcher hybrid zones (Saetre et al. 2001, 2003). However, we cannot rule out the possibility that historic episodes of introgression (e.g., during warm interglacials) have contributed to elevating levels of polymorphism at autosomal loci relative to that at Z-linked loci.
The finding of slightly lower variation in the pied than in the collared flycatcher could indicate a smaller effective population size in the Spanish pied flycatcher population. This has previously been suggested on the basis of a comparison of microsatellites and mtDNA in pied flycatcher populations in Europe (Haavie et al. 2000). The pied flycatcher has a patchy distribution on the Iberian Peninsula, living in isolated mountain forests separated by the Pyrenees from continental populations (Potti and Montalvo 1991). This might give a population structure that is vulnerable to factors such as environmental stochasticity, leading to reduced genetic variation.
Natural selection and the evolutionary history of the Z chromosome:
The action of selective sweeps on the Z chromosome is supported by the HKA test, which indicates significant heterogeneity in patterns of evolution between different Z-linked loci, suggesting that selection has affected Z-linked loci to varying extents. Hence it is possible that natural selection has reduced levels of variation on the Z chromosome of both the pied and collared flycatcher compared to autosomal levels and that the signature of selection varies between loci and between species. It is, however, important to note the potential effect of demography before rejecting neutrality (see above).
Both selective sweeps and background selection affect larger genomic regions when the recombination rate is low (Maynard Smith and Haigh 1974; Charlesworth et al. 1993), and selection has been cited as a reason for the positive correlation observed between variation and recombination in various species (see Begun and Aquadro 1992; Stephan and Langley 1998; reviewed by Nachman 2001). Data from comparisons of physical and genetic maps in the chicken genome suggest that the average rate of recombination on the Z chromosome is ∼2.5 times lower than the genomic average (Smith and Burt 1998; Groenen et al. 2000; Schmid et al. 2000; Smith et al. 2000; Sundström et al. 2004). As flycatchers possess a karyotype similar to that of the chicken, comprising a few macrochromosomes and many microchromosomes, it is likely that the recombination rate on the Z chromosome in flycatchers is also lower than the average autosomal rate. The effects of selection on linked variation could therefore potentially extend over larger genomic regions on the Z chromosome than on autosomes.
As female birds only possess one Z chromosome, recessive deleterious mutations are exposed to natural selection and thus can be efficiently removed from a population. Hence, assuming that recessive deleterious mutations arise at similar rates at Z-linked and autosomal loci, the Z chromosome should exhibit lower average levels of deleterious polymorphism compared with autosomal loci. Neutral variants at Z-linked loci are therefore less likely to be linked to a deleterious mutant compared with neutral alleles on autosomes. We therefore predict that background selection should have a smaller impact on neutral variation on the Z chromosome than at autosomal loci (Charlesworth et al. 1993). It is therefore unlikely that the reduced levels of variation on the Z chromosome are due to the action of background selection; they are more consistent with the effects of recurrent selective sweeps. In contrast, positive selection is predicted to be more effective on Z-linked loci compared to autosomal loci because recessive positive mutations are not masked by dominance in hemizygous females (Charlesworth et al. 1987; Servedio and Saetre 2003). Thus, assuming that recessive positive mutations arise at a similar rate at Z-linked and autosomal loci, the Z chromosome is predicted to experience selective sweeps more often than autosomal chromosomes.
The sex chromosomes are known to possess a relative excess of genes involved in reproduction, sexual conflict, and male secondary sexual traits (Sperling 1994; Prowell 1998; Reinhold 1998; Ritchie and Phillips 1998; Civetta and Singh 1999; Wang et al. 2001; Gibson et al. 2002; Lercher et al. 2003; Saetre et al. 2003). Rice (1984) suggested that traits that are beneficial to one sex, but detrimental to the other are predicted to accumulate on the sex chromosomes. In addition, both theoretical predictions and experimental observations have shown that genes for such traits exhibit high rates of adaptive evolution (Civetta and Singh 1998; Begun and Whitley 2000; Singh and Kulathinal 2000; Wyckoff et al. 2000; Swanson and Vacquier 2002a,b; Torgerson et al. 2002; Meiklejohn et al. 2003; Torgerson and Singh 2003). In birds, Z-linked male sexual traits are transmitted directly from father to son, which may facilitate sexual selection on these traits (Reeve and Pfennig 2003). Furthermore, if a male sexual trait and the female preference for that trait become linked on the Z chromosome, then selection can act to rapidly fix the combination in a population (Servedio and Saetre 2003).
Levels of silent variation at Z-linked loci in the pied and collared flycatcher, respectively, are 46 and 37% of levels observed at autosomal loci. This is lower than predicted under the most extreme female-biased operational sex ratio possible, where a relative level of 9/16 (∼56%) is expected. Furthermore, an HKA test indicates significant heterogeneity between patterns of polymorphism and divergence at Z-linked loci. These observations are not compatible with a standard neutral model. One potential explanation is the recurrent action of selective sweeps on the Z chromosome. However, it is also likely that the population sizes of these species have not remained constant, suggesting that a demographic scenario such as a population bottleneck could also play a major role in generating the observed patterns of variation.
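The neutral expectation for the ratio of Z-linked to autosomal variation can be sketched from the textbook effective-population-size formulas for a ZW system, in which males carry two Z chromosomes and females one. The code below is a minimal illustration under those standard formulas (an assumption introduced here, as are the function names); it is not the calculation used in the study. With equal numbers of breeding males and females the ratio is 3/4, and as breeding males become rare relative to females it approaches 9/16 (about 56%), its lower bound under this model.

```python
def ne_autosomal(n_males, n_females):
    """Effective size for autosomal loci with unequal breeding sex ratio."""
    return 4 * n_males * n_females / (n_males + n_females)

def ne_z(n_males, n_females):
    """Effective size for Z-linked loci (males ZZ, females ZW)."""
    return 9 * n_males * n_females / (2 * n_males + 4 * n_females)

def z_to_a_ratio(n_males, n_females):
    """Expected neutral ratio of Z-linked to autosomal diversity."""
    return ne_z(n_males, n_females) / ne_autosomal(n_males, n_females)

equal_ratio = z_to_a_ratio(1000, 1000)      # equal numbers of breeders
extreme_ratio = z_to_a_ratio(1, 10**9)      # very few breeding males

# Relative levels of silent variation reported in the text
observed = {"pied": 0.46, "collared": 0.37}
below_bound = {sp: r < 9 / 16 for sp, r in observed.items()}
print(equal_ratio, extreme_ratio, below_bound)
```

Both observed ratios fall below the 9/16 bound, which is why a skewed breeding sex ratio alone does not account for the reduced Z-linked variation under this simple model.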
The authors thank Centro Studi Ecologici Appenninici, J. Moreno, J. Haavie, and K. Räsänen for field assistance and Jody Hey and Göran Arnqvist for comments on the manuscript. Financial support was received from the Swedish Research Council, the Norwegian Research Council, and O. & L. Lamms Memorial Foundation.
- Received May 2, 2005.
- Accepted May 12, 2005.
- Copyright © 2005 by the Genetics Society of America