We applied multilocus sequence typing (MLST) to investigate the population structure and mode of reproduction of Cryptococcus neoformans var. grubii (serotype A). This MLST system utilizes 12 unlinked polymorphic loci, which are dispersed on nine different chromosomes, and allows the unambiguous identification of closely related strains of serotype A. We compared MLST analyses with the conventional genotyping method of detecting amplified fragment length polymorphisms (AFLPs), and there was excellent correlation between the MLST and AFLP results. However, MLST differentiated a larger number of strains. We analyzed a global collection of isolates of serotype A using both methods, and the results identified at least three genetically distinct subpopulations, designated groups VNI, VNII, and VNB. Groups VNI and VNII are widespread, dominated by isolates with the MATα mating type, and predominantly clonal. Conversely, isolates of group VNB are unique to Botswana, include a significant proportion of fertile strains with the MATa mating type, and manifest compelling evidence of recombination. We have AFLP genotyped >1000 strains of serotype A from different parts of the world, including isolates from several African countries, and, to date, haploid serotype A isolates of group VNB have been found only in Botswana.
Cryptococcus neoformans is a pathogenic yeast that causes debilitating disease of the central nervous system and other organs in humans. Cryptococcosis is especially prevalent in persons with impaired cell-mediated immunity, such as patients with AIDS, transplants, or hematologic malignancies (Casadevall and Perfect 1998). C. neoformans normally resides in the environment, where it is most commonly associated with avian guano and vegetative debris. Infection is acquired by inhalation; however, neither the natural reservoir nor the nature of the infectious particles has been resolved (Casadevall and Perfect 1998). On the basis of differences in capsular epitopes and clinical manifestations, two varieties and three serotypes of C. neoformans have been recognized (Kwon-Chung et al. 2002). The most common variety, C. neoformans var. grubii, includes isolates of serotype A, which are widespread and responsible for >80% of cryptococcal infections and for >99% of the infections in patients with AIDS. The other variety, C. neoformans var. neoformans, includes strains of serotype D, which also infect immunocompromised individuals; however, the prevalence of this variety is lower, and it is considered less pathogenic (Casadevall and Perfect 1998). Strains of serotype AD are hybrids between strains of serotypes A and D and are the least common, but they have been isolated from the environment and patients (Lengeler et al. 2001; Xu et al. 2002).
Although C. neoformans var. grubii has been studied extensively, its population structure is unresolved. In particular, the mode of reproduction and the extent of clonality among natural isolates have not been determined. C. neoformans is a haploid basidiomycete with a bipolar mating system and two alternative mating-type alleles, MATa and MATα. Although the genome of C. neoformans var. grubii contains the machinery for sexual reproduction and recombination, and strains are capable of undergoing both processes in the laboratory, the overwhelming majority of natural isolates possess only one mating-type allele, MATα, and exhibit significant evidence of clonality. Several genotyping techniques have been used to analyze different populations of C. neoformans, and the collective results indicate widespread clonality in the population (Currie et al. 1994; Brandt et al. 1996; Boekhout et al. 2001; Meyer et al. 2003; Litvintseva et al. 2005). Nevertheless, some studies also found evidence of recombination in natural populations of C. neoformans. For example, Xu et al. (2000) demonstrated significant incongruence among the genealogies of four unrelated genes. Burt et al. (2000) analyzed multilocus genotypes of 222 serotype A isolates from the United States and detected no significant linkage disequilibrium among the loci, which may indicate evidence of recombination in the population. Litvintseva et al. (2005) analyzed multilocus amplified fragment length polymorphism (AFLP) genotypes of >700 environmental and clinical isolates of C. neoformans serotype A from the United States and detected linkage equilibrium among the loci in the individual subgroups, which could be attributed to recombination in these subgroups. In 2003, we found circumstantial evidence of recombination in a clinical population of C. neoformans var. grubii from Botswana (Litvintseva et al. 2003). 
Fourteen isolates, composing 10% of this population, contained the rare MATa mating-type allele, possessed eight different AFLP genotypes, and were capable of mating and recombination in the laboratory. Moreover, analysis of all 139 isolates from this sample revealed the presence of two partially genetically isolated subgroups, which exhibited evidence of both clonal expansion and recombination. After decades of research and mystery, here was a population of C. neoformans var. grubii (i.e., serotype A) with the capability for sexual reproduction.
The discovery of these unusual clinical isolates in Botswana stimulated the present investigation to determine the global prevalence or rarity of strains with the potential for sexual reproduction and recombination. We analyzed 102 isolates of serotype A from different parts of the world, including 34 previously described Botswanan strains with unique AFLP genotypes. To assess the genetic relationships among these isolates, we used two independent genotyping methods: AFLP and multilocus sequence typing (MLST), for which we developed 12 unlinked MLST loci. Both AFLP and MLST genotypes were used to analyze the population structure of this global sample. There was good agreement between the data and results obtained by the AFLP and MLST methods. Both techniques demonstrated that the population in Botswana is unique and consists of isolates that were not found in any other part of the world. MLST analyses confirmed the previous evidence of subgroups among the Botswanan isolates and provided strong evidence for genetic recombination in the population as the genealogies of several genes revealed significant incongruence. Conversely, the remaining global strains of serotype A were overwhelmingly clonal. Identical strains were isolated from distant parts of the world, and the gene genealogies of all 12 analyzed genes were generally congruent. However, the global population was separated into two genetically distinct groups, and these two groups differed from the subgroups in Botswana.
MATERIALS AND METHODS
Isolates of C. neoformans:
A total of 1085 isolates of C. neoformans var. grubii (serotype A) were analyzed by AFLP genotyping. Among them, 824 isolates were obtained from the environmental and clinical populations in the United States, described previously (Litvintseva et al. 2005), and 139 were cultured from spinal fluid specimens of individual patients in Botswana, also described previously (Litvintseva et al. 2003). The remaining strains were isolated from clinical specimens and environmental samples from 13 other countries (Table 1). A subset of 102 strains from different countries and with different AFLP genotypes was selected for the MLST analyses (Table 2). We also evaluated multiple strains of the same AFLP genotype to assess the sensitivity and concordance of both genotyping techniques. In addition, VNI and VNII standard strains (Meyer et al. 2003) were included to provide a link to studies performed by others. To root the maximum-parsimony tree, we used the recently sequenced JEC21 strain of serotype D (Loftus et al. 2005). Isolates were maintained on yeast extract–peptone–dextrose (YPD) agar medium (Difco, Detroit) at 30°.
DNA manipulations and AFLP:
Genomic DNA was extracted from each isolate and the AFLP analysis was performed as described (Litvintseva et al. 2003, 2005). Only intense and reproducible bands were scored for the analyses of population structure. Polymorphic AFLP bands were defined as bands of the same size that were present in some but not all isolates. To assess the reproducibility of the AFLP method, DNA was extracted and the AFLP reactions and analyses were performed on at least two separate occasions for each isolate. In comparing replicate analyses, 92% of the AFLP bands were identical (data not shown).
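The definition of a polymorphic band given above can be illustrated with a minimal sketch. It assumes that band scores are stored as a presence/absence matrix with one row per isolate; the function name and the example scores are hypothetical and are not taken from the study data.

```python
def polymorphic_bands(band_matrix):
    """Return the column indices of bands present in some but not all isolates.

    band_matrix: list of rows, one per isolate; each row is a list of 0/1
    scores (1 = band present). All rows must have the same length.
    """
    n_isolates = len(band_matrix)
    n_bands = len(band_matrix[0])
    polymorphic = []
    for j in range(n_bands):
        present = sum(row[j] for row in band_matrix)
        # A band is polymorphic if at least one isolate has it
        # and at least one isolate lacks it.
        if 0 < present < n_isolates:
            polymorphic.append(j)
    return polymorphic


# Hypothetical scores for three isolates and four band positions.
scores = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
print(polymorphic_bands(scores))  # [1, 3]
```

Band 0 is present in every isolate and band 2 in none, so neither is informative; only bands 1 and 3 would be retained for genotyping.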
Twelve unlinked MLST loci on nine of the chromosomes were selected for the analysis (C. neoformans H99 sequencing project, Duke IGSP Center for Applied Genomics and Technology, http://cgt.duke.edu/; Table 3). The following criteria were applied to the selection of MLST loci: (i) the primer-binding sites were designed to be situated within protein-coding sequences to maximize the number of strains for which a particular locus can be PCR amplified; (ii) each MLST locus contained a number of variable noncoding DNA regions, such as introns or intergenic regions, to maximize the number of strains that can be distinguished by this genotyping approach; and (iii) the MLST loci were selected so that they were physically unlinked, i.e., dispersed on different chromosomes or separated by at least 100,000 nucleotides (Marra et al. 2004), to test for linkage equilibrium among the loci in the population. The PCR primers and amplification conditions are shown in Table 4. Each PCR mixture contained 32 μl of 1× PCR buffer, 2 mM MgCl2, 0.2 mM dNTPs, 1 μM each primer, 0.065 μl iTaq DNA Polymerase (Bio-Rad, Hercules, CA), and ∼1 ng genomic DNA. PCR products were purified using the QIAquick PCR purification kit (QIAGEN, Valencia, CA) and sequenced using an ABI 3700 sequencer with Big Dye terminators (Applied Biosystems, Foster City, CA). For most loci, PCR primers used for the amplification of the fragments were also used for sequencing. The only exceptions were MP88 and CAP59, for which the following primers were used to obtain the complementary DNA strand: MP88-seq-f, 5′-TCCTCTTTTACTGGCCGTAT (forward orientation), and CAP59-seq-r, 5′-GGTACTGCGCTCGAGAATGC (reverse orientation). For all of the loci, sequences were generated from both DNA strands and edited manually. Unique MLST sequence types are listed in supplemental Table 1 (http://www.genetics.org/supplemental/) and deposited in GenBank under accession nos. DQ212527–DQ212692.
Sequences were automatically aligned using Sequencher 4.1 (Gene Codes, Ann Arbor, MI); the alignment was imported into MacClade 4.05 (Maddison and Maddison 1989) and edited manually. Ambiguously aligned characters were excluded from the analysis. MLST alleles were assigned to every unique sequence type at each locus, and a 12-digit number designated the allelic profile of each isolate (supplemental Table 1 at http://www.genetics.org/supplemental/; Enright and Spratt 1999; Taylor and Fisher 2003). The genetic relatedness among the AFLP and MLST genotypes was evaluated by nonmetric multidimensional scaling (MDS) analysis using Euclidean distance measures and by principal component analysis (PCA) with the correlation matrix, using Community Analysis Package 2.4 (PISCES Conservation, Hampshire, UK) (Hebert et al. 2002).
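As an illustration of how allelic profiles of this kind can be derived, the following sketch assigns an allele number to every unique sequence at each locus and concatenates the numbers into a profile for each isolate. Numbering alleles by order of first appearance is an assumption made for the example; the isolate names and sequences are hypothetical, and the numbering does not correspond to supplemental Table 1.

```python
def assign_allelic_profiles(sequences):
    """Map each isolate to a tuple of allele numbers, one per locus.

    sequences: dict of isolate name -> list of aligned sequences, given in
    the same locus order for every isolate.
    Assumption: alleles are numbered from 1 in order of first appearance.
    """
    n_loci = len(next(iter(sequences.values())))
    profiles = {isolate: [] for isolate in sequences}
    for locus in range(n_loci):
        allele_numbers = {}  # unique sequence -> allele number at this locus
        for isolate, seqs in sequences.items():
            seq = seqs[locus]
            if seq not in allele_numbers:
                allele_numbers[seq] = len(allele_numbers) + 1
            profiles[isolate].append(allele_numbers[seq])
    return {isolate: tuple(p) for isolate, p in profiles.items()}


# Hypothetical two-locus example (the study used 12 loci).
example = {
    "iso1": ["ACGT", "TTGA"],
    "iso2": ["ACGT", "TTGC"],
    "iso3": ["ACGA", "TTGA"],
}
profiles = assign_allelic_profiles(example)
print(profiles)  # {'iso1': (1, 1), 'iso2': (1, 2), 'iso3': (2, 1)}
```

Two isolates share an MLST genotype only if their profiles are identical at every locus, which is why a single nucleotide difference at one locus is enough to separate strains that have the same AFLP pattern.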
Phylogenetic analyses were performed with PAUP version 4.0b10 (Swofford 1996). Maximum-parsimony (MP) trees for the individual loci were identified with heuristic searches based on 500 random sequence additions for each data set; gaps in the sequence alignment were collapsed to a single character and included in the maximum-parsimony analysis as a fifth character. Strict consensus trees for the 12 genes were compared for topological congruence; taxa were deemed in conflict when they showed different relationships in two genes that were supported by a bootstrap value of ≥70% (Mason-Gamer and Kellogg 1996). In addition, phylogenetic congruence among the 12 gene genealogies was tested by the partition homogeneity test with 1000 bootstrap replicates [incongruence length difference (ILD) test] implemented in PAUP (Swofford 1996). Because of the observed incongruence in the gene genealogies of several genes, combined sequence data for all 102 isolates were analyzed with the neighbor-joining (NJ) method using uncorrected (“p”) genetic distances (Nei and Kumar 2000). Sequences of the 12 MLST loci from 92 strains of serotype A that had congruent gene genealogies were aligned with those of the recently sequenced JEC21 strain of C. neoformans serotype D (Loftus et al. 2005); 10 strains that demonstrated significant incongruence among the gene genealogies were excluded from the alignment. Maximum-parsimony trees were generated for the representative strains of each unique genotype and rooted with the JEC21 strain of serotype D. Statistical support for the phylogenetic groups was assessed by bootstrap analysis using 500 replicate data sets.
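The uncorrected ("p") distance used for the neighbor-joining analysis is simply the proportion of compared sites at which two aligned sequences differ. A minimal sketch follows; skipping positions where either sequence has a gap is an assumption made for the example and is not necessarily the setting used in PAUP for this study.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of compared aligned sites at which two sequences differ.

    Assumption: positions with a gap in either sequence are skipped
    (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned fragments: 7 comparable sites, 1 difference.
print(p_distance("ACGTACGT", "ACGAAC-T"))
```

No correction for multiple substitutions at the same site is applied, which is reasonable for closely related strains in which such events are expected to be rare.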
Hierarchical analysis of molecular variance (AMOVA) was performed with the Arlequin 2.0 software package (Schneider et al. 2000). The total variance was partitioned into variance among individuals within populations, variance among populations, and variance among groups of populations. Variance components were calculated for the following comparisons: (i) the three groups and five populations based on the phylogenetic structure depicted in Figures 1 and 2, where group VNB contains two populations, subgroup VNB-A (n = 9) and subgroup VNB-B (n = 7), group VNI also contains two populations, global VNI (n = 60) and the Botswanan clade of VNI (n = 9), and group VNII contains one population, VNII (n = 10); and (ii) the two groups and four populations based on geographic origin, where group I contains all isolates from Botswana (n = 34), and group II contains isolates from three populations, Europe (n = 12), Africa (excluding Botswana, n = 16), and North America (n = 24). The significance of the estimated variance of each component was tested using a nonparametric permutation method with 1000 permutations (Excoffier et al. 1992). In addition, pairwise Wright's fixation indexes (FST) were calculated for the pairs of populations defined by phylogenetic analysis or geographic origin using Arlequin 2.0 (Hartl and Clark 1997). The significance of each FST-value was tested by a nonparametric permutation method with 1000 permutations (Schneider et al. 2000).
To evaluate the association among loci in each sample, we used the index of association (IA) and the maximum-parsimony tree length (MPTL) test (Maynard Smith et al. 1993; Burt et al. 2000). IA-values were calculated using Multilocus 1.2 software, and 1000 artificially recombined data sets were used to determine the statistical values of the tests. The MPTL test was performed with PAUP; 1000 permutations were used to determine the statistical value of the test.
AFLP genotyping confirmed the existence of two genetically isolated groups in the global population and revealed unique and diverse strains of C. neoformans serotype A in Botswana:
We used AFLP genotyping with two independent primer pairs to investigate the genotypic diversity among 1085 strains of C. neoformans serotype A isolated from different parts of the world, including 139 (13%) isolates from Botswana (Table 1). Forty-five polymorphic bands were generated, which delineated 47 unique AFLP genotypes (Table 2). Among these genotypes, 34 (72%) were found only in Botswana, and the remaining 13 occurred in other countries. Among these 34 AFLP genotypes unique to Botswana, 12 genotypes were found in single isolates, and 22 were represented by more than one isolate (Litvintseva et al. 2003). Conversely, among the remaining 13 AFLP genotypes found in other countries, only 2 genotypes did not have clonemates, namely, isolate in2632 from India and JH125.91, a rare MATa strain from Tanzania (Nielsen et al. 2003). The remaining 11 AFLP genotypes were identified in multiple isolates from different countries and continents (Table 2).
Genetic relationships among the 47 different AFLP genotypes are visualized by the nonmetric MDS plot (Figure 1A). Forty-four of these genotypes grouped into three genetically isolated subpopulations: group VNI, which includes the VNI standard strain (Meyer et al. 2003), group VNII, which includes the VNII standard strain (Meyer et al. 2003), and group VNB, which is unique to Botswana (Figure 1A). Isolates from group VNI were found in Botswana as well as globally. Members of group VNII were rare but isolated from the United States, Australia, and Uganda (Figure 1A). The Botswanan and global isolates in group VNI are related to one another, but most of the Botswanan isolates of this group have unique genotypes. Three isolates from the Botswanan population (bt125, bt131, and bt68) possess AFLP patterns that are characteristic of both group VNB and group VNI, and they may have arisen from recombination between members of the two groups. The same three groups (VNI, VNII, and VNB) were delineated by using PCA (data not shown).
Overall, the population in Botswana has the highest number of unique AFLP genotypes, and they have not been found anywhere else, including other African countries. For example, among the 55 strains of serotype A from Tanzania (n = 14), Malawi (n = 15), Uganda (n = 21), and the Democratic Republic of Congo (n = 5), only six different AFLP genotypes were identified, and none had genotypes similar to those in Botswana (Table 1).
Development of the MLST genotyping system:
The 12 gene sequences analyzed totaled 6835 nucleotides from which we identified 239 polymorphic sites. Among these multilocus gene sequences, SOD1 was the most variable with 30 polymorphic sites, and LAC1 was the least variable with 12 polymorphic sites (Table 3).
Among the 102 strains examined, 57 different MLST genotypes were identified: 32 (56.1%) occurred among strains in Botswana and 25 in the other countries (Table 2). There was good correlation between the MLST and AFLP genotypes. For example, with two exceptions from Botswana (bt57 and bt104), every strain with a unique AFLP genotype had a unique MLST genotype. Similarly, strains with the AFLP genotype A5 always had MLST genotype M5. There were multiple strains for which the MLST genotype was more discriminatory than the AFLP genotype. For example, strains with AFLP genotype A3 possessed MLST genotypes M1, M3, or M3a. AFLP genotype A4 was the most polymorphic, since it encompassed 7 related but distinct MLST genotypes, M4 and M4a-4f (Table 2).
MLST genotyping confirmed the existence of two genetically isolated groups in the global population of C. neoformans serotype A and a unique population in Botswana:
Genetic relationships among the 57 MLST genotypes have been visualized by the nonmetric MDS plot (Figure 1B). The same three groups defined by the AFLP genotypes also emerged from the MLST analyses. Isolates from group VNB are unique to Botswana; isolates from group VNI are found both in Botswana and globally; isolates from group VNII are isolated from the United States, Australia, and Uganda. Similarly, 2 of the 3 putative hybrid genotypes identified by the AFLP analysis (bt125 and bt131) also contained MLST alleles of both groups, suggesting that they may be products of recombination between the groups. However, the third hybrid genotype identified by the AFLP analysis (bt68) grouped well within group VNI and was not identified as a hybrid by the MLST genotyping (Figure 1B). As with AFLP, the same three groups were delineated by an alternative PCA ordination method (data not shown).
In addition, genetic relationships among all 102 isolates were estimated using the NJ method (Figure 2A). The three major groups identified by the MDS and PCA analyses of the AFLP and MLST genotypes are clearly recognized and well supported by bootstrap values of 90, 77, and 100% for groups VNB, VNI, and VNII, respectively. In addition, group VNB consists of three clades, VNB-A and VNB-B, which were identified previously (Litvintseva et al. 2003), and VNB-C. The VNB-A clade is dominated by the isolates with MATα mating type, the VNB-B clade contains most of the isolates with the MATa allele, and VNB-C contains isolates with both mating types (Figure 2A). In addition, the VNB-C clade is unusual because it contains only the isolates with incongruent gene genealogies of several genes (see below). Group VNI consists of five shorter clades that correspond to six distinct AFLP genotypes: A1 and A3 (67% bootstrap support), A2 (99%), A3 (91%), A4 (91%), A5 (100%), and A10 (98%). With one exception, the Botswanan isolates in group VNI formed a distinct clade within the A1/A3 clade with bootstrap support of 64%. The exceptional isolate from Botswana (bt134) is embedded within the A5 clade of group VNI (Figure 2A, Table 1), which also includes isolates from Belgium, Italy, Japan, Malawi, and the United States. Additional analyses by maximum parsimony, UPGMA, and maximum-likelihood methods generated similar phylogenetic patterns (data not shown).
To better understand the evolutionary relationships among these populations, we compared the MLST gene sequences of 92 strains of serotype A, which had congruent gene genealogies (see below) with those of the recently sequenced JEC21 strain of C. neoformans serotype D (Loftus et al. 2005). Among the variable regions of the CAP59, IGS1, and URE1 loci, there was substantial polymorphism between the two serotypes. Large insertions/deletions were excluded from the alignment, which decreased the number of different strains that could be distinguished. Overall, 6478 nucleotides were aligned, 135 parsimoniously informative polymorphic sites were compared, and 27 genotypes were differentiated. Figure 2B depicts the strict consensus of the eight most parsimonious trees rooted with the JEC21 serotype D strain. The position of the root indicates that group VNII is closest to the most recent common ancestor and that groups VNB and VNI diverged more recently.
AMOVA analysis of AFLP and MLST genotypes reveals the genetic isolation of groups VNI, VNII, and VNB and the absence of geographic structure in the global population:
Phylogenetic methods and nonhierarchical ordination analyses (MDS and PCA) of both AFLP and MLST data delineated three major groups in the global population. As an independent assessment of the validity of these groups, we performed an AMOVA on both data sets. Variance components were calculated for the following comparisons: (i) on the basis of the phylogenetic structure depicted in Figures 1 and 2, we compared three groups (VNI, VNII, and VNB) and five populations (VNI, VNI-Botswana, VNII, VNB-A, and VNB-B), and (ii) on the basis of the geographic origins of the strains, we compared two groups (Botswana and everywhere else) and four populations (Botswana, elsewhere in Africa, Europe, and North America). Results of the AMOVA are shown in Table 5. For the groups based on the phylogeny, most of the AFLP allelic variation is due to variance among the groups (58.8%); the remaining AFLP variation is attributed to variance among populations (16.9%) and variance among the individuals within the population (24.3%). For the MLST genotypes, the majority of the variation is ascribed to variance within the populations (52.4%); however, the remaining variation, which is due to variance among the groups (23.8%) and among the populations (23.8%), is highly significant (P < 0.001), indicating extensive divergence among these groups and populations.
Conversely, when groups and populations are defined on the basis of geography, the majority of the variation among both AFLP and MLST alleles is attributable to variance among the individuals within the populations (66.2 and 78.2%, respectively) as well as variance among the groups (31 and 18.9%, respectively), which indicates that the isolates from Botswana are indeed unique and different from all the other isolates that were analyzed. In contrast, variation among the populations from other sub-Saharan African countries, Europe, and North America was much smaller (2.8% AFLP and 2.9% MLST), indicating little difference among these populations.
To further evaluate genetic divergence among the populations, Wright's fixation indexes (FST) were calculated for pairs of putative populations. FST has a theoretical minimum of 0 (indicating no genetic divergence) and a theoretical maximum of 1 (indicating fixation for alternative alleles in different populations); however, the index rarely reaches the maximum of 1, and an FST-value >0.15 denotes considerable differentiation.
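The two limits described above follow directly from the textbook definition of the fixation index, shown here as a worked illustration. The symbols H_S and H_T and the two-population example are introduced only for this illustration; Arlequin estimates pairwise FST from haplotype data within the AMOVA framework, so its numerical values will generally differ from this simplified calculation.

```latex
% H_S: mean gene diversity within subpopulations
% H_T: gene diversity of the pooled population
F_{ST} = \frac{H_T - H_S}{H_T}

% Two equal-sized populations, one locus with two alleles,
% allele frequencies p_1 and p_2, mean frequency \bar{p} = (p_1 + p_2)/2:
H_S = \tfrac{1}{2}\left[\,2p_1(1 - p_1) + 2p_2(1 - p_2)\,\right],
\qquad
H_T = 2\bar{p}\,(1 - \bar{p})

% Fixation for alternative alleles (p_1 = 1, p_2 = 0):
H_S = 0, \quad H_T = 0.5, \quad F_{ST} = \frac{0.5 - 0}{0.5} = 1

% Identical allele frequencies (p_1 = p_2):
H_S = H_T, \quad F_{ST} = 0
```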
For all five phylogenetically defined populations, the pairwise FST-values were >0.25 (P < 0.001, Table 6), indicating significant genetic divergence among these populations. Conversely, when the populations were defined according to the geographical origins of the isolates, the population from Botswana was significantly divergent from the other samples from Africa, North America, and Europe (FST ≥ 0.18; P < 0.001, Table 7). However, there was no significant difference among the populations from non-Botswanan Africa, North America, and Europe. In both sets of FST calculations (Tables 6 and 7), the AFLP and MLST data yielded the same results.
Evidence of clonality in the global population and recombination in the Botswanan population:
The existence of identical MLST/AFLP genotypes on different continents (Figure 2A) suggests that the global population of C. neoformans var. grubii (serotype A) is predominantly clonal. Previous data indicated that the isolates from Botswana exhibit evidence of both clonality and recombination (Litvintseva et al. 2003). Here we used MLST as well as AFLP genotypes to further investigate the possibility of recombination in the Botswanan sample. The index of association (IA) estimates linkage disequilibrium among the loci in the population (Maynard Smith et al. 1993). IA was calculated for the AFLP and MLST genotypes in different populations and groups. Linkage equilibrium was detected among both AFLP and MLST loci in the Botswanan subgroups VNB-A, VNB-B, and VNB-C and in the Botswanan clade of VNI (Table 8). However, the null hypothesis of linkage equilibrium was rejected for the MLST loci in groups VNI, VNII, and VNB, as well as among the AFLP loci in groups VNI and VNII (Table 8, P < 0.01).
Nonrandom associations among the loci in the various groups were also evaluated by phylogenetic methods. When all of the strains were included in the analysis, 855 most parsimonious trees were generated (data not shown). The consistency indexes (CI) of those trees were low (CI = 0.43); however, the strict consensus of the trees was well resolved (data not shown). Moreover, the lengths of the most parsimonious trees (LMPT) of the entire sample of 102 isolates and of groups VNI, VNII, and VNB, as well as subgroups VNB-A and VNB-B, were significantly shorter than the lengths of the most parsimonious trees calculated for the randomized data, indicating significant linkage disequilibrium among the alleles in all of the groups (Table 8, P < 0.01). The only exception was the VNB-C subgroup, in which linkage equilibrium among the loci was also detected by the LMPT test (P = 0.52, Table 8).
We developed a phylogeny for each of the 12 genes in the MLST data set of 102 isolates, and these gene genealogies were analyzed for their congruence. Under strict clonality, the genealogies of multiple genes should be congruent (Taylor et al. 1999b; Burt et al. 2000). Nine Botswanan strains (bt125, bt148, bt33, bt88, bt84, bt65, bt131, bt31, and bt109) and the MATa strain (JH125.91) from Tanzania were inconsistently placed in the 12 gene genealogies. For example, in the CAP10 gene genealogy, bt148, bt33, bt88, bt65, bt131, bt31, and bt109 cluster with group VNB strains from Botswana, whereas bt125 clusters with group VNI (bootstrap 76%, Figure 3A). Conversely, in the SOD1 gene genealogy, bt131 and bt125 cluster with group VNB, whereas bt148, bt33, bt88, bt65, and bt109 cluster with group VNI (bootstrap 96%, Figure 3B). Moreover, in the MP88 gene genealogy, both bt131 and bt125 cluster with group VNI, whereas the remaining six strains cluster with group VNB (bootstrap 72%, Figure 3C). Overall, phylogenetic incongruence among the 12 gene genealogies was statistically significant (ILD test, P < 0.01).
Confirmation of the existence of genetically isolated groups of serotype A:
Other reports have documented the existence of genetically distinct groups among isolates of serotype A, but intrinsic problems associated with interpreting and comparing fingerprinting data have precluded clarification of the relationships among the previously described groups.
Using multilocus enzyme electrophoresis (MLEE), Brandt et al. analyzed a large clinical population of serotype A in the United States and described two distinct groups, designated ET1 and ET2 (Brandt et al. 1995, 1996).
Using AFLP genotyping, Boekhout et al. examined a global collection of serotype A and identified two distinct clusters, termed genotypes 1 and 1A (Boekhout et al. 2001).
Meyer et al. used PCR fingerprints to identify two distinct molecular types within a global population of serotype A, designated VNI and VNII (Meyer et al. 2003).
Litvintseva et al. analyzed AFLP genotypes in the Botswanan population of serotype A and identified two groups, I and II (Litvintseva et al. 2003); they also investigated a large sample of clinical and environmental isolates from the United States and discerned two distinct subgroups, also designated I and II (Litvintseva et al. 2005).
To clarify this confusing nomenclature and compare isolates used in previous studies, this investigation included the reference strains of VNI and VNII (Meyer et al. 2003) and adopted the VNI–VNII nomenclature. The MLST/AFLP cluster that included the VNI typing strain was designated group VNI, and the cluster with the VNII reference strain was named group VNII (Figures 1 and 2). The relationships among these variously labeled genetic groups are shown in Table 9.
Although the population structure of C. neoformans has been studied for many years (Brandt et al. 1995; Mitchell and Perfect 1995; Casadevall and Perfect 1998; Boekhout et al. 2001; Meyer et al. 2003; Litvintseva et al. 2005), the DNA fingerprinting methods that have been used in the past precluded comparing genotypes developed in different laboratories. Therefore, the overall understanding of the global population structure of this important pathogen has been fragmentary. For example, unusual populations of C. neoformans var. grubii serotype A were discovered in Botswana (Litvintseva et al. 2003) and Brazil (Barreto de Oliveira et al. 2004), but their relationships to isolates from other countries are obscure.
The first attempt to apply multilocus sequence typing to the analysis of the population structure of C. neoformans was performed by Xu et al. (2000), who analyzed the sequences of four genes and determined the evolutionary relationships among 34 strains, representing all four serotypes of C. neoformans and C. gattii. However, due to the limited number of strains and genes analyzed, their study did not explicate the population structure of C. neoformans var. grubii (serotype A), the most clinically relevant variety of the fungus. Here we focused on isolates of serotype A and expanded the MLST approach, which enables unambiguous genotyping of isolates, eliminates the necessity of reference strains, simplifies sharing genotypic data among other laboratories, and permits other researchers to add new strains to further refine the analyses. We compared genotyping by MLST to the more commonly used AFLP method and found that MLST had greater discriminatory power. Overall, there was good correlation between the MLST and AFLP genotypes, and combining these methods allowed excellent discrimination among even closely related strains of serotype A.
To analyze the genetic relationships among the AFLP and MLST genotypes, we used three independent methods: AMOVA, ordination methods (MDS and PCA), and phylogenetic analyses, which are commonly used to detect evidence of population subdivision and differentiation (Hartl and Clark 1997; Burnett 2003). All three methods demonstrated that the global population of C. neoformans var. grubii comprised at least three genetically distinct groups, designated VNI, VNII, and VNB. Isolates of the VNB group were unique to Botswana, whereas strains of the VNI and VNII groups were widespread (Figure 2A).
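The idea of measuring population subdivision can be sketched with a simpler statistic than the AMOVA used here. The example below computes Nei's G_ST for a single locus, which is a swapped-in illustration and not the authors' analysis; the allele labels and population samples are hypothetical. G_ST compares the mean gene diversity within populations (H_S) with the diversity of the pooled allele frequencies (H_T): values near 0 indicate little differentiation, and values near 1 indicate populations fixed for different alleles.

```python
from collections import Counter


def g_st(populations):
    """Nei's G_ST = (H_T - H_S) / H_T for one locus.

    populations: list of lists of allele labels, one list per population.
    Populations are weighted equally (an assumption of this sketch).
    """
    freqs = []
    for pop in populations:
        n = len(pop)
        freqs.append({allele: c / n for allele, c in Counter(pop).items()})
    k = len(populations)
    # Mean within-population gene diversity.
    h_s = sum(1.0 - sum(p ** 2 for p in f.values()) for f in freqs) / k
    # Gene diversity of the averaged allele frequencies.
    alleles = set().union(*freqs)
    mean_freq = {a: sum(f.get(a, 0.0) for f in freqs) / k for a in alleles}
    h_t = 1.0 - sum(p ** 2 for p in mean_freq.values())
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical samples: identical populations versus fully differentiated ones.
undifferentiated = g_st([["a", "b"], ["a", "b"]])
differentiated = g_st([["a", "a"], ["b", "b"]])
```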
With the exception of the unusual population in Botswana, the global population of serotype A was dominated by isolates of the VNI and VNII groups and exhibited strong evidence of clonality (Maynard Smith et al. 1993; Taylor et al. 1999a,b): (i) identical MLST/AFLP genotypes were isolated from quite distant locations (Figure 2A, Table 2); (ii) statistically significant linkage disequilibrium was detected among the AFLP and MLST loci in both VNI and VNII groups (Table 8); (iii) the gene genealogies of both groups were congruent; (iv) AMOVA and FST analyses detected little or no difference in the genotypic frequencies among the populations from North America, Europe, and Africa (excluding Botswana) (Tables 5–7); and (v) geographic structure was not evident in the phylogeny of these groups, which is consistent with a predominantly clonal mode of reproduction.
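The logic of a multilocus linkage disequilibrium test can be sketched as follows. This is a minimal illustration of the index of association (IA) in the spirit of Maynard Smith et al. (1993), not the software or settings used for Table 8; the allelic profiles, the number of permutations, and the random seed are assumptions. IA compares the observed variance of the number of loci at which pairs of isolates differ with the variance expected if loci were independent, and significance is judged by shuffling alleles within each locus.

```python
import random
from itertools import combinations


def index_of_association(profiles):
    """I_A = V_observed / V_expected - 1 for a list of allelic profiles."""
    pairs = list(combinations(profiles, 2))
    n_loci = len(profiles[0])
    # Number of mismatching loci for every pair of isolates.
    dists = [sum(a != b for a, b in zip(p, q)) for p, q in pairs]
    mean_d = sum(dists) / len(pairs)
    v_obs = sum((d - mean_d) ** 2 for d in dists) / len(pairs)
    # Expected variance if loci are independent: sum of h_j * (1 - h_j),
    # where h_j is the fraction of pairs that differ at locus j.
    v_exp = 0.0
    for j in range(n_loci):
        h = sum(p[j] != q[j] for p, q in pairs) / len(pairs)
        v_exp += h * (1.0 - h)
    if v_exp == 0:
        raise ValueError("no polymorphic loci")
    return v_obs / v_exp - 1.0


def permutation_test(profiles, n_perm=999, seed=0):
    """Return (observed I_A, one-sided p-value) from within-locus shuffles."""
    rng = random.Random(seed)
    observed = index_of_association(profiles)
    columns = [list(col) for col in zip(*profiles)]
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for col in columns:
            col = col[:]
            rng.shuffle(col)  # breaks associations, keeps allele frequencies
            shuffled.append(col)
        if index_of_association(list(zip(*shuffled))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical fully clonal sample: two genotypes, three loci, no mixing.
clonal = [(1, 1, 1)] * 5 + [(2, 2, 2)] * 5
ia_clonal, p_clonal = permutation_test(clonal, n_perm=199)
```

For this toy sample the three loci are perfectly associated, so IA equals the number of loci minus one (2.0) and the permutation p-value is small. In a sample where alleles are shuffled among genotypes, IA is close to zero.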
In particular, six AFLP/MLST genotypes (A1/M1, A3/M1, A3/M3, A4/M4, A5/M5, and A7/M7) accounted for 53% of the global isolates excluding Botswana and Thailand (Tables 1 and 2; Figure 2A). [The unusual population in Botswana is discussed below, and the 29 clinical isolates from Thailand (Archibald et al. 1999) contained two clonally related genotypes, A10/M10 and A10/M10a, which were also found in Uganda and Malawi (Figure 2A).] The prevalence of a few identical strains and the overall low level of polymorphism in the global population suggest an “epidemic” population structure, with recent emergence and/or dispersal of these strains around the globe (Maynard Smith et al. 1993). Similar population structures have been described for pathogenic bacteria (Maynard Smith et al. 1993; Enright and Spratt 1999), protozoa (Tibayrenc et al. 1990), and other fungi (Graser et al. 1996; Carbone et al. 1999; Couch and Kohn 2000; Morehouse et al. 2003; O'Donnell et al. 2004). In these cases, the spread of clonal microorganisms was attributed either to the migration of infected humans or to the international trade of horticultural and agricultural products (Graser et al. 1996; Carbone et al. 1999; Couch and Kohn 2000; Morehouse et al. 2003; O'Donnell et al. 2004). Wind (Hovmøller et al. 2002) and water (O'Donnell et al. 2004) may also contribute to the dispersal of clonal lineages. However, these explanations are unsatisfactory for C. neoformans because cryptococcosis is not transmitted among humans or animals and isolates of serotype A are not known to produce airborne spores capable of dispersion over long distances (Casadevall and Perfect 1998).
This global clonality and low genetic diversity may be attributable to the well-documented association of isolates of serotype A with feral pigeons, which can serve as vectors of transmission (Casadevall and Perfect 1998; Litvintseva et al. 2005). Although the pigeons do not acquire cryptococcosis, most likely because C. neoformans cannot grow at the normal avian body temperature of 41°–42°, the yeast cells survive passage through the pigeon intestinal tract and may remain viable for up to 2 years in pigeon excreta, which contain urea and other utilizable substrates (Casadevall and Perfect 1998). Feral pigeons have populated the regions surrounding the Mediterranean basin for the last 2000 years and were introduced to many other areas during the European expansion within the last 400 years (Long 1981). Therefore, in recent centuries pigeons probably facilitated the global spread of serotype A. Our investigation supports this hypothesis, as strains from both groups VNI and VNII, as well as the six most prevalent genotypes, have been isolated from pigeon excreta and other environmental sources in the United States and Europe (Table 2, Figure 2A; Boekhout et al. 2001; Litvintseva et al. 2005). Moreover, with the exception of the A2/M2 genotype, which so far has been detected only in the environment (Table 2; Litvintseva et al. 2005), most of the genotypes were equally prevalent in clinical and in environmental samples.
Compared with the remaining global population, the population in Botswana is more diverse and appears to be genetically and geographically unique. We have AFLP genotyped >1000 haploid isolates of serotype A from different parts of the world, including the sub-Saharan countries of Tanzania, Uganda, Malawi, and the Democratic Republic of the Congo (Zaire), and isolates of group VNB were found only in Botswana. Moreover, with one exception (bt134, Figure 2A), genotypes of the Botswanan isolates within group VNI formed a single, relatively well supported clade (bootstrap 64%), and they were not found outside Botswana. Both VNB and VNI isolates from Botswana were capable of mating with other global isolates of the VNI group in the laboratory (Litvintseva et al. 2003), as well as with the JH8-1 strain from the VNII group (our unpublished data). The presence of apparent hybrid genotypes in the Botswanan population (Figure 3) indicates that VNB and VNI strains can mate and recombine in the environment. Most likely, the isolation or selection of the unique population of serotype A in Botswana is attributable to geographical or ecological barriers that impede genetic exchange with other populations.
Without local environmental studies, the possibility of an unusual ecological niche for the VNB subpopulation is open to conjecture. Strains of the other groups of serotype A have been isolated from a variety of natural environments, including the excreta of several avian species (Casadevall and Perfect 1998; Litvintseva et al. 2005), decayed wood (Casadevall and Perfect 1998; Randhawa et al. 2003), trees (Meyer et al. 2003; Barreto de Oliveira et al. 2004; Gugnani et al. 2005), domestic dust (Swinne et al. 1989; Barreto de Oliveira et al. 2004), and apian habitats (Ergin et al. 2004). Few, if any, plants and animals are solely endemic to Botswana, and it is likely that VNB genotypes will be discovered in vicinal countries. In general, the sub-Saharan region supports a rich variety and abundance of plant, avian, mammalian, and insect species (Burger 2003), and any component(s) of this biota could conceivably enrich for and harbor C. neoformans. (Certainly, the highest global incidence of cryptococcosis occurs in sub-Saharan Africa.)
The Botswanan population of serotype A is characterized by a high proportion of fertile MATa isolates that have not been found in any other parts of the world (Table 8). However, the proportion of MATa strains in the VNB group was not consistent with frequent sexual recombination (Table 8). Moreover, the distribution of the MATa allele within the VNB group was uneven: the VNB-A clade was dominated by isolates with the MATα mating type (Table 8, Figure 2A), whereas VNB-B was dominated by isolates with the MATa mating type (Table 8, Figure 2A). The most likely explanation for such an unequal distribution of the mating types within the population is that the VNB-A and VNB-B clades represent two clonal lineages that originated from strains with MATα and MATa mating types, respectively. A similar population structure was detected in the related species C. gattii in Australia (Halliday and Carter 2003; Campbell et al. 2005). This population of C. gattii contains equal proportions of isolates with MATa and MATα mating types coexisting in the same local environment; however, the phylogenetic and population genetic analyses revealed strong evidence for the isolation of the MATa and MATα clonal lineages (Halliday and Carter 2003). The third clade in the VNB group of C. neoformans, VNB-C, includes isolates with both mating types, all of which have incongruent genealogies and bear direct evidence of recombination (Figures 2 and 3). Furthermore, VNB-C was the only subgroup of the global sample in which linkage equilibrium was detected with both the IA and LMPT tests (Table 8), and therefore it may represent an actively recombining component of the Botswanan population. However, the ecological or geographical rationale for applying these statistics to the VNB-C group alone requires additional investigation (Maynard Smith et al. 1993; Litvintseva et al. 2003).
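The reasoning about mating-type proportions can be made concrete with a small sketch. If sexual recombination were frequent, the two mating types would be expected in roughly equal numbers, so an observed count can be compared with a 1:1 expectation. The exact two-sided binomial test below is one way to do this and is not necessarily the test reported in Table 8; the counts used are hypothetical and are not data from this study.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null proportion p.
    """
    probs = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = probs[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in probs if q <= threshold))


# Hypothetical counts: 2 MATa isolates among 20 typed isolates.
p_skewed = binomial_two_sided_p(2, 20)
# Hypothetical counts: 10 MATa isolates among 20 typed isolates.
p_balanced = binomial_two_sided_p(10, 20)
```

A small p-value, as for the skewed toy counts, would argue against a 1:1 ratio. A balanced ratio alone does not demonstrate recombination, as the C. gattii example in the text shows.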
In addition to the VNB-C isolates, several strains in the Botswanan population (e.g., bt125, bt131, and bt148) contain hybrid genotypes that originated from recombination between VNB and VNI groups (Figures 2A and 3). Environmental sampling in Botswana and neighboring regions will be necessary to investigate the possibility and extent of genetic recombination in the population.
Overall, our results indicate that the population of serotype A in Botswana exhibits evidence of both clonality and recombination, whereas the remaining global population is predominantly clonal. Population genetics studies of other medically important fungi commonly manifest evidence of both clonal propagation and recombination (Taylor et al. 1999a; O'Donnell et al. 2004). Both modes of reproduction confer evolutionary advantages as well as costs to fitness. For example, it was recently demonstrated that interaction between the mating partners in C. neoformans serotype D incurred a 10% reduction in vegetative fitness, which may explain the overall preference for clonal propagation in the population (Xu 2005). Conversely, in Saccharomyces cerevisiae sexual reproduction significantly increased the rate of adaptation to new environments (Goddard et al. 2005), which may explain why a small amount of recombination is maintained in fungal populations, despite the apparent compromise in fitness.
This study applied two independent genotyping methods, MLST and AFLP, and an array of population genetics tests to investigate a sample of global isolates of C. neoformans var. grubii. Our investigation addressed several relevant issues regarding the diversity, population structure, and mode of reproduction of this pathogen. The results indicated the presence of three genetically isolated groups in the global population. Groups VNI and VNII are widespread, dominated by isolates with the MATα mating-type allele, and predominantly clonal, whereas isolates of group VNB are unique to Botswana, contain a significant proportion of fertile strains with the MATa mating type, exhibit greater genotypic diversity than groups VNI or VNII, and manifest evidence of clonality and recombination. To our knowledge this is the first report of an endemic, genetically distinct population of C. neoformans associated with a particular geographical locality, and this finding may have attendant clinical implications. However, it is probable that the global population of C. neoformans var. grubii (serotype A) is not restricted to these three genetic groups. We predict that other genetically distinct endemic groups will be identified in other countries and continents. The MLST genotyping system developed here can be used by multiple laboratories for a global surveillance of strains to identify additional clinical and environmental isolates and assign them to characterized or new populations of serotype A.
Cultures were generously provided by Wiley A. Schell (Medical Mycological Research Laboratory, Duke University Medical Center), L. Barth Reller (Department of Pathology, Duke University Medical Center), Wieland Meyer (Center for Infectious Diseases and Microbiology, Westmead Hospital, Sydney, Australia), and Teun Boekhout (Centraalbureau voor Schimmelcultures, Utrecht, The Netherlands). We thank Timothy Y. James and Robert E. Marra for helpful discussions, and we are grateful to Lisa Bukovnik for DNA sequencing. This investigation was supported by Public Health Service grants AI25783 and AI44975 from the National Institutes of Health.
Communicating editor: P. J. Pukkila
- Received June 8, 2005.
- Accepted November 2, 2005.
- Copyright © 2006 by the Genetics Society of America