Cis- and Trans-chromosomal Interactions Define Pericentric Boundaries in the Absence of Conventional Heterochromatin

The molecular cues for CENPA positioning in epigenetically regulated centromeres are elusive. Candida albicans has small regional non-repetitive centromeres that do not harbor conventional heterochromatin. Deletion of native centromere leads to activation of a neocentromere...

The diploid budding yeast Candida albicans harbors unique CENPA-rich 3- to 5-kb regions that form the centromere (CEN) core on each of its eight chromosomes. The epigenetic nature of these CENs does not permit the stabilization of a functional kinetochore on an exogenously introduced CEN plasmid. The flexible nature of such centromeric chromatin is exemplified by the reversible silencing of a transgene upon its integration into the CENPA-bound region. The lack of a conventional heterochromatin machinery and the absence of defined boundaries of CENPA chromatin make the process of CEN specification in this organism elusive. Additionally, upon native CEN deletion, C. albicans can efficiently activate neocentromeres proximal to the native CEN locus, hinting at the importance of CEN-proximal regions. In this study, we examine this CEN-proximity effect and identify factors for CEN specification in C. albicans. We exploit a counterselection assay to isolate cells that can silence a transgene when integrated into the CEN-flanking regions. We show that the frequency of reversible silencing of the transgene decreases from the central core of CEN7 to its peripheral regions. Using publicly available C. albicans high-throughput chromosome conformation capture data, we identify a 25-kb region centering on the CENPA-bound core that acts as CEN-flanking compact chromatin (CFCC). Cis- and trans-chromosomal interactions associated with the CFCC spatially segregate it from bulk chromatin. We further show that neocentromere activation on chromosome 7 occurs within this specified region. Hence, this study identifies a specialized CEN-proximal domain that specifies and restricts the centromeric activity to a unique region.

entirely depend on conserved DNA elements for kinetochore binding, and therefore have been proposed as excellent models to study epigenetically regulated metazoan CENs.
The centromeric chromatin is different from bulk chromatin and is epigenetically specified in most regional CENs (Sullivan and Karpen 2004). While the core CEN and adjacent pericentric regions are poorly transcribed, pervasive levels of transcription at fungal CENs are known to influence CEN activity (Choi et al. 2011; Ohkuni and Kitagawa 2011; Ling and Yuen 2019). Even though CENPA localization is limited to a small region of a chromosome, the functional region required for chromosome segregation is much larger and involves pericentric regions. This is largely exemplified in S. pombe minichromosomes, which are stabilized in the presence of an outer pericentric repeat in addition to the central core (Baum et al. 1994). Centric and pericentric chromatin differ notably in fungal systems. In S. pombe, H3 nucleosomes are nearly absent from the CENPA-rich central core (Thakur et al. 2015). RNA interference (RNAi)-directed heterochromatin assembly at the outer repeat mediates targeting of CENPA by Clr4-mediated dimethylation at H3K9 (Volpe et al. 2002; Folco et al. 2008; Allshire and Ekwall 2015). These features are shared in C. neoformans, where the extensively methylated CEN DNA is enriched with H3K9me2 and maintained with the help of the RNAi machinery (Yadav et al. 2018). In contrast, the CENPA-bound regions in N. crassa have heterochromatic properties containing H3K9me3 nucleosomes and 5-methylcytosine (5mC) (Smith et al. 2011). Their pericentric regions are 5- to 20-kb long and enriched in H3K4me3 and 5mC (Smith et al. 2011; Friedman and Freitag 2017). All known variants of fungal CEN chromatin in regional CENs are more similar to heterochromatin than euchromatin, largely owing to the presence of silencing histone marks and components of RNAi machinery (Friedman and Freitag 2017). Intriguingly, the pericentric boundaries are often not well defined in organisms having short regional CENs, as features like pericentric repeats, associated histone marks, or RNAi machinery are either lacking or cryptic.
One of the hallmarks of the epigenetic control at CENs is the reversible silencing of a transgene positioned within the CENPA-binding region. Transgenes inserted at the central core and outer repeats of the S. pombe CENs undergo transcriptional silencing that is clonally inherited. Compared to the outer repeats that are highly heterochromatinized, transgene silencing within the CENPA-bound central core is relatively unstable, resulting in variegated expression (Allshire et al. 1994; Karpen and Allshire 1997; Allshire and Ekwall 2015). Hence, transgene silencing is an effective screen employed to study centric and pericentric heterochromatin properties. The epigenetic regulation of CENs has also been demonstrated by neocentromere formation. First observed in humans to rescue acentric fragments (Voullaire et al. 1993), neocentromeres are activated at ectopic loci when the native CEN is inactivated. Therefore, neocentromeres are an aid to study de novo CEN formation mechanisms. Neocentromeres are formed at CEN-proximal loci in Drosophila (Maggert and Karpen 2001) and chicken cells (Shang et al. 2013). The assembly of ectopic CENPA as a "CENPA-rich zone" or "CENPA cloud" surrounding the endogenous CEN and the proximity of neocentromere hotspots to the native CEN in these organisms indicate that CENPA is peppered on CEN-adjacent loci and can get rapidly incorporated into the CEN upon eviction (Fukagawa and Earnshaw 2014). Also, the site of neocentromere activation is found to be incompatible with transcription (Scott and Sullivan 2014). These epigenetic mechanisms ensure stable propagation of active CENs across generations.
CENs cluster next to the spindle pole bodies throughout the cell cycle in budding yeast species including Saccharomyces cerevisiae (Jin et al. 2000; Haase et al. 2013) and several species of Candida (Sanyal and Carbon 2002; Padmanabhan et al. 2008; Burrack et al. 2016; Chatterjee et al. 2016). Clustered centromeric regions were shown to be in physical proximity by a genome-wide chromosomal interaction study in S. cerevisiae, giving rise to physical interactions between different CENs (Duan et al. 2010). High-throughput chromosome conformation capture (Hi-C) and related studies in S. cerevisiae have revealed chromosome substructures in which domains with similar contact probabilities have higher interactions than the ones that interact due to random diffusion (Tjong et al. 2012; Eser et al. 2017). Recently, chromosome conformation capture-on-chip (4C) analysis in vertebrates revealed that clustered CENs are present in a compact chromatin environment (Nishimura et al. 2018). The neocentromeres in these cells were commonly associated with specific heterochromatin-rich regions in the three-dimensional (3D) nuclear space. Hence, the 3D architecture of the chromosome, its scaffolds, and its associated chromatin within the nucleus provide the spatial cues required to specify CEN location.
Nonrepetitive CENs serve as excellent models to characterize centromeric chromatin. In C. albicans, every CEN harbors a unique CEN DNA sequence (Sanyal et al. 2004), none of which can stabilize a CEN plasmid (Sanyal et al. 2004; Baum et al. 2006). The activation of neocentromeres at hotspots proximal to the native CEN location (Thakur and Sanyal 2013) and the presence of CEN-proximal replication origins (Koren et al. 2010; Mitra et al. 2014) indicate the prominent role of CEN-proximal or pericentric regions for CEN function. Moreover, there is no functional evidence for the existence of a pericentric boundary element to restrict CENPA in C. albicans, as seen in the case of CENs in S. pombe (Karpen and Allshire 1997; Allshire and Ekwall 2015). Unlike S. pombe, the genome of C. albicans does not encode an HP1/Swi6-like protein, an H3K9 methyltransferase like Clr4, or components of a fully functional RNAi machinery (Freire-Benéitez et al. 2016). There is no evidence of DNA methylation at the CEN DNA in C. albicans (Baum et al. 2006; Mishra et al. 2011). The reversible silencing of the expression of a marker gene, URA3, captured by 5-Fluoroorotic acid (5-FOA) counterselection, has been observed upon its integration into the CENPA-binding region of the CEN in C. albicans, giving it a transcriptionally flexible status (Thakur and Sanyal 2013). These features make it difficult to determine the exact molecular cues for positioning CENPA to form centromeric chromatin in this organism.
In the present study, we attempt to identify factors that determine CEN formation within a confined territory of the 3D nuclear space. We do so by combining a transgene silencing assay on chromosome 7 (Chr7) with analysis of published Hi-C data to decipher pericentromeric chromatin boundaries in C. albicans and map CEN-flanking compact chromatin (CFCC). This CFCC acts as the pericentromere, spatially segregating CENs from bulk chromatin and favoring neocentromere formation.

Construction of URA3 integration strains
To construct the individual URA3 integration cassettes, long primer pairs were designed (Supplemental Material, Table S5). Briefly, 70-bp regions both upstream and downstream of the site of integration were incorporated in the primers as overhangs. For all the integrations (except the L3 and R2 loci), the 1.4-kb URA3 gene was amplified from the plasmid pUC19-URA3 (Mitra et al. 2014) using the aforementioned primers. The integration corresponding to L3 was constructed using an MluI-digested plasmid pFA-URA3-I-SceI-TS-Orf 19.6524/25. The integration corresponding to R2 was constructed using an MluI-digested plasmid pFA-URA3-I-SceI-TS-Orf 19.6520/22. The PCR and digestion products were used to independently transform J200 (Thakur and Sanyal 2013). The transformants were selected on complete medium (CM) lacking uridine (CM-Uri) and confirmed by PCR. For the CEN7 deletion experiments, integration cassettes corresponding to L4 and R4 loci were transformed in 8675 (Joglekar et al. 2008) and confirmed by PCR. Three independent transformants of each integration type were used for the assays. All the distances of individual URA3 insertions are indicated with respect to the midpoint of CEN7 which has been taken as Ca21Chr7_427262.

Construction of the CEN7-deleted strains (CaCEN7)
To delete one copy of CEN7, a cassette was constructed as follows. A 1.4-kb fragment containing a 66-bp upstream sequence (Ca21Chr7 424413-424472) and a 70-bp downstream sequence (Ca21Chr7 428994-429053) of CEN7 and a marker gene (CaHIS1) were amplified from pBS-HIS using specific primers (Table S5). The PCR product was used to transform the 5-FOA-resistant isolates from the strains LSK443 and LSK456 and their corresponding 5-FOA-sensitive isolates. The transformants were selected on CM lacking histidine (CM-His) and screened by PCR. Transformants in the cis-orientation, where URA3 and HIS1 are present on the same homolog, were screened by Southern hybridization (Southern 1975).

Media and growth conditions
All strains of C. albicans where URA3 was integrated into Chr7 and Chr5 were propagated in YPD (1% yeast extract, 2% peptone, 2% dextrose) with uridine (YPDU), unless otherwise specified. All transformation experiments were done in YPDU using standard methods (Mitra et al. 2014). The auxotrophs were selected on appropriate selection media, as mentioned previously. For the 5-FOA sensitivity assays, CM with 2% agar was supplemented with 1 mg/ml 5-FOA (catalogue no. F5013; Sigma). Strains with neocentromeres were grown in YPDU.

Silencing assay
Each of the URA3 integrants was grown in YPDU overnight. The cells were spun down, washed, and 1 million cells from three independent transformants of each kind of integration were plated on CM+5-FOA. The plates were incubated at 30° for 72 hr. A total of 100 colonies from each plate were patched on CM-Uri and YPDU. These were simultaneously patched on CM-His and CM-Arg plates to detect events such as loss of the marker gene URA3 or whole chromosome loss. The colonies showing growth on CM-Uri were counted and the percentage of reversible silencing was determined. For the CEN7::URA3 strains, we plated 150 colonies on 5-FOA and analyzed 70 of them.
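The scoring step of this assay reduces to a simple ratio, sketched below in Python. The function name and the colony counts in the usage note are hypothetical illustrations, not values from this study; the only assumption taken from the text is that the percentage is the number of 5-FOA-resistant colonies that grow on CM-Uri divided by the number of 5-FOA-resistant colonies analyzed.

```python
def percent_reversible_silencing(grew_on_cm_uri: int, foa_resistant_analyzed: int) -> float:
    """Percentage of analyzed 5-FOA-resistant colonies that also grow on CM-Uri."""
    if foa_resistant_analyzed <= 0:
        raise ValueError("at least one 5-FOA-resistant colony must be analyzed")
    if not 0 <= grew_on_cm_uri <= foa_resistant_analyzed:
        raise ValueError("colonies growing on CM-Uri cannot exceed colonies analyzed")
    return 100.0 * grew_on_cm_uri / foa_resistant_analyzed


def mean_percent(per_transformant_counts):
    """Average the percentage over independent transformants.

    per_transformant_counts: list of (grew_on_cm_uri, analyzed) tuples.
    """
    values = [percent_reversible_silencing(g, n) for g, n in per_transformant_counts]
    return sum(values) / len(values)
```

For example, with made-up counts of 40, 50, and 60 CM-Uri-positive colonies out of 100 patched colonies for three transformants, `mean_percent([(40, 100), (50, 100), (60, 100)])` gives 50.0.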

Chromatin immunoprecipitation (ChIP) and quantitative PCR (qPCR) analysis
A single colony of C. albicans was inoculated into 50 ml YPDU and grown until log phase. Crosslinking was done for 15 min (for CENPA) or 30 min (for Mtw1) using formaldehyde to a final concentration of 1% and cells were quenched using 0.135 mM glycine for 5 min at room temperature. Quenched cells were incubated in a reducing environment in the presence of 9.5 ml distilled water and 0.5 ml β-mercaptoethanol (catalogue no. MB041; HiMedia). The rest of the protocol from Yadav et al. (2018) was then followed. Finally, the DNA pellet was resuspended in 20 µl MilliQ water. All three samples (I, +, −) were subjected to PCR reactions. The input and immunoprecipitation (IP) DNA were diluted appropriately and quantitative PCR (qPCR) reactions were set up using primers listed in Table S5. CENPA/Mtw1 enrichment was determined by the percentage input method using the formula: 100 × 2^(adjusted Ct input − adjusted Ct IP). Here, the adjusted Ct is the Ct value of the input or IP minus log2 of the corresponding dilution factor. Three technical replicates were taken for qPCR analysis and SEM was calculated. To determine statistical significance of test regions with the noncentromeric control LEU2, two-way ANOVA was used. Multiple comparisons were performed using Bonferroni post-tests with the following P-values: *** P < 0.001, ** P < 0.01, NS P > 0.05. Final values for ChIP-qPCR were plotted using GraphPad Prism 5.0.
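The percentage input calculation can be written out as a short Python sketch. The function names are ours and the Ct values in the usage note are hypothetical; the arithmetic follows the formula stated above, with the adjusted Ct taken as Ct minus log2 of the dilution factor.

```python
import math
import statistics


def adjusted_ct(ct: float, dilution_factor: float) -> float:
    """Ct corrected for dilution: Ct - log2(dilution factor)."""
    return ct - math.log2(dilution_factor)


def percent_input(ct_input: float, ct_ip: float,
                  input_dilution: float = 1.0, ip_dilution: float = 1.0) -> float:
    """Enrichment as 100 * 2^(adjusted Ct input - adjusted Ct IP)."""
    delta = adjusted_ct(ct_input, input_dilution) - adjusted_ct(ct_ip, ip_dilution)
    return 100.0 * 2.0 ** delta


def sem(values) -> float:
    """Standard error of the mean over technical replicates (sample SD / sqrt(n))."""
    return statistics.stdev(values) / math.sqrt(len(values))
```

As a hypothetical example, an input diluted 100-fold with Ct 25 and an undiluted IP with Ct 27 give `percent_input(25, 27, input_dilution=100)` of 0.25% input.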

ChIP-sequencing analysis
For the CENPA ChIP-sequencing (ChIP-seq), immunoprecipitated DNA and the corresponding DNA from whole-cell extracts from strains LSK450 and LSK465 were quantified using Qubit before proceeding for library preparation. An amount of 5 ng ChIP or total DNA was used to prepare sequencing libraries using NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs, Beverly, MA). The library quantity and quality were checked using Qubit HS DNA Assay Kits (Thermo Fisher Scientific, Waltham, MA) and Bioanalyzer High Sensitivity DNA Analysis kits (Agilent Technologies, Santa Clara, CA), respectively. The libraries that passed quality control were sequenced on Illumina HiSeq 2500 (Illumina, San Diego, CA). The HiSeq Rapid Cluster Kit and SBS Kit v2 were used to generate 50-bp paired-end reads. The reads were independently aligned onto the C. albicans SC5314 reference genome (v. 21) and a genome with an altered version of Chr7 using the bowtie2 (v. 2.3.2) aligner. The resulting BAM files were processed further using MACS2 for identification of peaks (Zhang et al. 2008). These peaks were annotated with the C. albicans SC5314 reference and altered assembly annotation files. Visualization of the aligned reads (BAM files) on the reference genome was performed using Integrative Genome Viewer (IGV; https://software.broadinstitute.org/software/igv/).

Hi-C analysis
For the generation of the Hi-C contact probability matrix, C. albicans Hi-C data were analyzed using the hiclib package (http://mirnylab.bitbucket.org/hiclib/) (Imakaev et al. 2012). First, 2 × 80-bp paired-end reads were iteratively aligned to the C. albicans genome assembly 21 (Ca21) using Bowtie2 (Langmead and Salzberg 2012) with the --very-sensitive option. The alignment started from the first 20 bases from the 5′ end, with an increment of 5 bases in subsequent iterations. Aligned read pairs were then assigned to Sau3AI restriction fragments. The fragment filtering steps subsequently removed self-circles, dangling ends, and PCR duplicates, and all the unique valid pairs were used for the generation of the interaction matrix (with a bin size of 2 kb unless otherwise specified). Bin filtering steps included removal of bins with <50% sequence information in the genome assembly and removal of the 1% of bins with the lowest sums. Diagonal bins were excluded from further downstream analysis. Iterative bin bias correction was then performed on the genome-wide interaction matrix. The contact probability matrix (C_ij) was generated from the normalized interaction matrix (I_ij) where each value C_ij (representing probability of contacts between bin i and bin j) was calculated by the formula provided below: where n represents the total number of bins in the matrix.
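Two of the steps described above lend themselves to a small illustration: the schedule of read-truncation lengths used in iterative alignment, and the removal of the lowest-coverage bins. The Python sketch below is not hiclib code; the function names are hypothetical, and the bin filter simply ranks bins by row sum, which is our assumption about how "lowest sums" is evaluated.

```python
def truncation_lengths(read_len: int = 80, start: int = 20, step: int = 5):
    """Read lengths tried in successive alignment iterations (20, 25, ..., read_len)."""
    lengths = list(range(start, read_len, step))
    lengths.append(read_len)  # always finish with the full-length read
    return lengths


def lowest_sum_bins(matrix, fraction: float = 0.01):
    """Indices of the given fraction of bins with the lowest row sums."""
    sums = [sum(row) for row in matrix]
    n_drop = int(len(matrix) * fraction)
    order = sorted(range(len(matrix)), key=lambda i: sums[i])
    return set(order[:n_drop])


def zero_diagonal(matrix):
    """Copy of the matrix with diagonal bins set to 0 (excluded from analysis)."""
    return [[0 if i == j else v for j, v in enumerate(row)]
            for i, row in enumerate(matrix)]
```

With 80-bp reads this yields 13 truncation lengths, and for a 100-bin matrix a fraction of 0.01 flags exactly one bin for removal.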
For the plotting of trans-interactions, the distribution of all trans-contact probabilities (excluding 0 values) was plotted from interchromosomal regions of the genome-wide matrix. The mean trans-contact probability between all eight CENs was also calculated. Mann-Whitney U tests were then used to statistically compare interactions between CENs and trans-interactions of bulk chromatin.
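The comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the matrix is taken to be a symmetric list of lists, `bin_chrom` is a hypothetical list giving the chromosome of each bin, and only the U statistic is computed (a library routine such as `scipy.stats.mannwhitneyu` would normally be used to obtain the P-value).

```python
def nonzero_trans_contacts(matrix, bin_chrom):
    """Nonzero contact probabilities between bins on different chromosomes.

    Only the upper triangle is read, so each bin pair is counted once.
    """
    values = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if bin_chrom[i] != bin_chrom[j] and matrix[i][j] != 0:
                values.append(matrix[i][j])
    return values


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x versus sample y.

    Counts pairs with x > y, adding 0.5 for each tie (brute force, O(len(x) * len(y))).
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

Here `x` would hold the CEN-CEN trans-contact probabilities and `y` the trans-contact probabilities of bulk chromatin; a U close to `len(x) * len(y)` indicates that nearly every CEN-CEN value exceeds the bulk values.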
For the plotting of distance-dependent contact probability curves, all cis-contact probabilities (excluding zero values) were taken from the genome-wide matrix as well as pericentric or control regions. The pericentric region was defined as bins containing a CEN plus 10 kb each of upstream and downstream sequence. For every chromosome, a noncentromeric control region was chosen, which had the same size as the pericentric region and was equidistant from the CEN. The mean contact probability between pairs of loci separated by a given genomic distance was calculated for each region (pericentric, control, and genome wide). Mann-Whitney U tests were then performed to estimate if the distributions of contact probabilities at a given distance were significantly different between regions.
Similarly, the distribution of distance-dependent contact probabilities at each distance in the pericentric region was generated and Mann-Whitney U tests were conducted to estimate if the distributions of contact probabilities between two adjacent genomic distances were significantly different.
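A distance-dependent contact probability curve of the kind described above can be computed as in the following Python sketch. The function name and arguments are hypothetical; the region is given as a half-open range of bin indices on one chromosome, zero values are skipped as in the text, and the 2-kb bin size is taken from the matrix resolution stated earlier.

```python
from collections import defaultdict


def distance_decay(matrix, start_bin: int, end_bin: int, bin_size_kb: int = 2):
    """Mean nonzero cis-contact probability at each genomic distance.

    matrix: chromosome-wide contact probability matrix (list of lists).
    start_bin, end_bin: half-open bin range defining the region of interest.
    Returns {distance_kb: mean contact probability}, ordered by distance.
    """
    by_distance = defaultdict(list)
    for i in range(start_bin, end_bin):
        for j in range(i + 1, end_bin):
            value = matrix[i][j]
            if value != 0:
                by_distance[(j - i) * bin_size_kb].append(value)
    return {d: sum(v) / len(v) for d, v in sorted(by_distance.items())}
```

Running the function separately on the pericentric bins, the control bins, and the whole chromosome gives the three curves that are compared; the per-distance value lists (before averaging) are what the Mann-Whitney U tests would operate on.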
For the plotting of the 3C profile, the contact probabilities between CEN and cis-regions were plotted using a single row containing the anchor (centromeric) bin from the chromosome-wide matrix.
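Extracting such a profile amounts to reading one row of the matrix, as in this minimal Python sketch. The helper names are hypothetical, bins are assumed to start at coordinate 0 with a fixed 2-kb size, and the anchor bin itself is skipped because diagonal bins were excluded from the analysis.

```python
def bin_index(position_bp: int, bin_size_bp: int = 2000) -> int:
    """Index of the fixed-size bin containing a genomic coordinate."""
    return position_bp // bin_size_bp


def virtual_3c_profile(chrom_matrix, anchor_bin: int, bin_size_bp: int = 2000):
    """(bin start coordinate, contact probability) pairs for the anchor row."""
    row = chrom_matrix[anchor_bin]
    return [(j * bin_size_bp, value)
            for j, value in enumerate(row) if j != anchor_bin]
```

For instance, the CEN7 midpoint used in this study (Ca21Chr7_427262) falls in bin 213 at 2-kb resolution, so `virtual_3c_profile(chr7_matrix, bin_index(427262))` would return the CEN7-anchored profile for a hypothetical `chr7_matrix`.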

Data availability
The sequencing data used in the study have been submitted to the National Center for Biotechnology Information (NCBI) under the BioProject accession number PRJNA477284. The Hi-C data were downloaded from the NCBI BioProject (accession: PRJNA308106). Strains and plasmids are available on request. Supplemental material available at FigShare: https://doi.org/10.25386/genetics.8197388.

Results
Frequency of reversible silencing of a transgene decreases from the CENPA-bound core CEN region to its periphery in C. albicans
The transgene URA3 gets reversibly silenced when it is integrated into the CENPA-bound central core of C. albicans (Thakur and Sanyal 2013). 5-FOA is a toxin that kills cells that express URA3 (Boeke et al. 1987). The cells that reversibly silence URA3 grow both on CM-Uri and CM+5-FOA and switch between an epigenetically bistable on and off expression state and are termed as 5-FOA resistant. The Ura-positive cells that do not show reversible silencing remain 5-FOA sensitive. We used this principle to examine the expression profile of URA3 at CEN-proximal regions in C. albicans.
We inserted the 1.4-kb URA3 gene into each of the 10 different CEN7-proximal loci independently in the strain J200, which has differentially marked homologs of Chr7 (Sanyal et al. 2004) (Figure 1A and Table S1). We also integrated URA3 into a CEN7-distal (far-CEN7) locus and a CEN5-proximal locus. We plated 1 million cells of each URA3 integrant type on CM+5-FOA plates and obtained 100 5-FOA-resistant colonies (see Materials and Methods). These were replica plated on CM-Uri (Figure 1B). We specifically scored for colonies that grew both on CM-Uri and CM+5-FOA plates as those cells indicated reversible silencing of URA3 (Table 1). We also monitored the frequency of chromosome loss/gene conversion events of URA3 in these strains by examining the simultaneous loss of two markers on Chr7, ARG4 and URA3 or HIS1 and URA3 (Figure 1B and Table S2). This was done to ensure that the 5-FOA-resistant colonies obtained from the assay retained the URA3 gene along with HIS1 and ARG4. The chromosome loss assay using two unlinked markers indicated that all the URA3 integrants exhibited loss rates comparable to the wild-type frequencies. With the exception of the L5 and far-CEN7 URA3 integrants, we could obtain 5-FOA-resistant colonies in all other integrants, suggesting silencing of URA3 at the corresponding locus.
We observed a steep decline in the percentage of colonies showing reversible silencing of URA3 (the ratio of the number of 5-FOA-resistant colonies that grow on CM-Uri to the total number of 5-FOA-resistant colonies analyzed) from the CEN7 core to its periphery (Figure 1C). It must be noted here that, in every step of the plating assays, we proceeded with only those 5-FOA-resistant colonies that grew well on CM-Uri and CM+5-FOA. This suggests that URA3 was not mutated or inactivated at any point, which otherwise would have yielded a lawn of colonies on CM+5-FOA and a complete absence of growth on CM-Uri. These sets of experiments strongly indicate that transcriptional silencing of a reporter gene is observed outside the central core of CEN7 but is confined to a defined region. In addition, the frequency of reversible silencing of the transgene is a function of the distance from the CEN, as fewer reversibly silenced colonies were obtained from the peripheral insertions.

CENPA-bound core CEN and the pericentromeres encompass a 25-kb CFCC domain
CENs in C. albicans are clustered to form a CENPA-rich zone (Thakur and Sanyal 2013). Consistent with this, the interchromosomal Hi-C heatmaps of wild-type C. albicans are interspersed with conspicuous punctate areas which signify physical contacts between CENs (Burrack et al. 2016). Using the same data set, we investigated the distribution of all nonzero contact probabilities from the interchromosomal (trans) contact matrix. We found that the mean interactions between CENs of different chromosomes were significantly higher relative to the mean trans-interactions of bulk chromatin (P = 3.47 × 10^−90; Mann-Whitney U test) (Figure 2A). To examine the intrachromosomal (cis) interactions in pericentric regions, we generated distance-dependent contact probability curves by averaging pairwise interaction data at different linear genomic distances. At any given distance, the mean cis-contact probability of loci in pericentric regions was significantly higher than the mean cis-interaction of bulk chromatin (Figure 2B). Consistently, noncentromeric control regions showed similar distance-dependent contact probabilities as all cis-interactions but they were significantly lower than those in the pericentric regions (except at a 24-kb distance, possibly due to small sample size). This observation was corroborated by analysis of subtraction matrices (pericentric region − a randomly selected control region on Chr7) (Figure S1). Hence, the core CENs of C. albicans strongly interact with a CEN-proximal region in cis, forming a compact chromatin environment that is distinct from bulk chromatin.
We then sought to find the boundaries of the pericentric regions within which loci interact with each other at high frequencies. It is well known that the contact probability shows an inverse relationship with an increase in the genomic distance (Dekker et al. 2002). When we plotted the distribution of contact probabilities in pericentric regions at different genomic distances, we found that this distribution is significantly different at each increment until it reaches a 10-kb distance from the core CEN region (Figure 2C). From this point, the distribution of contact probabilities remains at a low level (mostly without any statistically significant difference) irrespective of any increase in the genomic distance (Figure 2C). Thus, we homed in on a pericentric region centering on the core CEN that exhibits high cis-contacts with any locus within 10 kb of sequence, flanking it upstream or downstream. In the case of Chr7, the 3C profile of CEN7 indicated a 25-kb region centering on CEN7 that has the compact chromatin feature (Figure 2D). A similar observation was noted for CEN2 (Figure 2E). The clear trend of an exponential decay in reversible silencing of URA3 correlated with the decay of contact probabilities between CEN7 and its neighboring cis-region. Hence, using a combination of the transgene silencing assay and Hi-C interaction analysis, we define a 25-kb region centering on the CENPA-bound core CEN region as the pericentromeres displaying the CFCC domain in C. albicans.

Neocentromeres in C. albicans are activated within the pericentric boundaries
Pericentromeres of Chr7 and Chr5 house genomic loci such as neocentromere hotspots and DNA replication origins (Thakur and Sanyal 2013; Mitra et al. 2014). In C. albicans, neocentromeres are shown to be activated primarily at CEN-proximal loci, irrespective of the length of the CEN DNA deleted (Thakur and Sanyal 2013). There are four neocentromere hotspots mapped on Chr7 so far: nCEN7-I, nCEN7-II, nCEN7-III, and nCEN7-IV (Thakur and Sanyal 2013). These regions do not share any DNA sequence similarity, leaving proximity to native CEN as the only known neocentromere determinant so far.
We asked why these hotspots are the favored regions for neocentromere activation. To address this, we repeated the transgene silencing assay by integrating URA3 at the R4 and L4 loci in the strain 8675 (CSE4-GFP-CSE4/CSE4) (Joglekar et al. 2008) and obtained 5-FOA-resistant colonies as mentioned previously. In the same strains, we deleted the native CEN7 sequence (4.5-kb CENPA-rich region) using HIS1 to obtain LSK443 (L4/L4::URA3) and LSK456 (R4/R4::URA3) (Table S2). We designed a Southern hybridization (Southern 1975) strategy to screen for colonies where URA3 and HIS1 were present on the same homolog of Chr7 (Figure S2). We wanted to examine the site of kinetochore assembly in these CEN7-deletion transformants. ChIP-qPCR analysis using anti-GFP antibodies showed that CENPA assembled at URA3 and neighboring regions in the CEN7-deleted strains (Figure 3, A and B, top; Figure S3, A and B, top). We also examined the localization of an independent kinetochore protein, the Mis12 homolog in C. albicans, Mtw1 (Roy et al. 2011). ChIP-qPCR analysis using anti-Protein A antibodies revealed an overlapping binding pattern of Mtw1 with CENPA (Figure 3B, bottom; Figure S3B, bottom). We performed CENPA ChIP-qPCR analysis in the corresponding 5-FOA-sensitive derivatives (which expressed URA3) and found that neocentromeres were activated at the hotspot nCEN7-II, instead of at the URA3 locus (Figure S3D). CENPA ChIP-seq in the strains LSK450 and LSK465 revealed two new hotspots, URA3nCEN7-I (Figure S3C) and URA3nCEN7-II (Figure 3C), respectively, on Chr7 (Table S3). Hence, we identified two new neocentromeres on Chr7 in this organism when a region is kept transcriptionally less permissive. These experiments strongly suggest that strains with the same genotype but varying expression levels of a transgene at pericentromeres can activate neocentromeres at different loci. This activation is restricted to the 25-kb pericentric CFCC that we identified in this study. Within this CFCC, a transcription desert site can be a potential neocentromere.

Discussion
In this study, we map a 25-kb region spanning the CENPA-bound CEN core and its flanking regions on Chr7 in C. albicans, where the phenomenon of reversible silencing of a transgene, URA3, could be observed. We also demonstrate that this region forms a CFCC by stronger cis-interactions with neighboring sequences. In addition, trans-interactions among centromeric sequences also help cluster the CENs to provide a 3D nuclear space that we refer to as the CENPA-rich zone, possibly to facilitate epigenetic inheritance of CENPA chromatin. This is further evidenced by following the patterns of neocentromere activation on Chr7 in this study. Using CENPA ChIP-seq analysis, we previously proposed the presence of the CENPA-rich zone around the clustered CENs (Thakur and Sanyal 2013), which we revisited in this study. We had proposed that the local concentration of CENPA at and around the CENs is higher than in the rest of the genome. We previously demonstrated that preexisting CENPA molecules are required for epigenetic inheritance of CEN function in C. albicans (Baum et al. 2006). In this study, we hypothesize that the minuscule levels of CENPA at the pericentromeres, which may be undetectable by less-sensitive methods like ChIP-seq, are important to activate a neocentromere at CEN-proximal regions in the absence of the native CEN. A previous attempt to characterize the pericentromeres in C. albicans claimed that a pericentric insertion of URA3 imposes a weak transcriptional repression (Freire-Benéitez et al. 2016). These assays were based on growth phenotypes and qRT-PCR analysis, making them less sensitive. The silencing assay that we have employed in this study is a way to score for such rare events when the CEN can relocate to an ectopic locus. It enabled us to isolate and amplify a clonally inherited population of cells that can switch the transcriptional status of a transgene. It has previously been demonstrated that when URA3 is integrated into the CENPA-bound central core of CEN7, a fivefold decrease in URA3 transcript levels was observed on growth in 5-FOA as compared to CM-Uri (Thakur and Sanyal 2013). Therefore, we can correlate the URA3 transcript levels to CENPA binding at URA3 in the pericentric insertions, although it is relatively unknown whether CENPA can silence transcription or whether a transcriptionally inert region stabilizes CENPA at the CENs in C. albicans. The reversible silencing seen at the S. pombe central core is because of its flexible CENPA domain (Allshire et al. 1994; Karpen and Allshire 1997; Allshire and Ekwall 2015). Unlike S. pombe, in our study, we define a pericentromere that is more transcriptionally permissive than the CENPA-rich core region in C. albicans.

Figure 2 (caption fragment): :1926000-1928000; black bar) showing contact probabilities (red dots) between CEN2 and its neighboring bins on Chr2 (Chr2:1910000-1950000). * P < 0.05 (Mann-Whitney U test).
The acquisition of centromeric properties on acentric DNA fragments has been studied extensively in metazoan CENs (Maggert and Karpen 2001). DNA fragments juxtaposed to an active CEN give rise to a neocentromere, the activity of which was found to be stable when the native CEN was removed (Maggert and Karpen 2001). This proximity effect of the endogenous CEN supports the spreading of CEN activity and identity, which helps in the epigenetic inheritance of CEN chromatin. In C. albicans, a previous study reported the deletion of the endogenous CEN5 with URA3 yielding two distinct classes of transformants forming neocentromeres: the proximal neocentromere and the distal neocentromere (Ketel et al. 2009). The proximal neocentromere harbored CENPA at URA3, resulting in silencing of its expression. Additionally, a Hi-C analysis in C. albicans (Burrack et al. 2016) revealed that neocentromeres on Chr5 cluster close to the endogenous CEN locus, implying that formation of a neocentromere leads to reorganization of the 3D architecture of the nucleus so that different chromosomal loci closely contact regions on other chromosomes. However, in the present study and in a previous study from our group (Thakur and Sanyal 2013), we could primarily detect proximal neocentromeres when CEN7, CEN5, and CEN1 were deleted. Hence, proximity to the endogenous location is an important criterion for neocentromere activation. We claim that this proximity effect is enclosed within a 25-kb CFCC on Chr7 because of the closely interacting CENPA-occupied chromatin which we identify in the study.

Figure 3 (caption fragment): Relative enrichment values of CENPA and Mtw1 indicate that the neocentromere formed on the altered homolog (URA3nCEN7-II) was mapped to a region surrounding the integration locus (Ca21Chr7 435078-440387); error bars indicate SEM. ns P > 0.05, ** P < 0.01, *** P < 0.001. (C) ChIP-seq using anti-GFP (CENPA) antibodies in the strain LSK465 reveals a single peak on all chromosomes, except Chr7 which shows two closely spaced peaks (top). Chr7 shows a combination of two peaks, the one at CEN7 (left) is of the unaltered homolog, the one at URA3nCEN7-II (right) is of the altered homolog (bottom). A 50-kb region harboring CEN7 depicts the track height (using IGV) on the y-axis and coordinates on the x-axis.
An outstanding question in this direction is what restricts CENPA chromatin to a 3- to 5-kb unique DNA sequence on every chromosome in C. albicans. There must be a genetic or epigenetic element that restricts its localization. In S. pombe, a transfer RNA boundary element prevents CENPA from spreading to adjoining euchromatic sites (Scott et al. 2006). In Drosophila melanogaster, this function is performed by the flanking heterochromatin and repetitive DNA elements (Maggert and Karpen 2001). In the absence of obvious DNA sequence cues and a canonical heterochromatin machinery, we propose that the CFCC defined in this study encloses the CEN activity and hotspots for neocentromere activation. Additionally, the site of neocentromere formation, which remains fairly elusive in this organism, can now be explained in the context of an atypical pericentric region within which flexible CENPA positioning is permitted.
The lack of a conventional heterochromatin machinery has been observed in S. cerevisiae (Drinnenberg et al. 2009), where pericentric cohesion is maintained by the presence of topological adjusters like cohesin, condensin, and topoisomerase II (Bloom 2014). Even the regional CENs in C. lusitaniae do not harbor any flanking heterochromatin, show a reduced rate of transgene silencing, and have methylation marks at H3K79 and H3R2 at the central core (Kapoor et al. 2015). It would be intriguing to examine cohesin localization at the pericentromeres in C. albicans. Transgene silencing assays and Hi-C analyses of all chromosomes will help establish the universality of the results obtained for Chr7 in this study. Additionally, Hi-C analysis of the strains forming neocentromeres obtained from the reversibly silenced colonies will also reveal if the cis- and trans-chromosomal interactions are conserved in C. albicans.