Sequence-level population simulations over large genomic regions
Clive J Hoggart 1*, Marc Chadeau 1, Taane G Clark 2, Riccardo Lampariello 3, Maria De Iorio 4, John C Whittaker 5 and David J Balding 4
1 Imperial College London
2 Wellcome Trust Centre for Human Genetics
3 Serono International S.A.
4 Imperial College
5 London School of Hygiene & Tropical Medicine
* To whom correspondence should be addressed. E-mail: c.hoggart{at}ic.ac.uk.
Submitted on December 1, 2006
Revised on January 10, 2007
Accepted on August 30, 2007
Abstract
Simulation is an invaluable tool for investigating the effects of various population genetics modelling assumptions on resulting patterns of genetic diversity, and for assessing the performance of statistical techniques, for example those designed to detect and measure the genomic effects of selection. It is also used to investigate the effectiveness of various design options for genetic association studies. Backwards-in-time simulation methods are computationally efficient and have become widely used since their introduction in the 1980s. The forwards-in-time approach has substantial advantages in terms of accuracy and modelling flexibility, but at greater computational cost. We have developed flexible and efficient simulation software and a rescaling technique to aid computational efficiency, which together allow the simulation of sequence-level data over large genomic regions in entire diploid populations under various scenarios for demography, mutation, selection and recombination, the latter including hotspots and gene conversion. Our FREGENE software is freely available from www.ebi.ac.uk/projects/BARGEN, together with an ancillary program to generate phenotype labels, either binary or quantitative. In this paper we discuss limitations of coalescent-based simulation, introduce the rescaling technique that makes large-scale forwards-in-time simulation feasible, and demonstrate the utility of various features of FREGENE, many of which were not previously available.
Key Words:
association studies, forward simulation, gene conversion, molecular evolution, selection