I moved into the field of human complex trait genetics less than 20 years ago, from a background in quantitative genetics and animal breeding. Even in this period of time, major changes have occurred that were hard to predict back in the 1990s. Driven by enormous advances in DNA sequencing technologies, one can now sequence and analyze an entire human genome for a few thousand dollars. Some may argue that the cost of a sequenced genome is much lower than that, but that usually ignores the expense of storage, analysis, and interpretation. Sequencing technology has facilitated easy and fast discovery of Mendelian disease mutations and coding variants with high penetrance (a high probability of disease given genotype), and has led to precise estimates of the per-generation mutation rate (1000 Genomes Project Consortium et al. 2010). In the same period, development of array genotyping technology has made it possible to genotype hundreds of thousands of DNA variants for less than $100. Millions of samples have been genotyped using such arrays to study the genetic basis of complex traits such as common disease and quantitative traits, which has led to the discovery of many thousands of genes, gene variants, and biological pathways that are associated with one or more complex traits (Visscher et al. 2012). The traits vary widely, from psychiatric disorders to autoimmune disease, cancer, anthropometric traits such as height and weight, traits measured in blood such as platelet size and counts, and behavioral traits such as intelligence and years of schooling. In addition to trait-variant discovery, the technologies have led to new discoveries in human evolution and population genetics.
These mostly unpredicted rapid developments were not just taking place in human complex trait genetics. In plant and animal breeding, a revolution has been taking place in the last 15 years. In 2001, a theoretical paper in this journal showed that with a sufficiently dense marker map, linkage disequilibrium could be exploited to predict breeding values and speed up genetic gain by radically changing the structure of breeding programs (Meuwissen et al. 2001). This paper was published well before the first commercial SNP chips were available, and within 10 years of publication, the method, called “genomic selection” (or “genomic prediction”), was implemented in dairy cattle breeding programs around the world; breeders of other livestock species and crops are following the same route. The uptake of this technology has led to a doubling of the rate of genetic gain in dairy cattle (Veerkamp 2015), an astounding increase and an incredibly rapid uptake of new technology.
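The core idea of genomic prediction can be illustrated with a minimal sketch. It is not a reproduction of the analyses in Meuwissen et al. (2001); it assumes simulated, independent markers and uses plain ridge regression to estimate all marker effects jointly, then predicts genetic values for individuals without phenotypes. The function names, sample sizes, and the choice of shrinkage parameter are illustrative assumptions.

```python
import numpy as np

def simulate_genotypes(n_individuals, n_markers, rng):
    """Draw 0/1/2 allele counts with marker-specific allele frequencies (simulated data)."""
    freqs = rng.uniform(0.1, 0.9, size=n_markers)
    return rng.binomial(2, freqs, size=(n_individuals, n_markers)).astype(float)

def fit_marker_effects(genotypes, phenotypes, shrinkage):
    """Ridge regression: estimate all marker effects jointly, shrunk toward zero."""
    col_means = genotypes.mean(axis=0)
    X = genotypes - col_means                # center allele counts
    intercept = phenotypes.mean()
    y = phenotypes - intercept               # center phenotypes
    lhs = X.T @ X + shrinkage * np.eye(X.shape[1])
    effects = np.linalg.solve(lhs, X.T @ y)
    return effects, col_means, intercept

def predict_genetic_values(genotypes, effects, col_means, intercept):
    """Predict values for new individuals from their marker genotypes alone."""
    return intercept + (genotypes - col_means) @ effects

def run_demo(seed=0, n_train=600, n_test=200, n_markers=200):
    """Return the correlation between predicted and true genetic values in held-out individuals."""
    rng = np.random.default_rng(seed)
    genotypes = simulate_genotypes(n_train + n_test, n_markers, rng)
    true_effects = rng.normal(0.0, 1.0, size=n_markers)
    genetic_values = genotypes @ true_effects
    noise_sd = genetic_values.std()          # heritability of about 0.5 (assumption)
    phenotypes = genetic_values + rng.normal(0.0, noise_sd, size=genetic_values.size)
    train, test = slice(0, n_train), slice(n_train, None)
    # Assumed shrinkage: residual variance divided by per-marker effect variance (1.0 here).
    shrinkage = noise_sd ** 2
    effects, col_means, intercept = fit_marker_effects(
        genotypes[train], phenotypes[train], shrinkage
    )
    predicted = predict_genetic_values(genotypes[test], effects, col_means, intercept)
    return np.corrcoef(predicted, genetic_values[test])[0, 1]

if __name__ == "__main__":
    print(f"Prediction accuracy in held-out individuals: {run_demo():.2f}")
```

In real breeding data the markers are correlated through linkage disequilibrium and the variance components must themselves be estimated, so this sketch conveys only the principle: phenotypes in a reference population calibrate marker effects, which are then applied to genotyped candidates.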
My main thesis is that the relentless pace of technological innovation will cause a change in how science is conducted. Instead of the model-based hypothesis-testing science that dominated the last century, the next will be hypothesis-generating-discovery science that is driven by data. I believe that this change will not be confined to human complex trait genetics, but will apply to all areas of research in genetics. Genomics will become synonymous with biology, a trend already occurring.
Genetic Data Will Not Be Limiting
A conservative prediction is that genetic data will not be a limiting factor in answering fundamental questions about the evolution and nature of complex trait variation in human populations. The cost of generating a whole genome sequence is still going down, and it is not inconceivable that a majority of people on the planet will have their genome sequenced in 50 years’ time. What will limit our ability to find answers to questions about genome–phenome relationships is the availability of high-quality, in-depth phenotypic and environmental information to link to the genetic data. But even the phenome might become tractable with better technologies, such as smart sensors and devices that track behavior, physiology, and the environment in real time. With gigantic sample sizes, it will be possible to explain most, if not all, additive genetic variation for a range of traits and to tackle old questions about the nature of mutational variance, the maintenance of genetic variation, the genetic control of variability, and the elusive quantification of variation due to nonadditive and genotype-by-environment (G × E) interactions.
I predict that the tens of millions of single nucleotide variants and the many copy number variants that currently segregate in the population will be whittled down to a much more manageable credible set of plausible causal variants. I am agnostic as to what the size of that set is going to be (ten thousand? a hundred thousand? one million?). The phenome will not only consist of continuous measurements on individuals such as physical activity, heart rate, and blood pressure, but will also include genome-wide nonsequence-based “omics” data such as gene expression and epigenetic modifications. Sophisticated data-driven multivariate algorithms are likely to be developed that will enable prediction of the consequence (if any) on the phenome of a de novo mutation in the context of a person’s genome. A credible set of causal variants is likely to provide new insight into pleiotropy, for example, by quantifying the contributions to genetic covariance by functional annotation and by quantifying the joint distribution of effect sizes on different traits, even when their genome-wide genetic correlation is zero.
If most additive genetic variation is accounted for by known variants, then additive-by-additive variance can be quantified, and similarly the interaction (or lack thereof) between identified environmental factors and additive genetic values. Differentiating between genotypic and additive variation will remain problematic for highly polygenic traits because there will be too many unique genotypes for their values to be estimated accurately, and theory predicts that for highly polygenic traits most genetic variation will be additive anyway (Mäki-Tanila and Hill 2014).
In Osteo Population Genetics Studies?
Population genetics studies, including those applied to human populations, were founded on sophisticated mathematical models of changes in gene frequencies geographically and over time. Until recently, genetic data were limiting and largely constrained to observed allele frequencies between and within populations. This has changed drastically in the last 10 years because of the availability of SNP arrays and genome sequences, leading to the identification of several loci and variants that have been under natural selection. DNA extraction and sequencing technology have improved to the extent that partial genome sequences of Neanderthals have been generated, and SNP data have been acquired from recent ancestors living in Europe 3000 to 8000 years ago (Haak et al. 2015), drawing inference about natural selection in the past 8000 years (Mathieson et al. 2015). I predict that the technologies will develop further and that, in principle, it will be possible to take bone samples from a number of individuals who lived 100, 200, ... 10,000 years ago and infer recent natural selection as if it were in real time by tracking changes in allele frequencies of variants that are known (from modern day studies) to be associated with complex traits and fitness. It might even be possible to study G × E interaction by performing gene mapping on ancestral samples, for example on femur length (which is a highly heritable complex trait). Dig up the bodies!
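Tracking allele frequencies through time can be sketched in a few lines. The example below is a toy illustration under strong assumptions: dated samples, diploid 0/1/2 genotypes at a single trait-associated variant, and hypothetical age bins. It ignores genetic drift, sampling error, and changes in ancestry, all of which a real analysis of ancient DNA would have to model. The function name and the input values are made up for illustration.

```python
import numpy as np

def frequency_trajectory(sample_ages, genotypes, bin_edges):
    """Allele frequency of one variant within each age bin.

    sample_ages: age of each sample in years before present.
    genotypes:   0/1/2 counts of the tracked allele, one per diploid sample.
    bin_edges:   increasing bin boundaries; each bin is [lower, upper).
    Returns one frequency per bin (NaN where a bin has no samples).
    """
    sample_ages = np.asarray(sample_ages, dtype=float)
    genotypes = np.asarray(genotypes, dtype=float)
    bin_index = np.digitize(sample_ages, bin_edges) - 1
    freqs = []
    for b in range(len(bin_edges) - 1):
        in_bin = genotypes[bin_index == b]
        # Each diploid individual carries two copies of the locus.
        freqs.append(in_bin.sum() / (2 * in_bin.size) if in_bin.size else np.nan)
    return np.array(freqs)

if __name__ == "__main__":
    # Hypothetical samples: two recent individuals and two from about 5000 years ago.
    ages = [100, 100, 5000, 5000]
    counts = [2, 2, 0, 1]
    print(frequency_trajectory(ages, counts, bin_edges=[0, 1000, 10000]))
```

A sustained shift in frequency across bins, for variants with known effects on a trait, would be the raw signal; deciding whether it reflects selection rather than drift or migration requires an explicit population genetic model.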
Modeling Human Complex Traits in Experimental Organisms Will Become Obsolete
Model organisms such as fruit flies, mice, and worms have been at the forefront of major discoveries in genetics over the last century. Many if not most of these discoveries were about mechanisms, e.g., mechanisms of natural selection, speciation, recombination, imprinting, response to selection, and gene function. Experimental organisms have been less successful in modeling human disease (in the sense of leading to successful prevention or treatment), even, for example, when engineered mutations in mice are identical to those discovered in human patients. My prediction for future research into human disease causes and drug discovery is that humans will become a “model organism” through exploiting new technologies such as tissue-specific cell lines and gene editing.
I would also argue that model organisms have been largely unsuccessful in modeling complex traits in general, whether for proposed applications in human health or for potential applications in plant and animal breeding. Progress in livestock genetics has come from studying complex traits in cattle, pigs, and poultry, not from studying crosses between inbred lines of mice. Similarly, progress in understanding disease in humans has largely come from studying those diseases in humans and not from building models of them in other species. Indeed, the rapid developments in human complex trait genetics over the last 10 years have outshone those in, e.g., mice or flies. There are exceptions, of course, but they are not common.
Personalized Genetics and Genomics Will Become an Integral Part of Health Care and Clinical Practice
One major application of studying complex traits in humans is in medicine. Indeed, most of the public funding to study complex traits in human populations has come from medical research funding bodies such as the National Institutes of Health, the Wellcome Trust, and the Medical Research Council. Genetic technologies, including genome sequencing, have already led to changes in clinical practice, for example by personalizing drug advice for cancer depending on the tumor’s genome. I believe the very near future will see this extended to diagnosis of Mendelian disease and to providing more refined personalized treatment advice for cancer.
The bigger question for the future is how to extend this to common diseases and traits, which impose the largest personal, health, and economic burden on society. I predict major changes in how health care will be managed through a person’s lifetime, using personalized genetic- and genomic-based information (including metabolomic, proteomic, and microbiome data) combined with phenome-tracking information from smart electronic devices. The bottleneck to making this happen will be in the collection and analysis of relevant data. It is telling that in 2015, both Google and Apple are seeing health and medicine as a major field of interest.
Communicating editor: M. Johnston
- Received October 5, 2015.
- Accepted October 18, 2015.
- Copyright © 2016 by the Genetics Society of America
Available freely online through the author-supported open access option.