Originally published as Genetics Published Articles Ahead of Print on February 4, 2007.

Genetics, Vol. 175, 1813-1822, April 2007, Copyright © 2007
doi:10.1534/genetics.106.066530

Evolutionary Framework for Protein Sequence Evolution and Gene Pleiotropy

Department of Genetics, Development and Cell Biology, Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011

1 Address for correspondence: Department of Genetics, Development and Cell Biology, 536 Science II Hall, Iowa State University, Ames, IA 50011.
E-mail: xgu{at}iastate.edu

In this article, we develop an evolutionary model for protein sequence evolution in which gene pleiotropy is characterized by K distinct but correlated components (molecular phenotypes) that affect organismal fitness. These K molecular phenotypes are under stabilizing selection with microadaptation (SM) driven by random shifts of the optima; we call this the SM model. Random coding mutations generate a correlated distribution of the K molecular phenotypes. Under the SM model, we further develop a statistical method to estimate the "effective" number of molecular phenotypes (Ke) of a gene. Thus, for the first time, gene pleiotropy can be evaluated empirically from protein sequence analysis. Case studies of vertebrate proteins indicate that Ke is typically ~6–9. We demonstrate that the newly developed SM model of protein evolution may provide a basis for exploring genomic evolution and correlations.
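
The ingredients of the SM model named above (K correlated molecular phenotypes, stabilizing selection, random optimum shifts) can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not stated in the abstract: fitness is taken to be Gaussian around the optimum, mutational effects share a single common pairwise correlation rho, and every function name and parameter value (sigma, shift_sd, width) is hypothetical. It does not implement the statistical method for estimating Ke.

```python
import math
import random


def correlated_effects(k, rho, sigma, rng):
    """Draw K mutational effects with variance sigma^2 and common
    pairwise correlation rho (0 <= rho < 1), via one shared factor.
    Assumption: equal correlation between all pairs of phenotypes."""
    shared = rng.gauss(0.0, 1.0)
    return [
        sigma * (math.sqrt(rho) * shared
                 + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(k)
    ]


def log_fitness(y, optimum, width):
    """Assumed Gaussian stabilizing selection: log fitness decreases
    with the squared distance of the phenotype vector from the optimum."""
    dist2 = sum((yi - oi) ** 2 for yi, oi in zip(y, optimum))
    return -dist2 / (2.0 * width ** 2)


def selection_coefficients(k, n_mut=2000, rho=0.3, sigma=1.0,
                           shift_sd=0.5, width=5.0, seed=1):
    """Simulate log-fitness differences (mutant minus wild type) for
    random coding mutations on a gene with K molecular phenotypes."""
    rng = random.Random(seed)
    wild_type = [0.0] * k
    coeffs = []
    for _ in range(n_mut):
        # Microadaptation: the optimum is randomly displaced from the
        # current wild-type phenotype before each mutation is evaluated.
        optimum = [rng.gauss(0.0, shift_sd) for _ in range(k)]
        mutant = correlated_effects(k, rho, sigma, rng)
        coeffs.append(log_fitness(mutant, optimum, width)
                      - log_fitness(wild_type, optimum, width))
    return coeffs


def beneficial_fraction(coeffs):
    """Fraction of simulated mutations that increase fitness."""
    return sum(1 for s in coeffs if s > 0) / len(coeffs)


if __name__ == "__main__":
    for k in (1, 4, 8):
        s = selection_coefficients(k)
        print(k, round(sum(s) / len(s), 4), round(beneficial_fraction(s), 4))
```

Under these assumptions, a mutation is beneficial only when it happens to move the phenotype vector toward the shifted optimum, which becomes less likely as K grows. In this toy setting, then, higher pleiotropy implies stronger average constraint on coding mutations.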