Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds the sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be addressed using Bayesian methods. This approach allows integrating various parametric and non-parametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semi-parametric procedures (Bayesian reproducing kernel Hilbert space regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many non-genomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates, and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis.
- Received March 22, 2014.
- Accepted June 26, 2014.
- Copyright © 2014, The Genetics Society of America