Biomedical and Translational Informatics Laboratory

  • Generating Linkage Disequilibrium Patterns in Data Simulations Using genomeSIMLA (conference paper). Todd L. Edwards, William S. Bush, Stephen D. Turner, Scott M. Dudek, Eric S. Torstenson, Mike Schmidt, Eden Martin, and Marylyn D. Ritchie. In Proceedings of the 6th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO'08), pp. 24–35. Springer-Verlag, Berlin, Heidelberg, 2008. ISBN 3-540-78756-9, 978-3-540-78756-3. http://dl.acm.org/citation.cfm?id=1792674.1792677 (ACM Digital Library, accessed 2014-02-24).
    Abstract: Whole-genome association (WGA) studies are becoming a common tool for the exploration of the genetic components of common disease. The analysis of such large scale data presents unique analytical challenges, including problems of multiple testing, correlated independent variables, and large multivariate model spaces. These issues have prompted the development of novel computational approaches. Thorough, extensive simulation studies are a necessity for methods development work to evaluate the power and validity of novel approaches. Many data simulation packages exist, however, the resulting data is often overly simplistic and does not compare to the complexity of real data; especially with respect to linkage disequilibrium (LD). To overcome this limitation, we have developed genomeSIMLA. GenomeSIMLA is a forward-time population simulation method that can simulate realistic patterns of LD in both family-based and case-control datasets. In this manuscript, we demonstrate how LD patterns of the simulated data change under different population growth curve parameter initialization settings. These results provide guidelines to simulate WGA datasets whose properties resemble the HapMap.