Biomedical and Translational Informatics Laboratory

Publications - 2011

  • Using biological knowledge to uncover the mystery in the search for epistasis in genome-wide association studies. Marylyn D. Ritchie, 75, 1, 172-182, Annals of human genetics, 2011 Jan, PMID: 21158748 PMCID: PMC3092784,
  • Quality control procedures for genome-wide association studies. Stephen Turner, Loren L. Armstrong, Yuki Bradford, Christopher S. Carlson, Dana C. Crawford, Andrew T. Crenshaw, Mariza de Andrade, Kimberly F. Doheny, Jonathan L. Haines, Geoffrey Hayes, Gail Jarvik, Lan Jiang, Iftikhar J. Kullo, Rongling Li, Hua Ling, Teri A. Manolio, Martha Matsumoto, Catherine A. McCarty, Andrew N. McDavid, Daniel B. Mirel, Justin E. Paschall, Elizabeth W. Pugh, Luke V. Rasmussen, Russell A. Wilke, Rebecca L. Zuvich, Marylyn D. Ritchie, Chapter 1, Current protocols in human genetics / editorial board, Jonathan L. Haines ... [et al.], 2011 Jan, PMID: 21234875 PMCID: PMC3066182,
  • Statistical Optimization of Pharmacogenomics Association Studies: Key Considerations from Study Design to Analysis. Benjamin J. Grady, Marylyn D. Ritchie, 9, 1, Current pharmacogenomics and personalized medicine, 2011 Mar 1, PMID: 21887206 PMCID: PMC3163263,
  • Visual integration of results from a large DNA biobank (BioVU) using synthesis-view. Sarah Pendergrass, Scott M. Dudek, Dan M. Roden, Dana C. Crawford, Marylyn D. Ritchie, 265-275, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing, 2011, PMID: 21121054 PMCID: PMC3065108,
  • Learning Phenotype Mapping for Integrating Large Genetic Data. Chun-Nan Hsu, Cheng-Ju Kuo, Congxing Cai, Sarah A. Pendergrass, Marylyn D. Ritchie, Jose Luis Ambite, Proceedings of BioNLP 2011 Workshop (BioNLP '11), 19–27, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-932432-91-6, 2011, http://dl.acm.org/citation.cfm?id=2002902.2002906. Abstract: Accurate phenotype mapping will play an important role in facilitating Phenome-Wide Association Studies (PheWAS), and potentially in other phenomics-based studies. The PheWAS approach investigates the association between genetic variation and an extensive range of phenotypes in a high-throughput manner to better understand the impact of genetic variations on multiple phenotypes. Herein we define the phenotype mapping problem posed by PheWAS analyses, discuss the challenges, and present a machine-learning solution. Our key ideas include the use of weighted Jaccard features and term augmentation by dictionary lookup. When compared to string similarity metric-based features, our approach improves the F-score from 0.59 to 0.73. With augmentation we show further improvement in F-score to 0.89. For terms not covered by the dictionary, we use transitive closure inference and reach an F-score of 0.91, close to a level sufficient for practical use. We also show that our model generalizes well to phenotypes not used in our training dataset.
  • The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. Catherine A. McCarty, Rex L. Chisholm, Christopher G. Chute, Iftikhar J. Kullo, Gail P. Jarvik, Eric B. Larson, Rongling Li, Daniel R. Masys, Marylyn D. Ritchie, Dan M. Roden, Jeffery P. Struewing, Wendy A. Wolf, 4, BMC medical genomics, 2011, PMID: 21269473 PMCID: PMC3038887,
  • Genetic analysis of biological pathway data through genomic randomization. Brian L. Yaspan, William S. Bush, Eric S. Torstenson, Deqiong Ma, Margaret A. Pericak-Vance, Marylyn D. Ritchie, James S. Sutcliffe, Jonathan L. Haines, 129, 5, 563-571, Human genetics, 2011 May, PMID: 21279722 PMCID: PMC3107984,
  • The Next PAGE in understanding complex traits: design for the analysis of Population Architecture Using Genetics and Epidemiology (PAGE) Study. Tara C. Matise, Jose Luis Ambite, Steven Buyske, Christopher S. Carlson, Shelley A. Cole, Dana C. Crawford, Christopher A. Haiman, Gerardo Heiss, Charles Kooperberg, Loic Le Marchand, Teri A. Manolio, Kari E. North, Ulrike Peters, Marylyn D. Ritchie, Lucia A. Hindorff, Jonathan L. Haines, 174, 7, 849-859, American journal of epidemiology, 2011 Oct 1, PMID: 21836165 PMCID: PMC3176830,
  • Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies. Thorsten Lehr, Jing Yuan, Dirk Zeumer, Supriya Jayadev, Marylyn D. Ritchie, 4, BioData mining, 2011, PMID: 21362183 PMCID: PMC3060133,
  • European mitochondrial DNA haplogroups and metabolic changes during antiretroviral therapy in AIDS Clinical Trials Group Study A5142. Todd Hulgan, Richard Haubrich, Sharon A. Riddler, Pablo Tebas, Marylyn D. Ritchie, Grace A. McComsey, David W. Haas, Jeffrey A. Canter, 25, 1, AIDS (London, England), 2011 Jan 2, PMID: 20871389 PMCID: PMC2995830,
  • The effects of linkage disequilibrium in large scale SNP datasets for MDR. Benjamin J. Grady, Eric S. Torstenson, Marylyn D. Ritchie, 4, BioData mining, 2011, PMID: 21545716 PMCID: PMC3108918,
  • Pharmacogenomics of HIV therapy: summary of a workshop sponsored by the National Institute of Allergy and Infectious Diseases. David W. Haas, Daniel R. Kuritzkes, Marylyn D. Ritchie, Shashi Amur, Brian F. Gage, Gary Maartens, Dan Masys, Jacques Fellay, Elizabeth Phillips, Heather J. Ribaudo, Kenneth A. Freedberg, Christos Petropoulos, Teri A. Manolio, Roy M. Gulick, Richard Haubrich, Peter Kim, Marjorie Dehlinger, Rahel Abebe, Amalio Telenti, 12, 5, 277-285, HIV clinical trials, 2011 Sep-Oct, PMID: 22180526 PMCID: PMC3322423,
  • Knowledge-driven multi-locus analysis reveals gene-gene interactions influencing HDL cholesterol level in two independent EMR-linked biobanks. Stephen D. Turner, Richard L. Berg, James G. Linneman, Peggy L. Peissig, Dana C. Crawford, Joshua C. Denny, Dan M. Roden, Catherine A. McCarty, Marylyn D. Ritchie, Russell A. Wilke, 6, 5, PloS one, 2011, PMID: 21589926 PMCID: PMC3092760,
  • Use of biological knowledge to inform the analysis of gene-gene interactions involved in modulating virologic failure with efavirenz-containing treatment regimens in ART-naive ACTG clinical trials participants. Benjamin J. Grady, Eric S. Torstenson, Paul J. McLaren, Paul I. W. de Bakker, David W. Haas, Gregory K. Robbins, Roy M. Gulick, Richard Haubrich, Heather Ribaudo, Marylyn D. Ritchie, 253-264, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing, 2011, PMID: 21121053 PMCID: PMC3094912,
  • Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin. Hua Xu, Min Jiang, Matt Oetjens, Erica A. Bowton, Andrea H. Ramirez, Janina M. Jeff, Melissa A. Basford, Jill M. Pulley, James D. Cowan, Xiaoming Wang, Marylyn D. Ritchie, Daniel R. Masys, Dan M. Roden, Dana C. Crawford, Joshua C. Denny, 18, 4, 387-391, Journal of the American Medical Informatics Association : JAMIA, 2011 Jul-Aug, PMID: 21672908 PMCID: PMC3128409,
  • Model-based multifactor dimensionality reduction for detecting epistasis in case-control data in the presence of noise. Tom Cattaert, M. Luz Calle, Scott M. Dudek, Jestinah M. Mahachie John, Francois Van Lishout, Victor Urrea, Marylyn D. Ritchie, Kristel Van Steen, 75, 1, Annals of human genetics, 2011 Jan, PMID: 21158747 PMCID: PMC3059142,
  • The use of phenome-wide association studies (PheWAS) for exploration of novel genotype-phenotype relationships and pleiotropy discovery. S. A. Pendergrass, K. Brown-Gentry, S. M. Dudek, E. S. Torstenson, J. L. Ambite, C. L. Avery, S. Buyske, C. Cai, M. D. Fesinmeyer, C. Haiman, G. Heiss, L. A. Hindorff, C.-N. Hsu, R. D. Jackson, C. Kooperberg, L. Le Marchand, Y. Lin, T. C. Matise, L. Moreland, K. Monroe, A. P. Reiner, R. Wallace, L. R. Wilkens, D. C. Crawford, M. D. Ritchie, 35, 5, 410-422, Genetic epidemiology, 2011 Jul, PMID: 21594894 PMCID: PMC3116446,
  • A knowledge-driven interaction analysis reveals potential neurodegenerative mechanism of multiple sclerosis susceptibility. W. S. Bush, J. L. McCauley, P. L. DeJager, S. M. Dudek, D. A. Hafler, R. A. Gibson, P. M. Matthews, L. Kappos, Y. Naegelin, C. H. Polman, S. L. Hauser, J. Oksenberg, J. L. Haines, M. D. Ritchie, 12, 5, 335-340, Genes and immunity, 2011 Jul, PMID: 21346779 PMCID: PMC3136581,
  • Complement receptor 1 gene variants are associated with erythrocyte sedimentation rate. Iftikhar J. Kullo, Keyue Ding, Khader Shameer, Catherine A. McCarty, Gail P. Jarvik, Joshua C. Denny, Marylyn D. Ritchie, Zi Ye, David R. Crosslin, Rex L. Chisholm, Teri A. Manolio, Christopher G. Chute, 89, 1, 131-138, American journal of human genetics, 2011 Jul 15, PMID: 21700265 PMCID: PMC3135803,
  • A phenomics-based strategy identifies loci on APOC1, BRAP, and PLCG1 associated with metabolic syndrome phenotype domains. Christy L. Avery, Qianchuan He, Kari E. North, Jose L. Ambite, Eric Boerwinkle, Myriam Fornage, Lucia A. Hindorff, Charles Kooperberg, James B. Meigs, James S. Pankow, Sarah A. Pendergrass, Bruce M. Psaty, Marylyn D. Ritchie, Jerome I. Rotter, Kent D. Taylor, Lynne R. Wilkens, Gerardo Heiss, Dan Yu Lin, 7, 10, PLoS genetics, 2011 Oct, PMID: 22022282 PMCID: PMC3192835,
  • ATHENA Optimization: The Effect of Initial Parameter Settings across Different Genetic Models. Emily R. Holzinger, Scott M. Dudek, Eric C. Torstenson, Marylyn D. Ritchie, in Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics, edited by Clara Pizzuti, Marylyn D. Ritchie, Mario Giacobini, Lecture Notes in Computer Science, 6623, 48-58, Springer Berlin Heidelberg, ISBN 978-3-642-20388-6, 978-3-642-20389-3, 2011, http://link.springer.com/chapter/10.1007/978-3-642-20389-3_5. Abstract: Rapidly advancing technology has allowed for the generation of massive amounts of data assessing variation across the human genome. One analysis method for this type of data is the genome-wide association study (GWAS) where each variation is assessed individually for association to disease. While these studies have elucidated novel etiology, much of the variation due to genetics remains unexplained. One hypothesis is that some of the variation lies in gene-gene interactions. An impediment to testing for interactions is the infeasibility of exhaustively searching all multi-locus models. Novel methods are being developed that perform a non-exhaustive search. Because these methods are new to genetic studies, rigorous parameter optimization is necessary. Here, we assess genotype encodings, function sets, and cross-over in two algorithms which use grammatical evolution to optimize neural networks or symbolic regression formulas in the ATHENA software package. Our results show that the effect of these parameters is highly dependent on the underlying disease model.
  • Pitfalls of merging GWAS data: lessons learned in the eMERGE network and quality control procedures to maintain high data quality. Rebecca L. Zuvich, Loren L. Armstrong, Suzette J. Bielinski, Yuki Bradford, Christopher S. Carlson, Dana C. Crawford, Andrew T. Crenshaw, Mariza de Andrade, Kimberly F. Doheny, Jonathan L. Haines, M. Geoffrey Hayes, Gail P. Jarvik, Lan Jiang, Iftikhar J. Kullo, Rongling Li, Hua Ling, Teri A. Manolio, Martha E. Matsumoto, Catherine A. McCarty, Andrew N. McDavid, Daniel B. Mirel, Lana M. Olson, Justin E. Paschall, Elizabeth W. Pugh, Luke V. Rasmussen, Laura J. Rasmussen-Torvik, Stephen D. Turner, Russell A. Wilke, Marylyn D. Ritchie, 35, 8, 887-898, Genetic epidemiology, 2011 Dec, PMID: 22125226 PMCID: PMC3592376,
  • Mitochondrial genomics and CD4 T-cell count recovery after antiretroviral therapy initiation in AIDS clinical trials group study 384. Benjamin J. Grady, David C. Samuels, Gregory K. Robbins, Doug Selph, Jeffrey A. Canter, Richard B. Pollard, David W. Haas, Robert Shafer, Spyros A. Kalams, Deborah G. Murdock, Marylyn D. Ritchie, Todd Hulgan, 58, 4, 363-370, Journal of acquired immune deficiency syndromes (1999), 2011 Dec 1, PMID: 21792066 PMCID: PMC3204178,
  • Association of haplotypes of inflammation-related genes with gastric preneoplastic lesions in African Americans and Caucasians. Jovanny Zabaleta, Maria C. Camargo, Marylyn D. Ritchie, Maria B. Piazuelo, Rosa A. Sierra, Stephen D. Turner, Alberto Delgado, Elizabeth T. H. Fontham, Barbara G. Schneider, Pelayo Correa, Augusto C. Ochoa, 128, 3, 668-675, International journal of cancer. Journal international du cancer, 2011 Feb 1, PMID: 20473875 PMCID: PMC2964400,
  • Variants near FOXE1 are associated with hypothyroidism and other thyroid conditions: using electronic medical records for genome- and phenome-wide studies. Joshua C. Denny, Dana C. Crawford, Marylyn D. Ritchie, Suzette J. Bielinski, Melissa A. Basford, Yuki Bradford, High Seng Chai, Lisa Bastarache, Rebecca Zuvich, Peggy Peissig, David Carrell, Andrea H. Ramirez, Jyotishman Pathak, Russell A. Wilke, Luke Rasmussen, Xiaoming Wang, Jennifer A. Pacheco, Abel N. Kho, M. Geoffrey Hayes, Noah Weston, Martha Matsumoto, Peter A. Kopp, Katherine M. Newton, Gail P. Jarvik, Rongling Li, Teri A. Manolio, Iftikhar J. Kullo, Christopher G. Chute, Rex L. Chisholm, Eric B. Larson, Catherine A. McCarty, Daniel R. Masys, Dan M. Roden, Mariza de Andrade, 89, 4, 529-542, American journal of human genetics, 2011 Oct 7, PMID: 21981779 PMCID: PMC3188836,