Sparse Linear Models Explain Phenotypic Variation and Predict Risk of Complex Disease
published: Jan. 23, 2012, recorded: December 2011, views: 156
A central goal of medical genetics is to create models that accurately predict complex disease from genotype. To maximize predictive value and identify causal single-nucleotide polymorphisms (SNPs), all SNPs should be modeled simultaneously. Lasso-penalized models have proven to be a useful class of such models, both for detecting causal SNPs and for modeling disease risk. Here, we present a comprehensive analysis of real case/control data using lasso-penalized models. Our models accurately discriminated cases from controls in celiac disease and type 1 diabetes, and replicated strongly across independent datasets, with a validation area under the ROC curve (AUC) of 0.84 for type 1 diabetes and 0.82–0.9 for celiac disease, the latter across four independent datasets of different European ethnicities. The models also explained substantial phenotypic variance in independent validation: 22% for type 1 diabetes and 21–38% for celiac disease. This study shows that supervised learning approaches can address missing phenotypic variance and reliably predict incidence of celiac disease and type 1 diabetes from genotype.
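To make the idea of modeling all SNPs at once under a lasso penalty concrete, here is a minimal sketch on simulated data. It is not the analysis from the lecture: the genotypes are synthetic, the panel size, causal indices, effect size, allele frequency and penalty `lam` are all hypothetical values chosen for illustration, and the solver is a plain proximal-gradient (soft-thresholding) loop swapped in to keep the example dependency-free. The model is fitted on one simulated cohort and its AUC is measured on a second, independent one.

```python
import math
import random

random.seed(0)

# All constants below are illustrative assumptions, not values from the study.
N_SNPS = 20          # hypothetical panel size
CAUSAL = [0, 1, 2]   # hypothetical causal SNP indices
EFFECT = 1.2         # assumed log-odds per risk allele
FREQ = 0.3           # assumed risk-allele frequency


def simulate(n):
    """Draw n individuals; each genotype is a risk-allele count (0, 1 or 2)."""
    X, y = [], []
    for _ in range(n):
        g = [(random.random() < FREQ) + (random.random() < FREQ)
             for _ in range(N_SNPS)]
        # Centering the causal counts keeps cases and controls roughly balanced.
        z = EFFECT * sum(g[j] - 2 * FREQ for j in CAUSAL)
        y.append(1 if random.random() < 1 / (1 + math.exp(-z)) else 0)
        X.append(g)
    return X, y


def center(X, means):
    """Subtract per-SNP means (estimated on the training cohort)."""
    return [[x - m for x, m in zip(row, means)] for row in X]


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=1.0, iters=200):
    """Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1 / (1 + math.exp(-z)) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= step * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


X_train, y_train = simulate(300)
X_valid, y_valid = simulate(200)  # independent validation cohort
means = [sum(col) / len(col) for col in zip(*X_train)]

w, b = fit_lasso_logistic(center(X_train, means), y_train, lam=0.03)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
scores = [b + sum(wj * xj for wj, xj in zip(w, row))
          for row in center(X_valid, means)]
validation_auc = auc(scores, y_valid)

print("SNPs with nonzero weight:", selected)
print("validation AUC: %.2f" % validation_auc)
```

The L1 penalty drives most SNP weights exactly to zero, so the fitted model doubles as a variable selector; a larger `lam` yields a sparser model. For real genome-wide data, a dedicated lasso solver with a cross-validated penalty would be the more realistic choice.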