Estimating the contribution of non-genetic factors to gene expression using Gaussian process latent variable models

author: Nicolò Fusi, School of Computer Science, The University of Manchester
published: May 3, 2010,   recorded: March 2010,   views: 119

Slides
0:00 Estimating the contribution of non-genetic factors to gene expression using Gaussian Process Latent Variable Models
0:22 Outline -1
0:39 Outline -2
0:41 Expression Quantitative Trait Loci - eQTL -1
0:49 Expression Quantitative Trait Loci - eQTL -2
0:55 Expression Quantitative Trait Loci - eQTL -3
1:02 Outline -3
1:05 Single Nucleotide Polymorphisms -1
1:09 Single Nucleotide Polymorphisms -2
1:17 Single Nucleotide Polymorphisms -3
1:24 The Hapmap dataset -1
1:37 The Hapmap dataset -2
1:42 The Hapmap dataset -3
1:52 Project GENEVAR - GENe Expression VARiation -1
2:06 Project GENEVAR - GENe Expression VARiation -2
2:07 Project GENEVAR - GENe Expression VARiation -3
2:09 Confounding factors -2
2:09 Outline -4
2:15 Confounding factors -1
2:19 Confounding factors -2
2:21 Confounding factors -3
2:28 Modelling non-genetic factors -1
2:36 Modelling non-genetic factors -2
2:41 Modelling non-genetic factors -3
2:43 Modelling non-genetic factors -3
2:46 Modelling non-genetic factors -4
2:49 Modelling non-genetic factors -5
2:57 Modelling non-genetic factors -6
3:01 Modelling non-genetic factors -7
3:06 Modelling non-genetic factors -8
3:13 Modelling non-genetic factors -9
3:16 dual Probabilistic Principal Component Analysis -1
4:03 dual Probabilistic Principal Component Analysis -2
4:11 dual Probabilistic Principal Component Analysis -3
4:27 dual Probabilistic Principal Component Analysis -4
4:37 dual Probabilistic Principal Component Analysis -5
4:46 Population structure
5:00 Accounting for population structure -1
5:14 Accounting for population structure -2
5:20 Accounting for population structure -3
5:34 Outline -5
5:35 eQTL scan using data from Hapmap and GENEVAR
5:58 Traditional eQTL scan - 1
6:04 eQTL scan accounting for non-genetic factors
6:08 Traditional eQTL scan - 2
6:16 eQTL scan accounting for non-genetic factors
7:23 Outline -6
7:26 Conclusions
9:36 Project GENEVAR - GENe Expression VARiation -3

Description

Thanks to the recent increase in the amount of genetic profiling data available and to the ability to characterize disease activity through gene expression, it is possible to understand in more detail the multitude of causal factors linked with each disease. This is a challenging task because integrating different sources of biological data is not straightforward and because non-genetic factors (such as differences in the experimental setting, or individual characteristics such as gender and ethnicity) are not always artificially controlled. Since these non-genetic factors may cause most of the variation in gene expression, reducing the accuracy of genetic studies, there is a pressing need for models that take them explicitly into account. We present a model in which non-genetic factors are unobserved latent variables, and the gene expression levels are described as linear functions of both these latent variables and Single Nucleotide Polymorphisms (SNPs). From a generative point of view, we can write the gene expression levels Y as

Y = SV + XW + mu 1^T + epsilon

where S is the matrix containing the SNPs, X is the matrix of latent variables, V and W are mapping matrices, epsilon is an isotropic Gaussian noise term, and mu allows the model to have a non-zero mean.
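To make the generative view concrete, here is a minimal NumPy sketch that draws synthetic expression data from a model of this form. The shapes are assumptions, not taken from the talk: rows of Y are individuals, columns are genes, and mu is treated as one offset per gene that is broadcast across individuals. The function name `sample_expression`, the standard normal draws for X, V, W and mu, and the toy genotype coding (0, 1, 2) are all illustrative choices.

```python
import numpy as np

def sample_expression(S, n_latent=3, n_genes=5, noise_std=0.1, seed=0):
    """Draw Y = S V + X W + mean + noise for a given SNP matrix S.

    Assumed shapes (illustrative): S is (individuals, snps),
    X is (individuals, n_latent), V is (snps, n_genes),
    W is (n_latent, n_genes), mu is (n_genes,).
    """
    rng = np.random.default_rng(seed)
    n_individuals, n_snps = S.shape
    X = rng.standard_normal((n_individuals, n_latent))   # non-genetic latent factors
    V = rng.standard_normal((n_snps, n_genes))           # SNP -> expression mapping
    W = rng.standard_normal((n_latent, n_genes))         # latent -> expression mapping
    mu = rng.standard_normal(n_genes)                    # per-gene mean offset
    eps = noise_std * rng.standard_normal((n_individuals, n_genes))  # isotropic noise
    Y = S @ V + X @ W + mu + eps                         # mu broadcasts over rows
    return Y, X, V, W, mu

# Toy genotypes coded as minor-allele counts (an assumption for illustration).
S_demo = np.random.default_rng(1).integers(0, 3, size=(20, 4)).astype(float)
Y_demo, X_demo, V_demo, W_demo, mu_demo = sample_expression(S_demo)
```

In a real eQTL analysis only Y and S are observed; X, V, W and mu are what the model has to infer or integrate out.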

The model is inspired by the one proposed by Stegle et al. [1], but instead of optimising the parameters and marginalising the latent variables (as in Probabilistic PCA), we marginalise the parameters and optimise the latent variables. For a particular choice of prior over the mapping matrices W and V, the two approaches are equivalent.

This kind of model is called dual Probabilistic PCA, and it belongs to a wider class of models called Gaussian Process Latent Variable Models. Indeed, dual PPCA is the special case in which the mapping to the outputs is assumed to be linear and the output dimensions are assumed to be independent and identically distributed. Each of these assumptions can be relaxed to obtain new probabilistic models. Many extensions of this model are possible, but even in its simplest form the results of the eQTL study are extremely promising in terms of the number of significant associations found.
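As a rough illustration of what marginalising the mapping matrices leads to, the sketch below evaluates a dual PPCA style log-likelihood. It assumes independent zero-mean Gaussian priors on the entries of V and W with variances `alpha_v` and `alpha_w`, under which each gene (column of Y) is Gaussian with covariance alpha_v S S^T + alpha_w X X^T + noise_var I. These prior choices, the parameter names, and the shortcut of handling the mean by centring each column are assumptions made for illustration, not details confirmed by the talk.

```python
import numpy as np

def dual_ppca_log_likelihood(Y, X, S, alpha_w=1.0, alpha_v=1.0, noise_var=0.1):
    """Log-likelihood of Y with the mappings V and W integrated out.

    Y: (individuals, genes), X: (individuals, n_latent), S: (individuals, snps).
    Each centred column of Y is modelled as N(0, K) with a linear kernel K.
    """
    N, D = Y.shape
    Yc = Y - Y.mean(axis=0)                      # simplification: centre each gene
    K = alpha_v * (S @ S.T) + alpha_w * (X @ X.T) + noise_var * np.eye(N)
    L = np.linalg.cholesky(K)                    # K = L L^T
    log_det = 2.0 * np.sum(np.log(np.diag(L)))   # log |K|
    A = np.linalg.solve(L, Yc)                   # A = L^{-1} Yc
    quad = np.sum(A ** 2)                        # trace(Yc^T K^{-1} Yc)
    return -0.5 * (D * N * np.log(2.0 * np.pi) + D * log_det + quad)

# Small synthetic check with arbitrary shapes.
rng = np.random.default_rng(0)
S_toy = rng.integers(0, 3, size=(12, 4)).astype(float)
X_toy = rng.standard_normal((12, 2))
Y_toy = rng.standard_normal((12, 6))
ll_toy = dual_ppca_log_likelihood(Y_toy, X_toy, S_toy)
```

In this dual view the latent variables X would be found by maximising this quantity, for example with a gradient-based optimiser. Replacing the linear kernel X X^T with a non-linear one is one way of relaxing the linearity assumption mentioned above.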
