Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme
Description
Our paper contributes to recent research on sampling Bayesian network structures from the poste-
rior distribution with MCMC. Two principled paradigms have been applied in the past. Structure
MCMC, first proposed by Madigan and York, defines a Markov chain in the space of graph struc-
tures by applying basic operations to individual edges of the graph, like the creation, deletion
or reversal of an edge. Alternatively, order MCMC, proposed by Friedman and Koller, defines a
Markov chain in the space of node orders. While the second approach has been found to sub-
stantially improve the mixing and convergence of the Markov chain, it does not allow an explicit
specification of the prior distribution over graph structures or, to phrase this difierently, it incurs a
distortion of the specified prior distribution as a consequence of the marginalization over node or-
ders. This distortion can lead to problems for applications in systems biology, where owing to the
limited number of experimental conditions the integration of biological prior knowledge into the
inference scheme becomes desirable. Diferent approaches and modifications have been developed
in the literature to address this shortcoming (e.g. by Ellis, Eaton and Murphy). Unfortunately,
these methods incur extra computational costs and are not practically viable for inferring large
networks with more than 20 to 30 nodes. There have been suggestions of how to improve the
classical structure MCMC approach by using the concept of the inclusion boundary, as proposed
by Castelo and Kocka, but these methods only partially address the convergence and mixing
problems. In the present paper we propose a novel structure MCMC scheme, which augments the
classical structure MCMC method of Madigan and York with a novel edge reversal move. The
idea of the new move is to resample the parent sets of the two nodes involved in such a way that
the selected edge is reversed subject to the acyclicity constraint. The proposal of the new parent
sets is done efectively by adopting ideas from importance sampling; in this way faster convergence
is efected. For methodological consistency, and in contrast to inclusion-driven MCMC, we have
properly derived the Hastings factor, which is a function of various partition functions that are
straightforward to compute. The resulting Markov chain is reversible, satisfies the condition of
detailed balance, and is hence guaranteed to theoretically converge to the desired posterior dis-
tribution. For our empirical evaluation, we have tested our method on various data sets from
the UCI repository, such as Vote, Flare, Boston Housing, and Alarm, which have previously been
used by Friedman and Koller to demonstrate that order MCMC outperforms structure MCMC.
Our experimental results show that integrating the novel edge reversal move yields a substantial
improvement of the resulting MCMC sampler over classical structure MCMC, with convergence
and mixing properties that are similar to those of order MCMC. To demonstrate the avoidance
of the distortional effect incurred with order MCMC, we have extended our empirical evaluation
by analysing ow cytometry protein concentrations from the Raf-Mek-Erk signalling pathway.
The experimental results show that the novel MCMC scheme can lead to a slight yet significant
performance improvement over order MCMC when explicit prior knowledge is integrated into
the learning scheme. This suggests that the avoidance of any systematic distortion of the prior
probability distribution on network structures renders our improved structure MCMC sampler
preferable to order MCMC, especially for those contemporary systems biology applications where
the number of experimental conditions relative to the complexity of the investigated system, and
hence the weight of the likelihood, is relatively low, and explicit prior knowledge about network
structures from publicly accessible data bases is included.
| Slides | |
| 0:00 | Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme |
| 0:00 | Learning Bayesian networks - 1 |
| 0:12 | Learning Bayesian networks - 2 |
| 0:31 | Bayesian networks |
| 1:19 | Learning Bayesian networks - 3 |
| 1:44 | Learning Bayesian networks - 4 |
| 1:48 | Learning Bayesian networks - 3 |
| 1:52 | Learning Bayesian networks - 4 |
| 2:00 | Learning Bayesian networks - 5 |
| 2:09 | MCMC in structure space |
| 2:23 | Alternative paradigm: Order MCMC - 1 |
| 2:37 | Alternative paradigm: Order MCMC - 2 |
| 4:23 | Alternative paradigm: Order MCMC - 3 |
| 4:38 | Instead of MCMC in structure space |
| 4:51 | MCMC in order space |
| 4:57 | MCMC method - 1 |
| 5:03 | MCMC method - 2 |
| 5:43 | Problem: Distortion of the prior distribution |
| 5:50 | Distortion of the prior distribution - 1 |
| 6:18 | Distortion of the prior distribution - 2 |
| 6:34 | Distortion of the prior distribution - 3 |
| 6:39 | Distortion of the prior distribution - 4 |
| 6:44 | Distortion of the prior distribution - 5 |
| 7:22 | Proposed new paradigm |
| 7:34 | First idea - 1 |
| 8:11 | First idea - 2 |
| 8:16 | First idea - 1 |
| 8:18 | First idea - 2 |
| 8:34 | Problem: This move is not reversible |
| 9:19 | Devise a simpler move that is reversible - 1 |
| 9:58 | Devise a simpler move that is reversible - 2 |
| 10:16 | Devise a simpler move that is reversible - 3 |
| 10:21 | Devise a simpler move that is reversible - 4 |
| 10:37 | Simple idea |
| 11:04 | Acceptance probability |
| 11:34 | Ergodicity |
| 12:24 | Evaluation |
| 12:44 | Results - 1 |
| 12:46 | Analytical comparison of the convergence properties - 1 |
| 12:59 | Analytical comparison of the convergence properties - 2 |
| 13:03 | Analytical comparison of the convergence properties - 1 |
| 13:06 | Analytical comparison of the convergence properties - 2 |
| 13:23 | Analytical comparison of the convergence properties - 3 |
| 13:46 | Results - 2 |
| 13:55 | Empirical comparison of the convergence and mixing properties - 1 |
| 14:31 | Empirical comparison of the convergence and mixing properties - 2 |
| 14:43 | Empirical comparison of the convergence and mixing properties - 3 |
| 15:10 | Empirical comparison of the convergence and mixing properties - 4 |
| 15:25 | Empirical comparison of the convergence and mixing properties - 5 |
| 15:26 | What are the implications for network reconstruction? |
| 15:30 | REV structure |
| 16:17 | Conclusion - 1 |
| 16:26 | Conclusion - 2 |
| 16:31 | Results - 3 |
| 16:35 | Evaluation of the systematic bias using standard benchmark data |
| 16:59 | Deviations between true and estimated directed edge feature posterior probabilities |
| 17:45 | Results - 4 |
| 17:51 | Raf regulatory network |
| 18:15 | Raf signalling pathway |
| 18:17 | Data |
| 18:19 | Flow cytometry data |
| 18:36 | Prior knowledge |
| 18:37 | Prior knowledge from KEGG |
| 18:43 | Estimating gene networks from gene expression data |
| 18:50 | Biological prior knowledge |
| 19:15 | Energy of a network |
| 19:28 | Prior knowledge |
| 20:30 | AUROC scores |
| 20:40 | Prior knowledge |
| 20:46 | AUROC scores |
| 21:31 | Conclusion |
| 21:35 | AUROC scores |
| 21:39 | Conclusions |
Lecture rating
| People found this lecture: | ||
| Worth seeing | ||
| because it is: | ||
| Valuable and informative | ||
| Well presented | ||
| Easily understandable | ||
| Acceptably recorded | ||
| You need to login to cast your vote. | ||
Report a problem or upload files
If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Related content
Link this page
Would you like to put a link to this lecture on your homepage?Go ahead! Copy the HTML snippet !





