## Inference Complexity as Learning Bias

author: Pedro Domingos, Dept. of Computer Science & Engineering, University of Washington
published: Jan. 15, 2009,   recorded: November 2008,   views: 165
You might be experiencing some problems with Your Video player.

# Slides

0:00 Slides - Announcement Inference Complexity As Learning Bias Don’t use model complexity as your learning bias … The Goal Outline Solution 1: Exact Inference Solution 2: Approximate Inference Solution 3: Learn a tractable model Thin junction trees are thin Solution 3: Learn a tractable model Outline Bayesian networks …With decision-tree CPDs …Compiled to circuits Arithmetic circuits ACs for Inference BN Structure Learning Key Idea Basic algorithm Efficiency Algorithm Before split After split After split ... How to split a circuit Optimizations Experiments Learning time Inference time Accuracy: EachMovie Inference time Accuracy: EachMovie Accuracy: MSWeb Accuracy: KDD Cup Conclusion - Questions - Questions - Questions - Questions - Questions - Questions - Questions

# Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

# Description

Graphical models are usually learned without regard to the cost of doing inference with them. As a result, even if a good model is learned, it may perform poorly at prediction, because it requires approximate inference. We propose an alternative: learning models with a score function that directly penalizes the cost of inference. Specifically, we learn arithmetic circuits with a penalty on the number of edges in the circuit (in which the cost of inference is linear). Our algorithm is equivalent to learning a Bayesian network with context-specific independence by greedily splitting conditional distributions, at each step scoring the candidates by compiling the resulting network into an arithmetic circuit, and using its size as the penalty. We show how this can be done efficiently, without compiling a circuit from scratch for each candidate. Experiments on several real-world domains show that our algorithm is able to learn tractable models with very large treewidth, and yields more accurate predictions than a standard context-specific Bayesian network learner, in far less time. (Joint work with Daniel Lowd.)