Boosting Active Learning to Optimality: a Tractable Monte-Carlo, Billiard-Based Algorithm
published: Oct. 20, 2009, recorded: September 2009, views: 3208
This paper focuses on Active Learning with a limited number of queries; in application domains such as Numerical Engineering, the size of the training set might be limited to a few dozen or hundred examples due to computational constraints. Active Learning under bounded resources is formalized as a finite horizon Reinforcement Learning problem, where the sampling strategy aims at minimizing the expectation of the generalization error. A tractable approximation of the optimal (intractable) policy is presented, the Bandit-based Active Learner (Baal) algorithm. Viewing Active Learning as a single-player game, Baal combines UCT, the tree structured multi-armed bandit algorithm proposed by Kocsis and Szepesvari (2006), and billiard algorithms. A proof of principle of the approach demonstrates its good empirical convergence toward an optimal policy and its ability to incorporate prior AL criteria. Its hybridization with the Query-by-Committee approach is found to improve on both stand-alone Baal and stand-alone QbC.
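To make the bandit component more concrete, here is a minimal sketch of UCB1, the arm-selection rule that UCT applies at each node of its search tree. It is not the Baal implementation: the function name `ucb1_select`, its arguments, and the exploration constant `c` are illustrative assumptions. In an active-learning setting the arms could be candidate instances to query, and the reward could be one minus an estimated generalization error, which is also an assumption and is not specified in the abstract.

```python
import math


def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Return the index of the arm to play next under the UCB1 rule.

    counts[i] is how many times arm i has been played so far.
    totals[i] is the sum of rewards observed for arm i (rewards in [0, 1]).
    c is a hypothetical exploration constant; sqrt(2) gives the classic bonus.
    """
    # Play every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    total_plays = sum(counts)

    def score(arm):
        mean = totals[arm] / counts[arm]                            # exploitation term
        bonus = c * math.sqrt(math.log(total_plays) / counts[arm])  # exploration term
        return mean + bonus

    return max(range(len(counts)), key=score)
```

An arm that has been tried rarely receives a large bonus, so it is revisited even when its average reward is lower. This is how the tree search balances exploring new query sequences against exploiting those that already look promising.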