Active Learning for Biomedical Citation Screening

author: Byron C. Wallace, Brown Laboratory for Linguistic Information Processing, Brown University
published: Oct. 1, 2010, recorded: July 2010




Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe a real-world, deployed application of AL to the problem of biomedical citation screening for systematic reviews at the Tufts Evidence-based Practice Center. We propose a novel active learning strategy that exploits a priori domain knowledge provided by the expert (specifically, labeled features) and extend this model via a Linear Programming algorithm for situations where the expert can provide ranked labeled features. Our methods outperform existing AL strategies on three real-world systematic review datasets. We argue that evaluation must be specific to the scenario under consideration. To this end, we propose a new evaluation framework for finite-pool scenarios, wherein the primary aim is to label a fixed set of examples rather than to simply induce a good predictive model. We use a method from medical decision theory for eliciting the relative costs of false positives and false negatives from the domain expert, constructing a utility measure of classification performance that integrates the expert preferences. Our findings suggest that the expert can, and should, provide more information than instance labels alone. In addition to achieving strong empirical results on the citation screening problem, this work outlines many important steps for moving away from simulated active learning and toward deploying AL for real-world applications.
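
To make the finite-pool idea concrete, here is a minimal sketch of a cost-weighted score over a fixed pool of citations. It is an illustration only, not the utility measure defined in the talk: the function name `finite_pool_cost`, the 0/1 label encoding, and the default cost values are all assumptions, whereas in the work described above the relative costs are elicited from the domain expert. Citations the expert has already labeled are treated as correct, so only the classifier's predictions on the rest of the pool can incur a cost.

```python
def finite_pool_cost(true_labels, predicted_labels, human_labeled,
                     fn_cost=10.0, fp_cost=1.0):
    """Total misclassification cost over a fixed pool (lower is better).

    true_labels / predicted_labels: lists of 0 (irrelevant) or 1 (relevant).
    human_labeled: set of indices the expert labeled by hand; these are
        assumed correct and contribute no cost.
    fn_cost / fp_cost: hypothetical relative costs of a false negative and
        a false positive (assumed values, not taken from the talk).
    """
    cost = 0.0
    for i, truth in enumerate(true_labels):
        if i in human_labeled:
            continue  # expert label is used directly, so no error here
        pred = predicted_labels[i]
        if truth == 1 and pred == 0:
            cost += fn_cost  # a relevant citation was missed
        elif truth == 0 and pred == 1:
            cost += fp_cost  # an irrelevant citation was passed through
    return cost


if __name__ == "__main__":
    truth = [1, 0, 1, 0, 1]
    preds = [0, 1, 1, 0, 0]
    # Cost when the expert has hand-labeled only citation 0.
    print(finite_pool_cost(truth, preds, human_labeled={0}))
```

Under this framing, asking the expert to label one more citation can only keep the cost the same or lower it, which is why the score is computed over the whole pool rather than on a held-out test set.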

See Also:

Slides: kdd2010_wallace_albc_01.pdf (276.0 KB)

Slides: kdd2010_wallace_albc_01.ppt (561.0 KB)
