Semi-supervised Instance Matching Using Boosted Classifiers
published: July 15, 2015, recorded: June 2015, views: 1555
Instance matching is the task of identifying pairs of instances that refer to the same underlying entity. Current state-of-the-art instance matchers use machine learning methods. Supervised learning systems achieve good performance by training on large amounts of manually labeled samples. To reduce the labeling effort, this paper presents a minimally supervised instance matching approach that delivers competitive performance using only 2% training data and little parameter tuning. As a first step, the classifier is trained in an ensemble setting using boosting. Iterative semi-supervised learning is then used to improve the performance of the boosted classifier further, by re-training it on the most confident samples labeled in the current iteration. Empirical evaluations on a suite of six publicly available benchmarks show that the proposed system outperforms optimization-based minimally supervised approaches in 1–7 iterations. The system’s average F-Measure is shown to be within 2.5% of that of recent supervised systems that require more training samples for effective performance.
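The two steps described in the abstract, a boosted classifier followed by iterative re-training on its most confident predictions, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes AdaBoost over threshold stumps as the boosting method, represents each candidate instance pair as a tuple of similarity features, uses labels +1 (match) and -1 (non-match), and keeps the labeled seed in the training set at every iteration. The function names and the `top_k`, `rounds`, and `iterations` parameters are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold stump."""
    best = None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X})
        # Candidate thresholds: the minimum value plus midpoints between neighbours.
        thresholds = [values[0]] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, x):
    _, j, t, sign = stump
    return sign if x[j] >= t else -sign

def adaboost(X, y, rounds=10):
    """Train an AdaBoost ensemble of stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def score(ensemble, x):
    """Signed ensemble score; its magnitude serves as the confidence."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)

def predict(ensemble, x):
    return 1 if score(ensemble, x) >= 0 else -1

def self_train(X_lab, y_lab, X_unl, iterations=3, top_k=10, rounds=10):
    """Iteratively add the top_k most confident pseudo-labeled pairs and re-train."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unl)
    model = adaboost(X_train, y_train, rounds)
    for _ in range(iterations):
        if not pool:
            break
        pool.sort(key=lambda x: abs(score(model, x)), reverse=True)
        confident, pool = pool[:top_k], pool[top_k:]
        X_train += confident
        y_train += [predict(model, x) for x in confident]
        model = adaboost(X_train, y_train, rounds)
    return model
```

A call such as `self_train(X_lab, y_lab, X_unl)` with a handful of labeled pairs and a larger unlabeled pool returns the final ensemble, which `predict` can then apply to new pairs. Whether the seed labels stay in the training set, and how many confident samples are added per iteration, are design choices of this sketch rather than details stated in the abstract.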
Download slides: eswc2015_kejriwal_boosted_classifiers_01.pdf (952.5 KB)