Poster: Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging
published: Aug. 11, 2008, recorded: July 2008, views: 2998
This paper evaluates the use of prior knowledge to limit or bias the choices of a classifier during otherwise unsupervised training and classification. Focusing on effects in the uncertainty of the model's decisions, we quantify the contribution of a knowledge source as a reduction in the conditional entropy of the label distribution given the input corpus. This measure lets us compare different sets of knowledge without annotated data, and we find that label entropy is highly predictive of final performance for a standard Hidden Markov Model (HMM) on the task of part-of-speech tagging. Our results also show that even basic levels of knowledge, integrated as labeling constraints, have a considerable effect on classification accuracy and lead to more stable and efficient training convergence. Finally, for cases where the model's internal classes need to be interpreted and mapped to a desired label set, we find that constrained models greatly reduce the amount of annotated data required to make quality assignments.
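As a rough illustration of how labeling constraints reduce label uncertainty, the sketch below computes an average per-token label entropy with and without a tag dictionary. It is a simplified stand-in, not the paper's procedure: it assumes a uniform distribution over each token's allowed tags rather than the posterior of a trained HMM, and the tag set, toy corpus, tag dictionary, and function name `mean_label_entropy` are all hypothetical.

```python
import math

# Hypothetical four-tag label set, for illustration only.
TAGS = ["DET", "NOUN", "VERB", "ADJ"]


def mean_label_entropy(tokens, allowed_tags):
    """Average per-token label entropy in bits.

    Assumes a uniform distribution over the tags allowed for each token;
    tokens missing from `allowed_tags` may take any tag in TAGS.
    """
    total = 0.0
    for tok in tokens:
        k = len(allowed_tags.get(tok, TAGS))
        total += math.log2(k)  # entropy of a uniform choice among k tags
    return total / len(tokens)


# Toy corpus and a hypothetical tag dictionary acting as the knowledge source.
tokens = ["the", "dog", "runs", "the", "run"]
tag_dictionary = {
    "the": ["DET"],
    "dog": ["NOUN"],
    "runs": ["NOUN", "VERB"],
    "run": ["NOUN", "VERB"],
}

h_free = mean_label_entropy(tokens, {})                    # no constraints
h_constrained = mean_label_entropy(tokens, tag_dictionary)  # with constraints
reduction = h_free - h_constrained

print(f"unconstrained: {h_free:.2f} bits/token")
print(f"constrained:   {h_constrained:.2f} bits/token")
print(f"reduction:     {reduction:.2f} bits/token")
```

Under these assumptions, a knowledge source that removes more candidate tags yields a larger entropy reduction, and that reduction can be computed from the unannotated corpus alone.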