Extracting Meta Statements from the Blogosphere
published: Aug. 18, 2011, recorded: July 2011, views: 3294
Information extraction systems have recently been proposed for organizing and exploring content in large online text corpora as information networks. In such networks, the nodes are named entities (e.g., people, organizations), while the edges correspond to statements indicating relations among such entities. To date, such systems extract rather primitive networks, capturing only those relations that are expressed by direct statements. In many applications, it is useful to also extract more subtle relations, which are often expressed as meta statements in the text. These can, for instance, provide the context for a statement (e.g., “Google acquired YouTube in October 2006”) or describe the repercussions of a statement (e.g., “The US condemned Russia’s invasion of Georgia”). In this work, we report on a system for extracting relations expressed in both direct statements and meta statements. We propose a method based on Conditional Random Fields that explores syntactic features to extract both kinds of statements seamlessly. We follow the Open Information Extraction paradigm, where a classifier is trained to recognize any type of relation instead of specific ones. Finally, our results show substantial improvements over a state-of-the-art information extraction system, both in terms of accuracy and, especially, recall.
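The difference between direct and meta statements can be made concrete with a small data structure. The Python sketch below is only an illustration and is not taken from the lecture: the `Statement` class, its field names, and the `is_meta` rule are hypothetical, and it assumes a meta statement is one that either carries a context or has another statement as an argument. It does not implement the Conditional Random Fields extractor itself.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical representation: an argument is either a named entity
# (a plain string) or another statement, which is what makes nesting possible.
Argument = Union[str, "Statement"]


@dataclass(frozen=True)
class Statement:
    subject: Argument
    relation: str
    obj: Argument
    context: Optional[str] = None  # e.g. a date attached to the statement

    def is_meta(self) -> bool:
        """Assumed rule: meta if it has a context or a nested statement."""
        return (
            isinstance(self.subject, Statement)
            or isinstance(self.obj, Statement)
            or self.context is not None
        )

    def entities(self) -> list[str]:
        """Collect the named entities (network nodes), including nested ones."""
        found: list[str] = []
        for arg in (self.subject, self.obj):
            if isinstance(arg, Statement):
                found.extend(arg.entities())
            else:
                found.append(arg)
        return found


# Direct statement: one edge between two entities.
invasion = Statement("Russia", "invaded", "Georgia")

# Meta statement about a statement: the object is itself a statement.
condemnation = Statement("US", "condemned", invasion)

# Meta statement providing context for a statement.
acquisition = Statement("Google", "acquired", "YouTube", context="October 2006")
```

Under these assumptions, `invasion` is a direct statement, while `condemnation` and `acquisition` are both meta statements; `condemnation.entities()` walks the nested statement to recover all three entities involved.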