Harvesting, Searching, and Ranking Knowledge from the Web

author: Gerhard Weikum, Max Planck Institute for Informatics
published: March 12, 2009, recorded: February 2009, views: 195

Description

Search engines are increasingly moving toward a more expressive, semantic level of functionality. This shift is enabled by large-scale extraction of entities and relationships from semistructured and natural-language Web sources. In addition, harnessing Semantic-Web-style ontologies and reaching into Deep-Web sources can contribute to the grand vision of turning the Web into a comprehensive knowledge base that can be searched efficiently and with high precision.

This talk presents ongoing research towards this objective, with emphasis on our work on the YAGO knowledge base and the NAGA search engine, while also covering related projects. YAGO is a large collection of entities and relational facts harvested from Wikipedia and WordNet with high accuracy and reconciled into a consistent RDF-style "semantic" graph. To grow YAGO further from Web sources while retaining its high quality, pattern-based extraction is combined with logic-based consistency checking in a unified framework. NAGA provides graph-template-based search over this data, with powerful ranking capabilities based on a statistical language model for graphs. Advanced queries and the need to rank approximate matches pose efficiency and scalability challenges, which are addressed by algorithmic and indexing techniques.
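
As a rough illustration of what graph-template search over an RDF-style fact graph can look like, here is a minimal Python sketch. It is not NAGA's query language or implementation: the facts, relation names, and the helper functions `unify` and `match` are all invented for this example, and the statistical ranking of results is left out entirely.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, relation, object)
Binding = Dict[str, str]        # variable name -> matched entity

# Toy facts, invented for illustration; they are not taken from YAGO.
FACTS: List[Triple] = [
    ("Max_Planck", "type", "physicist"),
    ("Max_Planck", "bornIn", "Kiel"),
    ("Albert_Einstein", "type", "physicist"),
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Kiel", "locatedIn", "Germany"),
    ("Ulm", "locatedIn", "Germany"),
]


def is_var(term: str) -> bool:
    """Terms starting with '?' are template variables (an assumed convention)."""
    return term.startswith("?")


def unify(pattern: Triple, fact: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match(template: List[Triple], facts: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every variable binding under which all template triples are facts."""
    binding = {} if binding is None else binding
    if not template:
        yield binding
        return
    first, rest = template[0], template[1:]
    for fact in facts:
        extended = unify(first, fact, binding)
        if extended is not None:
            yield from match(rest, facts, extended)


# Template: physicists born in a city located in Germany.
query = [
    ("?x", "type", "physicist"),
    ("?x", "bornIn", "?c"),
    ("?c", "locatedIn", "Germany"),
]
results = list(match(query, FACTS))
for r in results:
    print(r["?x"], "born in", r["?c"])
```

This sketch scans every fact for every template triple, which is fine for six facts but would not scale; the efficiency and indexing challenges mentioned above arise precisely because exhaustive matching like this breaks down on large graphs.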

YAGO is publicly available and has been imported into various other knowledge-management projects, including DBpedia. YAGO shares many of its goals and methodologies with parallel projects along related lines, among them Avatar, Cimple/DBlife, DBpedia, KnowItAll/TextRunner, Kylin/KOG, and the Libra technology. Together they form an exciting trend towards comprehensive knowledge bases with semantic search capabilities.
