Exploring Importance Measures for Summarizing RDF/S KBs
published: July 10, 2017, recorded: June 2017, views: 778
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Given the explosive growth in the size and the complexity of the Data Web, there is now more than ever, an increasing need to develop methods and tools in order to facilitate the understanding and exploration of RDF/S Knowledge Bases (KBs). To this direction, summarization approaches try to produce an abridged version of the original data source, highlighting the most representative concepts. Central questions to summarization are: how to identify the most important nodes and then how to link them in order to produce a valid sub-schema graph. In this paper, we try to answer the first question by revisiting six well-known measures from graph theory and adapting them for RDF/S KBs. Then, we proceed further to model the problem of linking those nodes as a graph Steiner-Tree problem (GSTP) employing approximations and heuristics to speed up the execution of the respective algorithms. The performed experiments show the added value of our approach since (a) our adaptations outperform current state of the art measures for selecting the most important nodes and (b) the constructed summary has a better quality in terms of the additional nodes introduced to the generated summary.
Download slides: eswc2017_kondylakis_importance_measures_01.pdf (3.6 MB)
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !