VoldemortKG: Mapping Schema.org Entities to Linked Open Data
published: Nov. 10, 2016, recorded: October 2016, views: 40
Increasingly, webpages mix entities coming from various sources and represented in different ways. It can thus happen that the same entity is described both by schema.org annotations and by a text anchor pointing to its Wikipedia page. Often, those representations provide complementary information that goes unexploited because the representations are not connected to each other. We explored the extent to which entities represented in different ways repeat on the Web, how they are related, and how they complement (or link to) each other. Our initial experiments showed that we can unveil a previously unexploited knowledge graph by applying simple instance matching techniques to a large collection of schema.org annotations and DBpedia. The resulting knowledge graph aggregates entities (often tail entities) scattered across several webpages, and complements existing DBpedia entities with new facts and properties. To facilitate further investigation into how to mine such information, we are releasing i) an excerpt of all Common Crawl webpages containing both Wikipedia and schema.org annotations, ii) the toolset to extract this information and perform knowledge graph construction and mapping onto DBpedia, as well as iii) the resulting knowledge graph (VoldemortKG) obtained via label matching techniques.
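To make the idea of label matching concrete, the following is a minimal sketch in Python, not the released toolset. It assumes schema.org entities arrive as dictionaries with a `name` property and that DBpedia labels are available as a URI-to-label mapping; the function names, the normalization rules, and the sample data are all hypothetical and chosen only for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_label(label: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", no_accents.lower())
    return " ".join(cleaned.split())


def match_by_label(schema_entities, dbpedia_labels):
    """Map schema.org entity ids to DBpedia URIs via exact normalized-label match.

    schema_entities: {entity_id: {"name": ..., other properties}}  (assumed shape)
    dbpedia_labels:  {dbpedia_uri: label}                           (assumed shape)
    """
    # Index DBpedia URIs by their normalized label.
    index = defaultdict(set)
    for uri, label in dbpedia_labels.items():
        index[normalize_label(label)].add(uri)

    matches = {}
    for entity_id, props in schema_entities.items():
        key = normalize_label(props.get("name", ""))
        if not key:
            continue  # no usable label on this annotation
        candidates = index.get(key, set())
        if len(candidates) == 1:  # keep only unambiguous matches
            matches[entity_id] = next(iter(candidates))
    return matches


# Hypothetical sample data, for illustration only.
schema_entities = {
    "page1#e1": {"name": "Zürich", "telephone": "000"},
    "page2#e7": {"name": "Unknown Place"},
}
dbpedia_labels = {"http://dbpedia.org/resource/Zurich": "Zurich"}

print(match_by_label(schema_entities, dbpedia_labels))
```

In this sketch, a matched schema.org entity could contribute its remaining properties (here, `telephone`) as new facts about the DBpedia entity, while unmatched entities stay as candidates for aggregation across pages. Discarding ambiguous labels is a conservative design choice made for the example, not something the abstract specifies.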