Detecting Erroneous Identity Links on the Web using Network Metrics

Published on 2018-11-22 · 2881 Views

Joe Raad

Although best practices for publishing Linked Data encourage the re-use of existing IRIs, multiple names are often used to denote the same thing. Whenever multiple names are used, owl:sameAs statement

17th International Semantic Web Conference (ISWC), Monterey 2018

Presentation

Detecting erroneous identity links in the web of data 00:00

Motivation 00:08

'Identity' links in the LOD 00:41

"The sameAs problem" (1/2) 00:52

"The sameAs problem" (2/2) 02:04

How can we detect erroneous sameAs links? 02:47

Requirements 03:48

Approach & Experiments 04:30

Overall idea 04:38

Dataset 05:29

1. Extract the explicit identity statements 05:37

2. Partition to equality sets 05:52

'Barack Obama' equality set 06:09

3. Detect the community structure in each equality set 06:32

4. Assign error degrees 07:19

Communities - 'Barack Obama' 07:59

Evaluation 08:46

Error degree distribution of 556M owl:sameAs 08:49

Objectives 09:04

Finding the threshold (1/3) 09:28

Finding the threshold (2/3) 10:26

Evaluation / Accuracy (1/2) 12:09

Finding the threshold (3/3) 12:54

Evaluation / Accuracy (2/2) 13:09

Evaluation / Recall 13:55

Who is messing up the LOD? - 1 14:49

Who is messing up the LOD? - 2 15:12

Who is messing up the LOD? - 3 15:43

Conclusion & perspectives 16:49

Summary 16:50

Perspectives 17:59

Thank you! 18:22