Towards Hybrid NER: A Study of Content and Crowdsourcing-Related Performance Factors

Published on 2015-07-15 · 1505 Views

Oluwaseyi Feyisetan

This paper explores the factors that influence the human component in hybrid approaches to named entity recognition (NER) in microblogs, which combine state-of-the-art automatic techniques with hum

ESWC 2015 - Portorož


Presentation

Towards hybrid NER: a study of content and crowdsourcing-related performance factors 00:00

Motivation 00:23

Overview: Named entity recognition in tweets 00:55

Aims of this work 02:33

Research hypotheses 03:07

Task design 04:13

Platform: Wordsmith 05:02

Experiment 06:22

Datasets 08:29

Gold standard: entity definition 09:14

Gold standard: entity type mapping 09:58

Results 10:37

H1.1 Number of entities 10:48

H1.2 Micropost length 11:18

H1.3 Entity types 11:35

H2.1 Skipped tweets: number of entities 12:24

H2.1 Skipped tweets: Micropost length 12:46

H2.1 Skipped tweets: Entity types 13:13

H2.2 Avg. accurate annotation time (secs) 13:50

H2.3 Accuracy of annotation 15:36

Discussion: Difficult cases 16:06

Discussion: Implicit entities 18:55

Summary 20:36

Thank you 22:13