WIKImage: Correlated image and text datasets

Published on 2011-11-04, 3366 views

Doni Pracner

This paper presents work towards the creation of free and redistributable datasets of correlated images and text. Collections of free images and related text were extracted from Wikipedia with our new

SiKDD 2011 - Ljubljana

Presentation

WIKImage: Correlated Image and Text Datasets 00:00

Presentation outline 00:20

The Problem: Dataset requirements 00:51

Dataset requirements 00:53

Working around copyright 01:55

WIKImage: Mediawiki api 03:10

Mediawiki api 03:16

API options 04:16

WIKImage: Gathering data 04:58

WIKImage 05:00

Gathering data - 1 05:14

Gathering data - 2 06:44

Connecting images and text 07:45

WIKImage: WIKImage dataset browser 09:22

WIKImage dataset browser 09:31

WIKImage dataset labeler 10:02

Experiments: Dataset d1 and the feature representation 11:44

Dataset d1 11:51

Distribution of labels in the dataset 13:09

Feature representation 13:52

Experiments: Estimating the “difficulty” of the dataset 15:04

Experiments on the dataset 15:11

Overview of the results 16:10

Averaged results for SVM and 1NN (cosine) 17:51

SVM – F-measure results 18:21

1NN (cosine) – F-measure results 18:52

Summary: Results and open questions 19:01

Summary 19:03

Future Work 19:24

Thank you for your attention 21:29