Integrating Spreadsheet Data via Accurate and Low-Effort Extraction

Published on 2014-10-08

2613 Views

Zhe (Shirley) Chen

Spreadsheets contain valuable data on many topics. However, spreadsheets are difficult to integrate with other data sources. Converting spreadsheet data to the relational model would allow data analys

Research Sessions

Presentation

Integrating Spreadsheet Data via Accurate and Low-Effort Extraction (00:00)

Spreadsheets are Everywhere - 1 (00:06)

Spreadsheets are Everywhere - 2 (00:22)

What is the strength of the connection between smoking and lung cancer for the 50 U.S. states? (00:35)

OUR GOAL: Integration - 1 (00:55)

OUR GOAL: Integration - 2 (01:06)

OUR GOAL: Integration - 3 (01:19)

Challenges: Implicit Structures (01:35)

Implicit Mapping Structures - 1 (01:50)

Implicit Mapping Structures - 2 (02:28)

Implicit Mapping Structures - 3 (02:34)

Implicit Mapping Structures - 4 (02:39)

Spreadsheet Relational Table (02:41)

Hierarchical Spreadsheets are Popular (03:05)

Outline - 1 (03:33)

Problem Definition - 1 (03:37)

Problem Definition - 2 (03:51)

Problem Definition - 3 (03:55)

Problem Definition - 4 (03:57)

Problem Definition - 5 (04:01)

Problem Definition - 6 (04:26)

Hierarchy Extraction is Challenging - 1 (04:35)

Hierarchy Extraction is Challenging - 2 (05:07)

Outline - 2 (05:17)

User Workflow - 1 (05:23)

Phase 1: Interface 18 (05:34)

User Workflow - 2 (05:56)

Phase 2: A Repair Operation (06:41)

An Undirected Graphical Model Based Approach (07:24)

The Hierarchy Extraction Problem - 1 (07:27)

The Hierarchy Extraction Problem - 2 (07:49)

The Hierarchy Extraction Problem - 3 (07:52)

The Hierarchy Extraction Problem - 4 (07:57)

Building the Graphical Model - 1 (08:07)

Building the Graphical Model - 2 (08:25)

Node Potentials (08:37)

Edge Potentials - 1 (08:46)

Edge Potentials - 2 (08:57)

Global Potentials - 1 (09:23)

Global Potentials - 2 (09:29)

Global Potentials - 3 (09:37)

Encoding Interactive Repair - 1 (09:46)

Encoding Interactive Repair - 2 (09:57)

Encoding Interactive Repair - 3 (10:02)

Encoding Interactive Repair - 4 (10:20)

Encoding Interactive Repair - 5 (10:38)

Outline - 3 (10:57)

Experiment Setup (11:01)

Automatic Extraction Evaluation - 1 (11:19)

Automatic Extraction Evaluation - 2 (11:40)

Automatic Extraction Evaluation - 3 (12:05)

Interactive Repair Evaluation - 1 (12:21)

Interactive Repair Evaluation - 2 (12:38)

Interactive Repair Evaluation - 3 (13:05)

Outline - 4 (13:23)

Conclusions (13:25)

Q & A (13:46)