Privacy and Background Knowledge

Published on 2007-02-25, 3940 views

Johannes Gehrke

The digitization of our daily lives has led to an explosion in the collection of data by governments, corporations, and individuals. Protecting the confidentiality of this data is of utmost importance.

ICML 2005 - Bonn

Presentation

Privacy and Background Knowledge 00:00

An Abundance of Data 02:03

Driving Factors: A LARGE Hardware Revolution 02:37

Driving Factors: A small Hardware Revolution 03:12

Other Driving Factors 03:39

picture 04:44

Pulsars 05:22

Pulsar Surveys 06:12

Project Requirements 06:22

Driving Factors: Analysis Capabilities 06:44

And Even the Popular Press Caught On 07:24

Concerns About Privacy 07:44

The Setup 08:27

Model I: Untrusted Data Collector 08:44

Minimal Information Sharing 09:15

Model II: Trusted Data Collector 09:56

Disclosure Limitations 10:54

Types of Disclosure 11:30

Talk Outline 12:51

Privacy Preserving Associations 13:48

Problem Introduction 14:00

Example 15:01

The Itemset Lattice 16:04

Frequent Itemsets 16:27

Breadth First Search: 1-Itemsets 16:50

Breadth First Search: 2-Itemsets 17:12

Breadth First Search: 3-Itemsets 17:30

Breadth First Search: Remarks 17:40

Depth First Search (2) 17:53

Depth First Search (3) 17:55

Depth First Search (4) 18:07

Our Model 18:28

Our Model (Contd.) 18:56

Our Model: Another View 19:51

The Problem 20:24

The Randomized Response Model 20:45

Another View: Two Questions 21:16

Analysis 21:40

Analysis (Contd.) 21:55

Interval Privacy 22:25

Interval Privacy: Quantifying Privacy 22:46

Background Knowledge 23:45

Example: {a, b, c} 24:21

Privacy Breaches 26:09

Simple Privacy Breaches 26:50

Privacy Breaches: Goals 27:46

α-to-β Privacy Breach 28:23

Amplification Condition 30:23

The Bound on α-to-β Breaches 31:59

Amplification: Summary 32:47

Definition of select-a-size 35:42

Support Recovery 36:45

The Unbiased Estimators 37:39

Apriori [AS94] 37:48

The Modified Apriori 38:13

Trusted Data Collector 39:12

Sample Microdata 39:44

Removing SSN … 40:48

Linkage Attacks 41:06

Linkage Attacks (Contd.) 41:42

Quasi-Identifiers and Sensitive Attributes 42:32

K-Anonymity [Sweeney02] 42:55

K-Anonymity 43:39

K-Anonymity Through Generalization 43:46

Example Microdata 43:59

4-Anonymous Microdata 44:08

K-Anonymity Algorithms 44:39

Homogeneity Attack 45:02

Background Knowledge Attack 45:45

Data Publishing Desiderata 46:58

Privacy Definition (1) 47:32

Privacy Definition (2) 47:52

Bayes-Optimal Privacy: Drawbacks 48:22

Towards A Practical Definition (1) 48:27

Towards A Practical Definition (2) 48:31

Ensuring Diversity 48:34

3-Diverse Microdata 49:01

L-Diversity Revisited 49:28

L-Diversity: Summary 49:49

What I Talked About 49:55

What I Talked About (Only Useful Stuff) 50:34

Open Problems 51:01

Modeling Belief 52:32

Thanks 52:42

But Of Course We Have More Confidence Than Scott Adams … 53:01

Questions? 53:09