International Conference on Machine Learning - Bonn 2005
Pascal

Privacy and Background Knowledge

author: Johannes Gehrke, Cornell University

Description

The digitization of our daily lives has led to an explosion in the collection of data by governments, corporations, and individuals. Protecting the confidentiality of this data is of utmost importance. However, knowledge of statistical properties of private data can have significant societal benefit, for example, in decisions about the allocation of public funds based on Census data, or in the analysis of medical data from different hospitals to understand the interaction of drugs. I will start by introducing two application scenarios: privacy-preserving data analysis and privacy-preserving data publishing. I will show how, in simple models, background knowledge can lead to severe breaches of privacy in both applications, and I will describe how proper modeling of background knowledge can avoid privacy breaches. I will outline first algorithmic steps towards privacy-preserving data analysis and data publishing with background knowledge, and I will conclude with open problems.

Slides
0:00 Privacy and Background Knowledge
2:03 An Abundance of Data
2:37 Driving Factors: A LARGE Hardware Revolution
3:12 Driving Factors: A small Hardware Revolution
3:39 Other Driving Factors
4:14 picture
4:44 picture
5:18 picture
5:22 Pulsars
6:12 Pulsar Surveys
6:22 Project Requirements
6:44 Driving Factors: Analysis Capabilities
7:24 And Even the Popular Press Caught On
7:44 Concerns About Privacy
8:27 The Setup
8:44 Model I: Untrusted Data Collector
9:15 Minimal Information Sharing
9:56 Model II: Trusted Data Collector
10:54 Disclosure Limitations
11:30 Types of Disclosure
11:56 Types of Disclosure
12:27 Types of Disclosure
12:51 Talk Outline
13:48 Privacy Preserving Associations
14:00 Problem Introduction
15:01 Example
16:04 The Itemset Lattice
16:27 Frequent Itemsets
16:50 Breadth First Search: 1-Itemsets
17:01 Breadth First Search: 1-Itemsets
17:12 Breadth First Search: 2-Itemsets
17:30 Breadth First Search: 3-Itemsets
17:40 Breadth First Search: Remarks
17:53 Depth First Search (2)
17:55 Depth First Search (3)
18:07 Depth First Search (4)
18:16 Talk Outline
18:28 Our Model
18:56 Our Model (Contd.)
19:04 Our Model (Contd.)
19:07 Privacy Preserving Associations
19:29 Minimal Information Sharing
19:39 Our Model (Contd.)
19:47 Our Model (Contd.)
19:51 Our Model: Another View
20:24 The Problem
20:45 The Randomized Response Model
21:16 Another View: Two Questions
21:40 Analysis
21:55 Analysis (Contd.)
22:25 Interval Privacy
22:46 Interval Privacy: Quantifying Privacy
23:32 Talk Outline
23:45 Background Knowledge
24:21 Example: {a, b, c}
24:47 Example: {a, b, c}
24:50 Example: {a, b, c}
25:05 Example: {a, b, c}
25:35 Example: {a, b, c}
26:09 Privacy Breaches
26:50 Simple Privacy Breaches
27:46 Privacy Breaches: Goals
28:23 α-to-β Privacy Breach
28:54 α-to-β Privacy Breach
29:06 α-to-β Privacy Breach
29:28 α-to-β Privacy Breach
29:39 α-to-β Privacy Breach
30:23 Amplification Condition
30:35 Amplification Condition
30:53 Amplification Condition
31:15 Amplification Condition
31:27 Amplification Condition
31:59 The Bound on α-to-β Breaches
32:36 The Bound on α-to-β Breaches
32:47 Amplification: Summary
33:50 Talk Outline
35:42 Definition of select-a-size
35:56 Definition of select-a-size
36:05 Definition of select-a-size
36:25 Definition of select-a-size
36:45 Support Recovery
36:56 Support Recovery
37:20 Support Recovery
37:26 Support Recovery
37:32 Support Recovery
37:36 Support Recovery
37:39 The Unbiased Estimators
37:48 Apriori [AS94]
38:13 The Modified Apriori
38:27 Talk Outline
38:39 Talk Outline
39:12 Trusted Data Collector
39:25 Disclosure Limitations
39:44 Sample Microdata
40:48 Removing SSN …
41:06 Linkage Attacks
41:42 Linkage Attacks (Contd.)
42:32 Quasi-Identifiers and Sensitive Attributes
42:55 K-Anonymity [Sweeney02]
43:39 K-Anonymity
43:46 K-Anonymity Through Generalization
43:59 Example Microdata
44:08 4-Anonymous Microdata
44:39 K-Anonymity Algorithms
44:46 Talk Outline
44:55 Example Microdata
44:58 4-Anonymous Microdata
45:02 Homogeneity Attack
45:45 Background Knowledge Attack
46:58 Data Publishing Desiderata
47:32 Privacy Definition (1)
47:52 Privacy Definition (2)
48:22 Bayes-Optimal Privacy: Drawbacks
48:27 Towards A Practical Definition (1)
48:31 Towards A Practical Definition (2)
48:34 Ensuring Diversity
49:01 3-Diverse Microdata
49:28 L-Diversity Revisited
49:49 L-Diversity: Summary
49:55 What I Talked About
50:34 What I Talked About (Only Useful Stuff)
51:01 Open Problems
52:32 Modeling Belief
52:42 Thanks
53:01 But Of Course We Have More Confidence Than Scott Adams …
53:09 Questions?
