## Lecture 8: Universal Hashing, Perfect Hashing

author: Charles E. Leiserson, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, MIT
recorded by: Massachusetts Institute of Technology, MIT
published: Feb. 10, 2009,   recorded: October 2005,   views: 6965
released under terms of: CC BY-NC-SA
You might be experiencing some problems with Your Video player.

# Slides

0:00 Slides Introduction to Algorithms - Lecture 8 A weakness of hashing Universal hashing Universality is good - Theorem Proof of theorem (1) Proof of theorem (2) Proof of theorem (3) Proof of theorem (4) Proof of theorem (5) Constructing a set of universal hash functions Universality of dot-product hash functions - Theorem Proof (continued) Fact from number theory Back to the proof Proof (completed) Perfect hashing Collisions at level 2 No collisions at level 2 Analysis of storage

# Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

# Description

"Hashing. Today we're going to do some amazing stuff with hashing. And, really, this is such neat stuff, it's amazing. We're going to start by addressing a fundamental weakness of hashing. And that is that for any choice of hash function There exists a bad set of keys that all hash to the same slot. OK. So you pick a hash function. We looked at some that seem to work well in practice, that are easy to put into your code. But whichever one you pick, there's always some bad set of keys. So you can imagine, just to drive this point home a little bit..."

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

1 Nodir, May 7, 2009 at 3:45 p.m.:

Cool lecture.
But I have 2 points which I would want to understand:
- the constructed set of universal hash functions do not seem to be easily computable (namely, in O(1) time) or is it?
- in the proof of universality of dot-product hash functions: what happens when x=<x0,...,xr> and y=<y0,...,yr> differ in more than 1 digit?

2 RUBEN BUABA, June 18, 2009 at 12:10 a.m.:

Wow, this is an excellent lecture.
Can I have more of your lectures on LSH?
Thanks.

Ruben Buaba
NCA&T State University

3 LK, December 8, 2011 at 10:17 a.m.:

@Nodir:
- if the prime m is large, i.e. proportional to the size of the universe, then we can compute h in O(1), assuming that we can do constant size operations with such large numbers, which is reasonable. The cost is log_m(u).
- in the proof it is not assumed that x and y differs in exactly 1 digit, we only need that they differ in at least one digit and w.l.o.g. it is assumed that at index 0 they differ.

4 Gabriel, December 20, 2012 at 3:30 a.m.:

I don't understand how the cardinality of a set can be a fraction, at the Universal hashing definition. Is it implying |H| = km ?