## Lecture 8: Universal Hashing, Perfect Hashing

author: Charles E. Leiserson, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, MIT
recorded by: Massachusetts Institute of Technology, MIT
published: Feb. 10, 2009,   recorded: October 2005,   views: 39193
released under terms of: Creative Commons Attribution Non-Commercial Share Alike (CC-BY-NC-SA)

# Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

# Description

"Hashing. Today we're going to do some amazing stuff with hashing. And, really, this is such neat stuff, it's amazing. We're going to start by addressing a fundamental weakness of hashing. And that is that for any choice of hash function There exists a bad set of keys that all hash to the same slot. OK. So you pick a hash function. We looked at some that seem to work well in practice, that are easy to put into your code. But whichever one you pick, there's always some bad set of keys. So you can imagine, just to drive this point home a little bit..."

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

1 Nodir, May 7, 2009 at 3:45 p.m.:

Cool lecture.
But I have 2 points which I would want to understand:
- the constructed set of universal hash functions do not seem to be easily computable (namely, in O(1) time) or is it?
- in the proof of universality of dot-product hash functions: what happens when x=<x0,...,xr> and y=<y0,...,yr> differ in more than 1 digit?

2 RUBEN BUABA, June 18, 2009 at 12:10 a.m.:

Wow, this is an excellent lecture.
Can I have more of your lectures on LSH?
Thanks.

Ruben Buaba
NCA&T State University

3 LK, December 8, 2011 at 10:17 a.m.:

@Nodir:
- if the prime m is large, i.e. proportional to the size of the universe, then we can compute h in O(1), assuming that we can do constant size operations with such large numbers, which is reasonable. The cost is log_m(u).
- in the proof it is not assumed that x and y differs in exactly 1 digit, we only need that they differ in at least one digit and w.l.o.g. it is assumed that at index 0 they differ.

4 Gabriel, December 20, 2012 at 3:30 a.m.:

I don't understand how the cardinality of a set can be a fraction, at the Universal hashing definition. Is it implying |H| = km ?

5 Miroslav Chomut, April 21, 2014 at 3:14 p.m.:

Slide "Perfect Hashing" seems to contain error,
namely
h(14)=h(27)=1
h_31(14)=1
h_31(27)=2

is compressed (lossy compression) into
h_31(14)=h_31(27)=1

which is inconsistent with image, and would cause collision

6 Tomas Novella, August 21, 2016 at 2:40 p.m.:

There's an error on the intro slide of perfect hashing. It reads h_31(14) = 1 = h_31(27) which is not true because the latter equals 2.
However the version on the blackboard is correct.