Challenges in Building Large-Scale Information Retrieval Systems

author: Jeffrey Dean, Google, Inc.
published: March 12, 2009, recorded: February 2009, views: 82565


Building and operating large-scale information retrieval systems used by hundreds of millions of people around the world provides a number of interesting challenges. Designing such systems requires making complex design tradeoffs in a number of dimensions, including (a) the number of user queries that must be handled per second and the response latency to these requests, (b) the number and size of various corpora that are searched, (c) the latency and frequency with which documents are updated or added to the corpora, and (d) the quality and cost of the ranking algorithms that are used for retrieval.

In this talk I'll discuss the evolution of Google's hardware infrastructure and information retrieval systems and some of the design challenges that arise from ever-increasing demands in all of these dimensions. I'll also describe how we use various pieces of distributed systems infrastructure when building these retrieval systems.

Finally, I'll describe some future challenges and open research problems in this area.

Slides: wsdm09_dean_cblirs_01.pdf (2.5 MB)

Reviews and comments:

Comment1 samiul jahan, October 7, 2009 at 5:27 p.m.:

This lecture has upgraded my knowledge about search engines :)

Comment2 azam, November 8, 2009 at 9:58 a.m.:

I am interested in working in the field of artificial intelligence.

Comment3 emmasmith Smith, July 16, 2019 at 10:27 p.m.:

Thank you so much for this. I was into this issue and tried to tinker around to check if it's possible, but couldn't get it done. Now that I have seen the way you did it, thanks guys.

Comment4 pinoy channel ko, October 8, 2019 at 1:11 p.m.:

That's an interesting and special one for me. Thanks for sharing this!

Comment5 john, October 8, 2019 at 1:12 p.m.:

Do you really think this will work fine for all of us? That's amazing if it works. Provide you all the replays which you will get online.
