The "Real World" Web Search Problem
Description
There are numerous papers that present methods to address web-search-related
challenges such as relevance and ranking, query processing, and classification.
Unfortunately, many of these methods are ineffective in a large-scale commercial
setting, despite statistically significant experimental results. To help bridge
this gap between academic and commercial settings, this lecture examines the
components of large-scale commercial search engines, then proposes five classes
of problems encountered by researchers in this area: biases; bad or different
assumptions about statistics, users, queries, or web contents; insufficient or
missing data; inconsistencies related to evaluations and objectives; and policies or
external factors, including resource limitations. Using real stories and personal
experiences, the lecture illustrates examples of these problems, along with a few
proposed approaches to deal with or reduce their consequences.
In addition to the classes of problems, there are several fundamental properties
of the web that are often not considered sufficiently when performing
experiments or defining problems, resulting in unrealistic experiments or objectives.
Even within a search engine, overlooking key properties such as the
non-stationarity of the users and the web can result in ineffective evaluations,
and may even lead to failed subsystems.
Fortunately, very simple approaches can often be highly effective. This lecture
helps put in context how commercial search engines work, what problems
they face, what effective solutions require, and how evaluations and problem
definitions could be changed to more effectively predict success in a commercial
setting, while still retaining the interest of researchers.
Slides
| Time | Slide |
| 0:00 | The Real World Web Search Problem: Bridging The Gap Between Academic and Commercial Understanding of Issues and Methods |
| 0:44 | Overview and Objectives |
| 1:51 | Commercial Plug |
| 2:36 | About the Speaker: Dr. Eric Glover |
| 3:37 | A True Story |
| 4:32 | Talk Flow - Part 1 - theoretical search |
| 4:51 | What is a Web Search Engine? (1) |
| 5:30 | What is a Web Search Engine? (2) |
| 5:46 | What is a Web Search Engine? (3) |
| 6:33 | What is a Web Search Engine? (4) |
| 7:16 | Search Engine Theory - Crawler |
| 8:16 | Search Engine Theory - Indexer |
| 9:25 | Search Engine Theory - Relevant Set |
| 12:25 | Search Engine Theory - Ranking (Theory) |
| 14:09 | What is missing (theory)? |
| 15:55 | What is a Web Search Engine? |
| 18:20 | Good References |
| 19:28 | Part II - Theory Gets Disconnected |
| 21:21 | Commercial Search Engine != 10 Blue links |
| 22:40 | Important Properties Of Commercial Web Search |
| 28:33 | Separating Commercial Web Search from Theory |
| 30:28 | Simple Theory vs Cold Reality - Crawling (1) |
| 35:15 | Simple Theory vs Cold Reality - Crawling (2) |
| 40:02 | Simple Theory vs Cold Reality - Indexing (1) |
| 46:24 | Simple Theory vs Cold Reality - Indexing (2) |
| 47:32 | Simple Theory vs Cold Reality - Query Processing |
| 53:13 | Relevance - Theory |
| 54:20 | Relevance - Theory Problem 1: Duplicates |
| 57:40 | Relevance - Theory Problem 2: Marginal Value |
| 59:33 | Relevance - Theory Problem 3: UI |
| 64:14 | Relevance - Theory != Reality |
| 66:37 | Relevance - Considerations |
| 67:51 | Relevance - Current Approximations |
| 69:09 | Relevance: Academic Measures (1) |
| 69:38 | Relevance: Academic Measures (2) |
| 70:05 | Relevance - How to Evaluate (data) |
| 71:06 | “Relevance” - How to Evaluate |
| 71:38 | “Relevance” - Concerns |
| 74:26 | Relevance - What are the goals? |
| 74:34 | Relevance - Challenges |
| 74:56 | And the lecture moves on... |
| 75:16 | Problems Facing Researchers (an example) |
| 76:13 | Problems Facing Researchers (example cont) |
| 77:27 | STATISTICS STATISTICS STATISTICS |
| 79:14 | Five Important Classes of Problems Faced by Many Researchers |
| 81:05 | Biases (1) |
| 85:06 | Biases (2) |
| 86:04 | Problem: Assumptions (Statistics) |
| 87:42 | Problem: Assumptions |
| 90:27 | Data (insufficient or missing) |
| 91:44 | (Inconsistent) Evaluations and Objectives |
| 92:33 | Evaluation Example |
| 94:27 | Policies and External Factors |
| 94:35 | Dealing With Problems/Challenges |
| 96:46 | Dealing With Problems an Example |
| 98:43 | How To Improve Things |
| 99:07 | Evaluation Example |
| 99:20 | How To Improve Things |
| 103:27 | How To Improve Things (cont) |
| 104:49 | How To Improve Things - cont...(1) |
| 105:50 | How To Improve Things - cont... (2) |
| 107:49 | General Advice |
| 108:23 | Commercial Plug |
| 111:17 | Questions |




