Distributed Web Search

Published on Sep 20, 2011 · 5446 Views

Chapter list

(4) Crawling 00:00
Crawling 00:18
Crawling Goals (1) 00:55
Crawling Goals (2) 02:24
Crawling Goals (3) 03:56
Software Architecture (1) 06:34
Software Architecture (2) 06:37
Basic Crawl Architecture 08:53
Priority Queues 09:29
Formal Problem 10:10
Crawling Heuristics 10:51
Comparing crawling algorithms 14:45
No Historical Information 15:47
Historical Information 17:42
Validation in the Greek domain 18:02
(5) Final Remarks 18:44
Young research field 18:47
Web Design and Search 19:03
Main Open Problems 21:09
The New frontiers 21:55
What’s next? 23:25
Bibliography – General 23:53
Distributed Web Search 27:42
Agenda 28:06
A Typical Web Search Engine 28:18
Search Engine Architectures 28:30
Related Distributed Search Architectures 29:40
System Size 33:02
Questions 35:33
Advantages 36:14
Challenges 37:00
Crawling 37:29
Too Many Factors 37:57
Experimental Setup 38:20
Experimental Results (1) 38:49
Experimental Results (2) 39:51
Impact of Distributed Web Crawling on Relevance 39:54
Impact of Download Speed 40:18
Impact of Crawling Order 40:53
Impact of Region Boosting 41:27
Search Relevance 41:28
Indexing 42:24
Query Processing: Pipelining 43:33
Query Processing: Round Robin 43:34
Caching basics 43:37
Caching 44:34
Caching in Web Search Engines 45:21
Caching at work 45:55
Data Characterization 46:25
Caching Query Results or Term Postings? 47:15
Static Caching of Postings 48:23
Evaluating Caching of Postings 49:08
Results 49:39
Combining caches of query results and term postings 50:06
Experimental Setting 51:03
Parameter Estimation 51:13
Parameter Values 51:15
Centralized System Simulation 51:16
WAN System Simulation 52:05
Query Dynamics 52:20
Why caching results can’t reach high hit rates 53:17
Benefits of filtering out infrequent queries 54:19
Admission Controlled Cache (AC) 54:26
Why an uncontrolled cache? 55:08
Features for admission policy 55:08
Evaluation 55:46
Results for Stateful Features 56:10
Results for Stateless features 56:46
Index Pruning 56:47
All queries vs. Misses: Number of terms in a query 57:31
All queries vs. Misses: Query result size distribution 58:22
All queries vs. Misses: Term popularity distribution 58:26
Static Index Pruning 59:41
Analysis of Results 01:00:03
Locality 01:00:29
Tier Prediction 01:01:14
Motivation: Centralized Systems 01:01:58
Corpus 01:02:41
Trade-off Analysis 01:05:04
Experimental Results 01:08:28
Tier Prediction Example 01:09:18
Star Topology 01:10:04
Multi-site Web Search Architecture 01:12:24
A Search Engine Architecture with Partial Index Replication and Query Forwarding 01:12:27
Cost Model 01:12:31
Optimal Number of Sites 01:13:04
Query Processing 01:14:13
Query Processing Results 01:15:19
Cost Model Instantiation 01:16:04
Improved Query Forwarding 01:16:32
Experimental Setup 01:16:55
Locality of Queries 01:16:57
Performance of the Algorithm (1) 01:17:20
Conclusions 01:17:44
Thank you! 01:18:37