Experiences with the Nutch search engine

Published on Feb 25, 2007, 17938 views

Nutch is open-source software that implements a web search engine. It has been used in a variety of applications: vertical search engines, archival web search, search engines that incorporate novel me

Chapter list

00:00 Open Source Platforms for Search
01:33 What am I?
02:56 What Distinguishes Open Source?
07:37 Lucene pre-history: Xerox PARC
08:54 Lucene pre-History: Apple ATG
10:51 Lucene pre-History: Excite
14:41 Digression: Seek versus Transfer pt 1
16:54 Digression: Seek versus Transfer pt 2
20:26 Lucene History
23:18 Original Lucene Goals
24:38 Lucene Architecture
25:16 Lucene Indexing Algorithm
28:18 Lucene Indexing Algorithm: notes
28:27 Lucene Search Algorithms
29:47 Lucene Status
32:09 Rapid Adoption Facilitators
35:18 Lucene Future
37:28 Nutch
39:25 Nutch Documents
41:20 Nutch Queries
44:33 Query Parsing
47:47 Nutch Search Performance Tricks
50:16 Nutch Scalability Goals
51:16 Scalability
52:50 Initial Scalability
53:04 ... but not to billions of pages
53:42 Hadoop
54:14 Hadoop's DFS
56:16 MapReduce
59:37 MapReduce job processing
01:00:15 Hadoop Status
01:01:37 Nutch on Hadoop
01:02:33 Nutch Status
01:03:52 Nutch Future
01:04:26 Apache is Community
01:06:09 Thanks!