A High-Performance Multithreaded Approach For Clustering A Stream Of Documents

Published on Dec 01, 2014
1652 Views

We present an efficient approach for clustering a massive stream of textual documents, with particular emphasis on parallelization by means of multithreaded processing. The underlying cluste

Chapter list

A high-performance multithreaded approach for clustering a stream of documents (00:00)
Motivation (00:09)
Architecture (02:25)
Underlying clustering approach (03:29)
Underlying clustering approach (04:28)
Underlying clustering approach (05:09)
Underlying clustering approach (05:42)
Underlying clustering approach (06:09)
Underlying clustering approach (06:33)
Underlying clustering approach (07:53)
Considerations for parallelization (08:38)
Considerations for parallelization (09:27)
Downsides of fine-grained parallelization (10:33)
Avoiding cluster-level locking (11:56)
Read and write stages (12:59)
Main vs. worker threads (14:28)
Barrier-based parallelization (15:27)
Barrier processing in the main thread (17:03)
Conclusions and future work (19:46)
Thank you (21:16)