A High-Performance Multithreaded Approach For Clustering A Stream Of Documents
Published on Dec 01, 2014 (1654 views)
We present an efficient approach for clustering a massive stream of textual documents, with particular emphasis on parallelization by means of multithreaded processing. The underlying cluste
Chapter list
A high-performance multithreaded approach for clustering a stream of documents 00:00
Motivation 00:09
Architecture 02:25
Underlying clustering approach 03:29
Underlying clustering approach 04:28
Underlying clustering approach 05:09
Underlying clustering approach 05:42
Underlying clustering approach 06:09
Underlying clustering approach 06:33
Underlying clustering approach 07:53
Considerations for parallelization 08:38
Considerations for parallelization 09:27
Downsides of fine-grained parallelization 10:33
Avoiding cluster-level locking 11:56
Read and write stages 12:59
Main vs. worker threads 14:28
Barrier-based parallelization 15:27
Barrier processing in the main thread 17:03
Conclusions and future work 19:46
Thank you 21:16