
A High-Performance Multithreaded Approach For Clustering A Stream Of Documents
Published on Feb 4, 2025
1659 Views
We present an efficient approach for clustering a massive stream of textual documents, with particular emphasis on parallelization by means of multithreaded processing. The underlying cluste
Presentation
A high-performance multithreaded approach for clustering a stream of documents 00:00
Motivation 00:09
Architecture 02:25
Underlying clustering approach 06:09
Considerations for parallelization 08:38
Downsides of fine-grained parallelization 10:33
Avoiding cluster-level locking 11:56
Read and write stages 12:59
Main vs. worker threads 14:28
Barrier-based parallelization 15:27
Barrier processing in the main thread 17:03
Conclusions and future work 19:46
Thank you 21:16