A High-Performance Multithreaded Approach For Clustering A Stream Of Documents

Published on Feb 4, 2025 · 1659 Views

We present an efficient approach for clustering a massive stream of textual documents, with particular emphasis on parallelization by means of multithreaded processing. The underlying cluste

Presentation

A high-performance multithreaded approach for clustering a stream of documents 00:00
Motivation 00:09
Architecture 02:25
Underlying clustering approach 06:09
Considerations for parallelization 08:38
Downsides of fine-grained parallelization 10:33
Avoiding cluster-level locking 11:56
Read and write stages 12:59
Main vs. worker threads 14:28
Barrier-based parallelization 15:27
Barrier processing in the main thread 17:03
Conclusions and future work 19:46
Thank you 21:16