Fast and Accurate k-means For Large Datasets

Published on Jan 25, 2012
8465 Views

Clustering is a popular problem with many applications. We consider the $k$-means problem in the situation where the data is too large to be stored in main memory and must be accessed sequentially, su

Chapter list

Fast and Accurate k-means for Large Data Sets 00:00
K-means Clustering 00:17
Algorithms for solving k-means 01:08
K-means for Large Datasets 01:55
Streaming k-means 02:08
Improvements/Contributions 04:05
More relevant algorithms for streaming k-means 05:17
Experimental Setup 05:34
Time to Compute Solution 06:11
Cost (Summed Squared) 06:23
Bottleneck in Algorithm Runtime 06:41
Compute Actual Distance to Those Neighbors 08:03
Substantially Faster 08:15
Cost change is (usually) minor 08:42
Conclusion 08:55
Acknowledgments 09:37