# Fast and Accurate k-means For Large Datasets

Published on Jan 25, 2012 · 8463 Views

Clustering is a popular problem with many applications. We consider the $k$-means problem in the situation where the data is too large to be stored in main memory and must be accessed sequentially, su
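As a rough illustration of the setting described above, the sketch below evaluates the standard $k$-means cost (the sum of squared distances from each point to its nearest center) in a single sequential pass over the data, holding only one chunk in memory at a time. It is not the algorithm from the talk: the function name `kmeans_cost_streaming`, the chunked input format, and the use of NumPy are all assumptions made for the example.

```python
import numpy as np

def kmeans_cost_streaming(chunks, centers):
    """Summed squared distance of every point to its nearest center.

    `chunks` is any iterable yielding arrays of shape (n_i, d); it is read
    once, in order, so the full dataset never has to fit in memory.
    (Hypothetical helper for illustration, not taken from the talk.)
    """
    centers = np.asarray(centers, dtype=float)   # shape (k, d)
    total = 0.0
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=float)   # shape (n_i, d)
        # Pairwise squared distances between points and centers: (n_i, k).
        d2 = ((chunk[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Each point contributes the distance to its closest center.
        total += d2.min(axis=1).sum()
    return total

# Two chunks of 2-D points and two candidate centers.
chunks = [[[0.0, 0.0], [1.0, 0.0]], [[10.0, 0.0]]]
centers = [[0.0, 0.0], [10.0, 0.0]]
cost = kmeans_cost_streaming(chunks, centers)
```

In this toy input only the point `(1, 0)` is away from its nearest center, by a distance of 1, so the remaining points add nothing to the total.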

#### Chapter list

00:00 Fast and Accurate k-means for Large Data Sets

00:17 K-means Clustering

01:08 Algorithms for solving k-means

01:55 K-means for Large Datasets

02:08 Streaming k-means

04:05 Improvements/Contributions

05:17 More relevant algorithms for streaming k-means

05:34 Experimental Setup

06:11 Time to Compute Solution

06:23 Cost (Summed Squared)

06:41 Bottleneck in Algorithm Runtime

08:03 Compute Actual Distance to Those Neighbors

08:15 Substantially Faster

08:42 Cost change is (usually) minor

08:55 Conclusion

09:37 Acknowledgments