An Empirical Comparison of Stream Clustering Algorithms

Analysing streaming data has received considerable attention in recent years. A key research area in this field is stream clustering, which aims to recognize patterns in a possibly unbounded data stream of varying speed and structure. Over the past decades, a multitude of new stream clustering algorithms have been proposed. However, to the best of our knowledge, no rigorous analysis and comparison of the different approaches has been performed. Our paper fills this gap and provides extensive experiments for a total of ten popular algorithms. We utilize a number of standard data sets, both real and synthetic, and identify key weaknesses and strengths of the existing algorithms.

Proceedings of the ACM International Conference on Computing Frontiers (CF '17)