Generalized Markov Chain Monte Carlo Initialization for Clustering Gaussian Mixtures Using K-means

Ritu Rajawat, Iti Sharma

Abstract

Gaussian mixtures are considered a good model of real-life data, so a clustering algorithm that clusters such mixtures efficiently is expected to work well in practical applications. K-means is popular for these applications because it is easy to implement and scales well, but its results depend heavily on the quality of the initial seeds. In particular, when the Gaussian mixture has overlapping clusters, k-means cannot separate them unless the initial centers are well placed. K-means++ is a good seeding method, but it has high time complexity. It can be made faster by using Markov chain Monte Carlo sampling. This paper proposes a method that improves seed quality while retaining the speed of the sampling technique. The desired effects are demonstrated on several Gaussian mixtures.
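To make the idea of sampling-based seeding concrete, the following is a minimal sketch of approximating k-means++ seeding with a Metropolis-Hastings chain that uses a uniform proposal. It is not the generalized method proposed in the paper, whose details are not given in this abstract. The function name `mcmc_seed`, the parameter `chain_length`, and the synthetic three-component mixture are all assumptions chosen for illustration.

```python
import numpy as np

def mcmc_seed(X, k, chain_length=200, rng=None):
    """Illustrative sketch: approximate k-means++ seeding with a short
    Metropolis-Hastings chain (uniform proposal). Hypothetical helper,
    not the paper's method."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # First center: a data point chosen uniformly at random.
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        C = np.asarray(centers)
        # Propose candidate points uniformly at random.
        cand = rng.integers(n, size=chain_length)
        # Squared distance from each candidate to its nearest current center.
        diff = X[cand][:, None, :] - C[None, :, :]
        d2 = (diff ** 2).sum(axis=2).min(axis=1)
        cur, cur_d2 = cand[0], d2[0]
        u = rng.random(chain_length)
        for j in range(1, chain_length):
            # Accept the proposal with probability min(1, d2[j] / cur_d2).
            if cur_d2 == 0.0 or d2[j] / cur_d2 > u[j]:
                cur, cur_d2 = cand[j], d2[j]
        # The final state of the chain becomes the next center.
        centers.append(X[cur])
    return np.asarray(centers)

# Synthetic Gaussian mixture (assumed data, for illustration only).
data_rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([data_rng.normal(m, 1.0, size=(100, 2)) for m in means])
seeds = mcmc_seed(X, k=3, chain_length=50, rng=1)
```

Each new center costs only `chain_length` distance evaluations per existing center rather than a full pass over the data, which is where the speed-up over exact k-means++ comes from. A longer chain follows the k-means++ sampling distribution more closely at higher cost.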

Article Details

How to Cite
Rajawat, R., and I. Sharma. “Generalized Markov Chain Monte Carlo Initialization for Clustering Gaussian Mixtures Using K-Means”. International Journal on Recent and Innovation Trends in Computing and Communication, vol. 6, no. 5, May 2018, pp. 89-93, doi:10.17762/ijritcc.v6i5.1582.