CSC3022H Lab 2: K-Means Clustering


Problem Description

Implement (in C++) the K-means clustering algorithm [MacQueen, 1967] with a Euclidean distance metric. See online tutorials at:

  • http://www.saedsayad.com/clustering_kmeans.htm
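The distance metric and the "closest centroid" rule could be sketched as follows. This is only an illustration: the `Point` struct and the function names `euclidean` and `nearestCentroid` are assumptions, not part of the lab specification, and breaking ties in favour of the lowest cluster index is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type (attributes X and Y from the lab data).
struct Point {
    double x;
    double y;
};

// Euclidean distance between two points.
double euclidean(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Index of the centroid closest to p. Ties go to the lowest index
// (an assumption; the handout does not say how to break ties).
std::size_t nearestCentroid(const Point& p, const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < centroids.size(); ++i) {
        if (euclidean(p, centroids[i]) < euclidean(p, centroids[best])) {
            best = i;
        }
    }
    return best;
}
```

Because the square root is monotonic, comparing squared distances would select the same centroid and avoid the `std::hypot` call.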

Use the implemented K-means algorithm to cluster the following 8 examples (Table 1) into 3 clusters.

When running K-means, set the initial seeds (initial centroid of each cluster) as examples 1, 4 and 7.

Table 1: Data (examples have two attributes, X and Y, both in the range [1, 10]).

Example Number    X     Y
1                 2    10
2                 2     5
3                 8     4
4                 5     8
5                 7     5
6                 6     4
7                 1     2
8                 4     9
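One possible way to hold Table 1 in memory and pick the required seeds is sketched below. The `Point` struct and the helper names `tableOneData` and `initialCentroids` are hypothetical. In practice the data might be read from a file instead of being hard-coded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type (attributes X and Y).
struct Point {
    double x;
    double y;
};

// Table 1 in example order: element i holds example number i + 1.
std::vector<Point> tableOneData() {
    return {{2.0, 10.0}, {2.0, 5.0}, {8.0, 4.0}, {5.0, 8.0},
            {7.0, 5.0},  {6.0, 4.0}, {1.0, 2.0}, {4.0, 9.0}};
}

// Initial centroids are copies of examples 1, 4 and 7,
// which are zero-based indices 0, 3 and 6.
std::vector<Point> initialCentroids(const std::vector<Point>& data) {
    assert(data.size() >= 7);
    const std::array<std::size_t, 3> seedIndex{0, 3, 6};
    std::vector<Point> centroids;
    for (std::size_t i : seedIndex) {
        centroids.push_back(data[i]);
    }
    return centroids;
}
```

Storing the examples in table order keeps the mapping between a vector index and the example number printed in the output file a simple `index + 1`.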

Question 1: How many iterations are needed for K-means to converge?
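A minimal sketch of the assign-and-update loop follows. It assumes that "converged" means a full pass in which no example changes cluster. The handout does not define the stopping rule, so the reported count depends on this convention (here the final, unchanged pass is counted too). The names `KMeansResult`, `squaredDistance` and `runKMeans` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point {
    double x;
    double y;
};

struct KMeansResult {
    std::vector<std::size_t> assignment;  // zero-based cluster index per example
    std::vector<Point> centroids;         // final centroid of each cluster
    int iterations;                       // passes run, including the last unchanged one
};

// Squared Euclidean distance; it ranks centroids the same way as the true distance.
double squaredDistance(const Point& a, const Point& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

KMeansResult runKMeans(const std::vector<Point>& data, std::vector<Point> centroids) {
    assert(!data.empty() && !centroids.empty());
    const std::size_t k = centroids.size();
    std::vector<std::size_t> assignment(data.size(), k);  // k marks "unassigned"
    int iterations = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        ++iterations;
        // Assignment step: move each example to its nearest centroid.
        for (std::size_t i = 0; i < data.size(); ++i) {
            std::size_t best = 0;
            for (std::size_t c = 1; c < k; ++c) {
                if (squaredDistance(data[i], centroids[c]) <
                    squaredDistance(data[i], centroids[best])) {
                    best = c;
                }
            }
            if (best != assignment[i]) {
                assignment[i] = best;
                changed = true;
            }
        }
        // Update step: each centroid becomes the mean of its members.
        std::vector<Point> sum(k, Point{0.0, 0.0});
        std::vector<std::size_t> count(k, 0);
        for (std::size_t i = 0; i < data.size(); ++i) {
            sum[assignment[i]].x += data[i].x;
            sum[assignment[i]].y += data[i].y;
            ++count[assignment[i]];
        }
        for (std::size_t c = 0; c < k; ++c) {
            // Assumption: an empty cluster keeps its previous centroid.
            if (count[c] > 0) {
                const double n = static_cast<double>(count[c]);
                centroids[c] = Point{sum[c].x / n, sum[c].y / n};
            }
        }
    }
    return KMeansResult{assignment, centroids, iterations};
}
```

Calling `runKMeans` with the Table 1 data and the three seed examples returns the final assignment, the centroids, and the pass count under the convention stated above. A different stopping rule, such as "centroids no longer move", may report a different number.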

In a text file, output the results of each iteration: for each cluster, list the examples assigned to it, followed by the cluster's centroid. For example:

Iteration 1

Cluster 1: 1, 2, 3

Centroid: (3.0, 9.5)

Cluster 2: 4, 5, 6

Centroid: (6.5, 5.25)

Cluster 3: 7, 8

Centroid: (1.5, 3.5)

...

Iteration N

Cluster 1: 8, 7, 6

Centroid: (1.5, 3.5)

Cluster 2: 5, 4, 3

Centroid: (6.5, 5.25)

Cluster 3: 2, 1

Centroid: (3.0, 9.5)
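The layout above could be produced by a small formatting helper such as the sketch below. The function name `formatIteration` and the choice of two decimal places are assumptions; the sample output does not fix the number of decimals, so adjust the precision if a different format is expected.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

struct Point {
    double x;
    double y;
};

// Builds the text for one iteration: each cluster's member examples
// (one-based numbers) followed by that cluster's centroid.
std::string formatIteration(int iteration,
                            const std::vector<std::size_t>& assignment,
                            const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::ostringstream out;
    out << std::fixed << std::setprecision(2);  // assumed precision
    out << "Iteration " << iteration << "\n";
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        out << "Cluster " << (c + 1) << ":";
        bool first = true;
        for (std::size_t i = 0; i < assignment.size(); ++i) {
            if (assignment[i] != c) {
                continue;
            }
            out << (first ? " " : ", ") << (i + 1);
            first = false;
        }
        out << "\nCentroid: (" << centroids[c].x << ", " << centroids[c].y << ")\n";
    }
    return out.str();
}
```

The returned string can be streamed into a `std::ofstream` once per iteration to build the output text file.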

In a ZIP file, place the source code, makefile, and the output text file (the answer to Question 1). Upload the ZIP file.

References

[MacQueen, 1967] MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, pages 281–297, Berkeley, USA. University of California Press.