[SOLVED] COEN 140 - Lab5

 Machine Learning and Data Mining

  K-Means Clustering

Problem: The Iris dataset contains 150 data samples from three Iris categories, labeled by outcome values 0, 1, and 2. Each data sample has four attributes: sepal length, sepal width, petal length, and petal width.

 

Here is the code snippet to load the dataset. Note: since K-means clustering is unsupervised learning, we don’t need to split the data into a training set and a test set.

 

from sklearn import datasets
iris = datasets.load_iris()
print(list(iris.keys()))
print(iris.feature_names)

X = iris.data    # each row is a sample
y = iris.target  # target labels

 

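Implement the K-means clustering algorithm to group the samples into K = 3 clusters. Initialize the cluster centers with the first 3 data samples. The objective function to minimize is defined as:

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{kn} \, \lVert \mathbf{m}_k - \mathbf{x}_n \rVert_2^2$$

where $r_{kn} = 1$ if sample $\mathbf{x}_n$ is assigned to cluster $k$ and $0$ otherwise, and $\mathbf{m}_k$ is the center of cluster $k$. Each iteration includes an assignment step and a cluster-center update step.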
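The two steps of one iteration can be sketched as follows. This is a minimal illustration, not a reference solution: the function names `assign_clusters` and `update_centers` are hypothetical, and keeping the old center when a cluster ends up empty is an assumption that the problem statement does not specify.

```python
import numpy as np

def assign_clusters(X, centers):
    """Assignment step: give each sample to its nearest center.

    Returns the cluster index of every sample and the objective J,
    i.e. the sum of squared distances to the assigned centers.
    """
    # Squared Euclidean distances between all samples and all centers,
    # shape (N, K), computed via broadcasting.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    J = d2[np.arange(X.shape[0]), labels].sum()
    return labels, J

def update_centers(X, labels, centers):
    """Update step: move each center to the mean of its assigned samples."""
    new_centers = np.array(centers, dtype=float)  # copy, forced to float
    for k in range(new_centers.shape[0]):
        members = X[labels == k]
        if len(members) > 0:  # assumption: an empty cluster keeps its old center
            new_centers[k] = members.mean(axis=0)
    return new_centers
```

With the Iris data loaded as above, the initial centers required by the problem would be `X[:3]`, and one iteration is `assign_clusters` followed by `update_centers`.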

 

Calculate the objective function value $J$ after the assignment step in each iteration. Exit the iterations if the following criterion is met: $J(\text{Iter}-1) - J(\text{Iter}) < \varepsilon$, where $\varepsilon = 10^{-5}$ and Iter is the iteration number. Plot the objective function value $J$ versus the iteration number Iter. Comment on the result.
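One possible way to organize the full loop, the stopping test, and the plot is sketched below. The names `run_kmeans` and `plot_objective` are hypothetical, and the `max_iter` safety cap and the handling of empty clusters are assumptions added for the sketch. The problem statement specifies only the initialization, the objective, and the ε criterion.

```python
import numpy as np

def run_kmeans(X, K=3, eps=1e-5, max_iter=100):
    """K-means with the first K samples as initial centers.

    Records J after every assignment step and stops once
    J(Iter-1) - J(Iter) < eps. max_iter is an assumed safety cap.
    """
    X = np.asarray(X, dtype=float)
    centers = X[:K].copy()
    history = []  # history[i] is J at iteration i + 1
    for _ in range(max_iter):
        # Assignment step: squared distances, shape (N, K).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        history.append(d2.min(axis=1).sum())
        # Stopping criterion, checked from the second iteration on.
        if len(history) > 1 and history[-2] - history[-1] < eps:
            break
        # Update step: each center becomes the mean of its members.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # assumption: empty cluster keeps its center
                centers[k] = members.mean(axis=0)
    return labels, centers, history

def plot_objective(history):
    """Plot J versus the iteration number Iter."""
    import matplotlib.pyplot as plt  # imported here so run_kmeans works without it
    plt.plot(range(1, len(history) + 1), history, marker="o")
    plt.xlabel("Iter")
    plt.ylabel("J")
    plt.title("K-means objective per iteration")
    plt.show()
```

For the Iris data this would be called as `labels, centers, history = run_kmeans(iris.data)` followed by `plot_objective(history)`. Neither the assignment step nor the update step can increase $J$, so the recorded values should be non-increasing. That property is a natural starting point for the requested comment on the plot.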