Description
Introduction to Machine Learning – Decision Trees, SVM version 1
1 Introduction
In this assignment you have two tasks. They are on decision trees and support vector machines (SVM). These tasks are independent from each other and have smaller related tasks in them which may have different data for you to work on. All the related files can be downloaded from http://user.ceng.metu. edu.tr/~artun/ceng499/hw3_files.zip.
2 Part 1: Decision Tree
Figure 1: Iris dataset explanatory images. Image taken from https://rpubs.com/yao33/611068.
2.1 Dataset The dataset is the Iris Dataset [1]; however, you need to use the version we provided since we split it into train test sets. The task is to classify the iris flower species: Iris Setosa, Iris Versicolor, and Iris Virginica. The data contains the measurements of sepal length, sepal width, petal length, petal width in cm. Therefore, the data is four dimensional. An explanatory image can be seen at Fig. 1. The values are all numerical and there are no missing values. The dataset is labeled and have separate train and test sets. It is in the form of saved numpy array and you can load it using the following code: train_data = np . load( ’hw3_data/ i r i s /train_data . npy ’ ) train_labels = np . load( ’hw3_data/ i r i s / train_labels . npy ’ ) test_data = np . load( ’hw3_data/ i r i s /test_data . npy ’ ) test_labels = np . load( ’hw3_data/ i r i s / test_labels . npy ’ )
1



