COMP9417 Tutorial V – Nonparametric Modelling

Question 1. Expressiveness of Trees
Give decision trees to represent the following Boolean functions, where the variables A, B, C and D take values t or f, and the class value is either True or False. Can you observe any effect of the increasing complexity of the functions on the form of their expression as decision trees?
(a) A ∧ ¬B
(b) A ∨ [B ∧ C]
(c) A XOR B
(d) [A ∧ B] ∨ [C ∧ D]
Question 2. Decision Tree Learning
(a) Assume we learn a decision tree to predict class Y given attributes A, B and C from the following training set, with no pruning.
A B C Y
0 0 0 0
0 0 1 0
0 0 1 0
0 1 0 0
0 1 1 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 0 1
1 1 1 0
1 1 1 1
What would be the training set error for this dataset? Express your answer as the number of examples out of twelve that would be misclassified.
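The key observation is that an unpruned tree can split on every attribute, so each leaf ends up holding exactly the rows that share an (A, B, C) vector; the only unavoidable errors come from contradictory rows (identical attributes, different labels), where the leaf must still predict a single class. A quick check of this reasoning (a sketch, not the tutorial's official solution):

```python
from collections import Counter, defaultdict

# Training rows (A, B, C, Y) from the table above.
data = [
    (0, 0, 0, 0), (0, 0, 1, 0), (0, 0, 1, 0), (0, 1, 0, 0),
    (0, 1, 1, 0), (0, 1, 1, 1), (1, 0, 0, 0), (1, 0, 1, 1),
    (1, 1, 0, 1), (1, 1, 0, 1), (1, 1, 1, 0), (1, 1, 1, 1),
]

# Group rows by their attribute vector; each group is one fully-split leaf.
groups = defaultdict(list)
for a, b, c, y in data:
    groups[(a, b, c)].append(y)

# At each leaf the best the tree can do is predict the majority label,
# so the minority count at each leaf is unavoidable training error.
errors = sum(len(ys) - max(Counter(ys).values()) for ys in groups.values())
print(f"{errors} of {len(data)} misclassified")  # 2 of 12
```

The two errors come from the contradictory pairs at (A,B,C) = (0,1,1) and (1,1,1), each of which contains one row of each class.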
(b) One nice feature of decision tree learners is that they can learn trees to do multi-class classification,
i.e., where the problem is to learn to classify each instance into exactly one of k > 2 classes.
Suppose a decision tree is to be learned on an arbitrary set of data where each instance has a discrete class value that is one of k > 2 classes. What is the maximum training set error, expressed as a fraction, that any dataset could have?
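One way to build intuition for the worst case (an illustrative sketch, with assumed helper names): make every instance identical in its attributes but spread the labels evenly over the k classes. The tree then has a single leaf, its best single prediction gets only 1/k of the rows right, and the training error is (k − 1)/k.

```python
from collections import Counter

def max_training_error(k, n_per_class=1):
    """Worst-case dataset: all instances share the same attribute vector,
    with n_per_class labels in each of the k classes.  Any single leaf
    prediction is correct on only one class's rows."""
    labels = [c for c in range(k) for _ in range(n_per_class)]
    best = max(Counter(labels).values())       # rows the best single guess gets right
    return (len(labels) - best) / len(labels)  # = (k - 1) / k

for k in (3, 4, 10):
    assert max_training_error(k) == (k - 1) / k
print(max_training_error(4))  # 0.75
```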
Question 3. Linear Smoothing
In the lab this week we introduced linear smoothing, also known as kernel smoothing, implemented it from scratch, and applied it to a simulated dataset. The following figure is taken from The Elements of Statistical Learning by Hastie, Tibshirani and Friedman and is an excellent portrayal of the linear smoother at work. The data are simulated from the following data generating process:
Y = sin(4X) + ε, where X ∼ Unif[0, 1] and ε ∼ N(0, (1/3)²).
On the left, we see the ‘nearest-neighbour’ kernel at work, which we refer to as the boxcar kernel in the lab, and on the right we see the Epanechnikov kernel, also introduced in the lab. We include their definitions here:
K(u) = 1{|u| ≤ 1/2}  (box-car kernel)
K(u) = (3/4)(1 − u²) 1{|u| ≤ 1}  (Epanechnikov kernel)
Recall also from the lab that a linear smoother prediction takes the form:
ŷ(x) = Σᵢ wᵢ(x) yᵢ, with weights wᵢ(x) = K((x − xᵢ)/h) / Σⱼ K((x − xⱼ)/h).
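As a reference point while answering the question, here is a minimal NumPy sketch of the two kernels and a kernel-weighted-average (Nadaraya–Watson style) smoother, applied to data simulated in the style of the ESL figure. The bandwidth, sample size, and function names are illustrative assumptions, not the lab's exact code.

```python
import numpy as np

def boxcar(u):
    """Box-car (nearest-neighbour-style) kernel: 1{|u| <= 1/2}."""
    return (np.abs(u) <= 0.5).astype(float)

def epanechnikov(u):
    """Epanechnikov kernel: (3/4)(1 - u^2) on |u| <= 1, else 0."""
    return 0.75 * (1 - u**2) * (np.abs(u) <= 1)

def smooth(x0, x, y, kernel, h):
    """Linear smoother: prediction at x0 is a kernel-weighted
    average of the observed y values."""
    w = kernel((x0 - x) / h)
    return np.sum(w * y) / np.sum(w)

# Simulated data in the style of the figure (assumed process):
# X ~ Unif[0, 1], Y = sin(4X) + N(0, (1/3)^2) noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(4 * x) + rng.normal(0, 1 / 3, 100)

# Fitted curve on a grid; plotting this against sin(4x) reproduces
# the qualitative picture in the figure.
grid = np.linspace(0.05, 0.95, 19)
fit = [smooth(x0, x, y, epanechnikov, h=0.2) for x0 in grid]
```

Swapping `epanechnikov` for `boxcar` shows the contrast in the two panels: the box-car weights jump discontinuously as points enter and leave the window, giving a jagged fit, while the Epanechnikov weights decay smoothly to zero, giving a smooth one.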

Review the lab Linear smoothing section and then write down a few lines describing what is happening in the figure. Be sure to describe in detail what each of the following represents:
(i) the blue curve
(ii) the black scatter
(iii) the red scatter
(iv) the yellow region
(v) the horizontal red line
(vi) the red point on the horizontal red line
(vii) the green curve