[SOLVED] Machine learning Homework 2


I. Pen-and-paper [13v]

Four positive observations, {(A, 0), (B, 1), (A, 1), (A, 0)}, and four negative observations, {(B, 0), (B, 0), (A, 1), (B, 1)}, were collected. Consider the problem of classifying observations as positive or negative.

1) [4v] Compute the recall of a distance-weighted $k$NN with $k = 5$ and distance $d(\mathbf{x}_1, \mathbf{x}_2) = \mathrm{Hamming}(\mathbf{x}_1, \mathbf{x}_2) + \frac{1}{2}$ using a leave-one-out evaluation schema (i.e., when classifying one observation, use all remaining ones).

An additional positive observation was acquired, (B, 0), and a third variable $y_3$ was independently monitored, yielding estimates $y_3|P = \{1.2, 0.8, 0.5, 0.9, 0.8\}$ and $y_3|N = \{1, 0.9, 1.2, 0.8\}$.

2) [4v] Considering the nine training observations, learn a Bayesian classifier assuming: i) $y_1$ and $y_2$ are dependent, ii) the variable sets $\{y_1, y_2\}$ and $\{y_3\}$ are independent and equally important, and iii) $y_3$ is normally distributed. Show all parameters.
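The parameters requested in question 2 are the class priors, the joint probability mass function of ($y_1$, $y_2$) per class, and the mean and standard deviation of $y_3$ per class. The following is a minimal sketch of how they could be estimated. It assumes relative-frequency estimates without smoothing and the sample (n - 1) standard deviation; the statement does not say which deviation estimator to use, and the `params` layout is illustrative.

```python
from collections import Counter
from statistics import NormalDist

# Nine training observations: (y1, y2) pairs per class, plus y3 samples per class.
pairs = {
    "Positive": [("A", 0), ("B", 1), ("A", 1), ("A", 0), ("B", 0)],
    "Negative": [("B", 0), ("B", 0), ("A", 1), ("B", 1)],
}
y3 = {
    "Positive": [1.2, 0.8, 0.5, 0.9, 0.8],
    "Negative": [1.0, 0.9, 1.2, 0.8],
}

total = sum(len(observed) for observed in pairs.values())
params = {}
for label, observed in pairs.items():
    counts = Counter(observed)
    params[label] = {
        # Class prior as the fraction of training observations in the class.
        "prior": len(observed) / total,
        # Joint PMF of (y1, y2), estimated by relative frequency (no smoothing).
        "joint": {(a, b): counts[(a, b)] / len(observed) for a in "AB" for b in (0, 1)},
        # Assumption: sample (n - 1) standard deviation, which from_samples uses.
        "y3": NormalDist.from_samples(y3[label]),
    }

for label, p in params.items():
    print(label, "prior:", p["prior"])
    print("  joint:", p["joint"])
    print("  y3 mean, stdev:", p["y3"].mean, p["y3"].stdev)
```

Because $y_1$ and $y_2$ are dependent, the sketch stores one probability per ($y_1$, $y_2$) combination instead of two separate marginals.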

Consider the three testing observations {((A, 1, 0.8), Positive), ((B, 1, 1), Positive), ((B, 0, 0.9), Negative)}.

3) [3v] Under a MAP assumption, compute $P(\text{Positive} \mid \mathbf{x})$ for each testing observation.
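For question 3, the posterior follows from Bayes' rule, with the likelihood factored as $P(y_1, y_2 \mid c)\, p(y_3 \mid c)$ under the stated independence assumption. The sketch below re-estimates the parameters so that it runs on its own. It again assumes the sample (n - 1) standard deviation and no smoothing, and `joint_score` and `posterior_positive` are illustrative helper names.

```python
from collections import Counter
from statistics import NormalDist

pairs = {
    "Positive": [("A", 0), ("B", 1), ("A", 1), ("A", 0), ("B", 0)],
    "Negative": [("B", 0), ("B", 0), ("A", 1), ("B", 1)],
}
y3 = {
    "Positive": [1.2, 0.8, 0.5, 0.9, 0.8],
    "Negative": [1.0, 0.9, 1.2, 0.8],
}
total = sum(len(observed) for observed in pairs.values())


def joint_score(x, label):
    """P(c) * P(y1, y2 | c) * p(y3 | c) for an observation x = (y1, y2, y3)."""
    prior = len(pairs[label]) / total
    pmf = Counter(pairs[label])[(x[0], x[1])] / len(pairs[label])
    # Assumption: Gaussian fitted with the sample (n - 1) standard deviation.
    density = NormalDist.from_samples(y3[label]).pdf(x[2])
    return prior * pmf * density


def posterior_positive(x):
    """Bayes' rule with the evidence written as the sum over both classes."""
    pos = joint_score(x, "Positive")
    neg = joint_score(x, "Negative")
    return pos / (pos + neg)


test_set = [
    (("A", 1, 0.8), "Positive"),
    (("B", 1, 1.0), "Positive"),
    (("B", 0, 0.9), "Negative"),
]
posteriors = [posterior_positive(x) for x, _ in test_set]
for (x, label), p in zip(test_set, posteriors):
    print(x, "true class:", label, "P(Positive | x) =", round(p, 3))
```

The evidence term cancels nothing here; it is computed explicitly so that the two class posteriors sum to one.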

4) [2v] Given a binary class variable, the default decision threshold of $\theta = 0.5$,

$$f(\mathbf{x} \mid \theta) = \begin{cases} \text{Positive} & \text{if } P(\text{Positive} \mid \mathbf{x}) > \theta \\ \text{Negative} & \text{otherwise} \end{cases}$$

can be adjusted. Which decision threshold (0.3, 0.5 or 0.7) optimizes testing accuracy?
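The threshold comparison in question 4 amounts to applying the decision rule at each candidate value and counting correct predictions. The sketch below shows the mechanics only. The posteriors and labels in it are hypothetical placeholders chosen for illustration, not the values of the three testing observations, and `classify` and `accuracy` are illustrative names.

```python
def classify(posterior, theta):
    """Decision rule from the statement: Positive only if the posterior exceeds theta."""
    return "Positive" if posterior > theta else "Negative"


def accuracy(posteriors, labels, theta):
    """Fraction of observations whose thresholded prediction matches the true label."""
    hits = sum(classify(p, theta) == y for p, y in zip(posteriors, labels))
    return hits / len(labels)


# Hypothetical values for illustration only; replace them with the posteriors
# computed for the actual testing observations.
example_posteriors = [0.9, 0.6, 0.4, 0.2]
example_labels = ["Positive", "Positive", "Positive", "Negative"]

scores = {theta: accuracy(example_posteriors, example_labels, theta)
          for theta in (0.3, 0.5, 0.7)}
for theta, score in scores.items():
    print(f"theta = {theta}: accuracy = {score:.2f}")
```

Note that the rule uses a strict inequality, so a posterior exactly equal to the threshold is classified as Negative.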

II. Programming and critical analysis [7v]

Consider the pd_speech.arff dataset available at the course webpage.

5) [3v] Using sklearn, considering a 10-fold stratified cross-validation (random=0), plot the cumulative testing confusion matrices of $k$NN (uniform weights, $k = 5$, Euclidean distance) and Naïve Bayes (Gaussian assumption). Use all remaining classifier parameters as default.
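One possible way to accumulate the testing confusion matrices for question 5 is sketched below. It makes several assumptions: "random=0" is read as `shuffle=True, random_state=0` in `StratifiedKFold`, the feature matrix `X` and label vector `y` have already been loaded from pd_speech.arff (for example with `scipy.io.arff.loadarff`), and the helper names `cumulative_confusion`, `build_models` and `plot_matrices` are illustrative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def build_models():
    """The two classifiers named in the question; other parameters stay at defaults."""
    return {
        "kNN": KNeighborsClassifier(n_neighbors=5, weights="uniform", metric="euclidean"),
        "Naive Bayes": GaussianNB(),
    }


def cumulative_confusion(model, X, y, n_splits=10, seed=0):
    """Sum the per-fold testing confusion matrices of `model`."""
    X, y = np.asarray(X), np.asarray(y)
    labels = np.unique(y)
    # Assumption: "random=0" means shuffled folds with random_state=0.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    total = np.zeros((len(labels), len(labels)), dtype=int)
    for train_idx, test_idx in folds.split(X, y):
        model.fit(X[train_idx], y[train_idx])
        predictions = model.predict(X[test_idx])
        total += confusion_matrix(y[test_idx], predictions, labels=labels)
    return total, labels


def plot_matrices(matrices, labels):
    """Draw one confusion matrix per classifier, side by side."""
    import matplotlib.pyplot as plt
    from sklearn.metrics import ConfusionMatrixDisplay

    fig, axes = plt.subplots(1, len(matrices), figsize=(5 * len(matrices), 4))
    for ax, (name, matrix) in zip(np.atleast_1d(axes), matrices.items()):
        ConfusionMatrixDisplay(matrix, display_labels=labels).plot(ax=ax, colorbar=False)
        ax.set_title(name)
    fig.tight_layout()
    return fig
```

With `X` and `y` in hand, calling `cumulative_confusion` once per entry of `build_models()` and passing the resulting dictionary of matrices to `plot_matrices` would produce the two plots. Each matrix sums to the number of observations, since every observation is tested exactly once.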

6) [2v] Using scipy, test the hypothesis "$k$NN is statistically superior to Naïve Bayes regarding accuracy", asserting whether it is true.
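A common way to approach question 6 is a one-sided paired t-test on the per-fold accuracies, since both classifiers are evaluated on the same folds. The sketch below takes that route as an assumption; the statement does not name a specific test. It also assumes `X` and `y` are already loaded, shuffled folds with `random_state=0`, and a SciPy version that supports the `alternative` argument of `ttest_rel`. The function name `compare_accuracy` is illustrative.

```python
from scipy import stats
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def compare_accuracy(X, y, seed=0):
    """Per-fold accuracies of both models and a one-sided paired t-test (kNN > NB)."""
    # The same seeded splitter yields identical folds for both classifiers,
    # which is what makes the samples paired.
    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    knn = KNeighborsClassifier(n_neighbors=5, weights="uniform", metric="euclidean")
    knn_acc = cross_val_score(knn, X, y, cv=folds, scoring="accuracy")
    nb_acc = cross_val_score(GaussianNB(), X, y, cv=folds, scoring="accuracy")
    # Alternative hypothesis: mean(knn_acc - nb_acc) > 0.
    result = stats.ttest_rel(knn_acc, nb_acc, alternative="greater")
    return knn_acc, nb_acc, result.statistic, result.pvalue
```

A p-value below the chosen significance level (for instance 0.05) would support the claim that $k$NN is superior; a large p-value means the data do not support it, which is not the same as showing that Naïve Bayes is better.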

7) [2v] Enumerate three possible reasons that could underlie the observed differences in predictive accuracy between $k$NN and Naïve Bayes.

END