Description
Note: The problems in this assignment are to be solved by hand. If you want to use a computer to help with the plots, you may, but you will probably find that unnecessary.
- In a 2-class problem with 1 feature, you are given the following data points:
S1: −3, −2, 0    S2: −1, 2
- Give the k-nearest neighbor estimate of p(x|S1) with k = 3, for all x. Give both the algebraic (simplest) form and a plot. On the plot, show the location (x value) and height of each peak. (An optional numerical sketch for checking the density estimates appears after this problem.)
- Give the Parzen Windows estimate of p(x|S2) with window function:
$$\Delta(u)=\begin{cases}0.25, & -2 \le u < 2\\[2pt] 0, & \text{otherwise}\end{cases}$$
Give both the algebraic (simplest) form and a plot. On the plot, show the x value of all significant points, and label the values of p(x|S2) clearly.
- Estimate the prior probabilities P(S1) and P(S2) from the frequency of occurrence of the data points.
- Give an expression for the decision rule for a Bayes minimum-error classifier using the density and probability estimates from (a)-(c). You may leave your answer in terms of p̂(x|S1), p̂(x|S2), P̂(S1), P̂(S2), without plugging in for these quantities.
- Using the estimates you have made above, solve for the decision boundaries and regions of a Bayes minimum error classifier using the density and probability estimates from (a)-(c). Give your answer in 2 forms:
- Algebraic expressions of the decision rule, in simplest form (using numbers and variable x);
- A plot showing the decision boundaries and regions.
Tip: you may find it easiest to develop the algebraic solution and the plot at the same time.
- Classify the points: x=−5, 0.1, 0.5 using the classifier you developed in (e).
- Separately, use a discriminative 3-NN classifier to classify the points x = −5, 0.1, 0.5. (Hint: if this takes you more than a few steps for each data point, you are doing more work than necessary.) A short code sketch of this classifier is included after this problem.
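The note at the top allows using a computer to help with plots. As an optional aid (not the intended hand solution), here is a minimal Python sketch, assuming numpy and matplotlib are available, that numerically evaluates the part (a) and (b) density estimates and the part (c) priors so you can check your hand-drawn plots. The k-NN convention used here, p̂(x) = (k/N)/V(x) with V(x) = 2·(distance from x to its k-th nearest sample), is an assumption; confirm it matches the convention from lecture.

```python
import numpy as np
import matplotlib.pyplot as plt

# Data points given in the problem statement
S1 = np.array([-3.0, -2.0, 0.0])
S2 = np.array([-1.0, 2.0])

def knn_density(x, data, k=3):
    """k-NN density estimate: p_hat(x) = (k / N) / V(x), where V(x) is the
    length of the smallest interval centered at x containing k samples,
    i.e. V(x) = 2 * (distance to the k-th nearest sample).
    This is one common convention; confirm it matches the lecture's."""
    d_k = np.sort(np.abs(data[None, :] - x[:, None]), axis=1)[:, k - 1]
    return (k / len(data)) / (2.0 * d_k)

def parzen_density(x, data):
    """Parzen-window estimate with the given window Delta(u) = 0.25 for
    -2 <= u < 2, else 0.  Delta integrates to 1, so
    p_hat(x) = (1/N) * sum_i Delta(x - x_i)."""
    u = x[:, None] - data[None, :]
    return np.where((u >= -2) & (u < 2), 0.25, 0.0).mean(axis=1)

xs = np.linspace(-8.0, 8.0, 2001)
p1 = knn_density(xs, S1, k=3)      # estimate of p(x|S1), part (a)
p2 = parzen_density(xs, S2)        # estimate of p(x|S2), part (b)

# Frequency-of-occurrence priors, part (c)
N1, N2 = len(S1), len(S2)
print("P_hat(S1) =", N1 / (N1 + N2), "  P_hat(S2) =", N2 / (N1 + N2))

plt.plot(xs, p1, label="k-NN estimate of p(x|S1), k = 3")
plt.plot(xs, p2, label="Parzen estimate of p(x|S2)")
plt.xlabel("x")
plt.legend()
plt.show()
```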
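Similarly, a hedged sketch for checking parts (e)-(g): it applies the generic minimum-error comparison (decide S1 when p̂(x|S1)P̂(S1) exceeds p̂(x|S2)P̂(S2); verify this matches your part (d) expression) to the three test points, then separately runs the discriminative 3-NN majority vote.

```python
import numpy as np

# Training data from the problem statement
S1 = np.array([-3.0, -2.0, 0.0])
S2 = np.array([-1.0, 2.0])
test_points = [-5.0, 0.1, 0.5]

def p1_hat(x):
    """k-NN (k = 3) estimate of p(x|S1): (k/N1) / (2 * dist. to 3rd nearest)."""
    d3 = np.sort(np.abs(S1 - x))[2]
    return 1.0 / (2.0 * d3)

def p2_hat(x):
    """Parzen estimate of p(x|S2) with Delta(u) = 0.25 on [-2, 2)."""
    u = x - S2
    return np.mean(np.where((u >= -2) & (u < 2), 0.25, 0.0))

P1_hat = len(S1) / (len(S1) + len(S2))   # frequency-based priors
P2_hat = len(S2) / (len(S1) + len(S2))

# Bayes minimum-error check, parts (e)-(f)
for x in test_points:
    g = p1_hat(x) * P1_hat - p2_hat(x) * P2_hat   # decide S1 if g > 0
    print(f"x = {x:5.1f}:  Bayes -> {'S1' if g > 0 else 'S2'}")

# Discriminative 3-NN classifier, part (g): majority vote of 3 nearest labels
X = np.concatenate([S1, S2])
y = np.array([1, 1, 1, 2, 2])
for x in test_points:
    votes = y[np.argsort(np.abs(X - x))[:3]]
    print(f"x = {x:5.1f}:  3-NN  -> S{np.bincount(votes).argmax()}")
```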
- [Comment: this problem is on parameter estimation, which is covered in Lecture 26 on Monday, 4/27.]
In a 1D problem (1 feature), we will estimate parameters for one class. We model the density p(x|θ) as:
$$p(x\mid\theta)=\begin{cases}\theta e^{-\theta x}, & x \ge 0\\[2pt] 0, & \text{otherwise}\end{cases}$$
in which θ ≥ 0.
You are given a dataset Z: x1, x2, …, xN, which are drawn i.i.d. from p(x|θ).
In this problem, you may use for convenience the notation:
$$m \triangleq \frac{1}{N}\sum_{i=1}^{N} x_i\,.$$
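Side note (not part of the problem statement): with this notation, and assuming all xi ≥ 0, the i.i.d. likelihood of the dataset factorizes as

$$p(Z\mid\theta)=\prod_{i=1}^{N}\theta e^{-\theta x_i}=\theta^{N}e^{-\theta N m},$$

which is why the quantity m is convenient in parts (a) and (b).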
- Solve for the maximum likelihood (ML) estimate θ̂_ML of θ, in terms of the given data points. Express your result in simplest form. (An optional numerical check for parts (a) and (b) appears after the hint below.)
For parts (b) and (c) below, assume there is a prior for θ, as follows:
$$p(\theta)=\begin{cases}a e^{-a\theta}, & \theta \ge 0\\[2pt] 0, & \text{otherwise}\end{cases}$$
in which a ≥ 0.
- Solve for the maximum a posteriori (MAP) estimate θ̂_MAP of θ, in terms of the given data points. Express your result in simplest form.
- Write θ̂_MAP as a function of θ̂_ML and the given parameters. Find the limit of θ̂_MAP as σ_θ → ∞, in which σ_θ is the standard deviation of the prior on θ. What does this limit correspond to in terms of our prior knowledge of θ?
Hint: the standard deviation of θ for the given p(θ) is σ_θ = 1/a.
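As an optional numerical check for parts (a) and (b) (not required by the assignment), the sketch below maximizes the log-likelihood and the log-posterior directly on synthetic data, so you can compare the optimizer's output with your closed-form expressions. The values a = 2.0, theta_true = 1.5, and the sample size are arbitrary choices made only for this check, and scipy is assumed to be available.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
a = 2.0                     # arbitrary prior parameter, only for this check
theta_true = 1.5            # arbitrary ground truth, only to generate data
x = rng.exponential(scale=1.0 / theta_true, size=50)  # i.i.d. draws from p(x|theta)

def neg_log_likelihood(theta):
    # -log of prod_i theta * exp(-theta * x_i), valid for theta > 0
    return -(len(x) * np.log(theta) - theta * x.sum())

def neg_log_posterior(theta):
    # adds the exponential prior p(theta) = a * exp(-a * theta), theta >= 0
    return neg_log_likelihood(theta) - (np.log(a) - a * theta)

ml = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 100), method="bounded")
mp = minimize_scalar(neg_log_posterior, bounds=(1e-6, 100), method="bounded")

m = x.mean()
print("sample mean m          :", m)
print("numerical ML estimate  :", ml.x)  # compare with your closed-form theta_ML
print("numerical MAP estimate :", mp.x)  # compare with your closed-form theta_MAP
```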
- [Extra credit] Comment: this problem is not more difficult than the regular-credit problems above; it is extra credit because the total length of Problems 1 and 2 above is already sufficient and reasonable for one homework assignment.
In a 2-class problem with D features, you are to use Fisher's Linear Discriminant to find an optimal 1D feature space. You are given that the scatter matrices for each class (calculated from the data for each class) are diagonal:
$$S_1=\begin{bmatrix}\sigma_1^2 & & & 0\\ & \sigma_2^2 & & \\ & & \ddots & \\ 0 & & & \sigma_D^2\end{bmatrix},\qquad S_2=\begin{bmatrix}\rho_1^2 & & & 0\\ & \rho_2^2 & & \\ & & \ddots & \\ 0 & & & \rho_D^2\end{bmatrix}$$
and you are given the sample means for each class:
$$\mathbf{m}_1=\begin{pmatrix}m_1^{(1)}\\ m_2^{(1)}\\ \vdots\\ m_D^{(1)}\end{pmatrix},\qquad \mathbf{m}_2=\begin{pmatrix}m_1^{(2)}\\ m_2^{(2)}\\ \vdots\\ m_D^{(2)}\end{pmatrix}.$$
- Find Fisher's Linear Discriminant w. Express it in simplest form. (An optional numerical sketch for checking parts (a)-(b) appears after part (c).)
- Let D = 2. Suppose σ1² = 4σ2² and ρ1² = 4ρ2², and:
$$\mathbf{m}_1=\begin{pmatrix}2\\ 2\end{pmatrix},\qquad \mathbf{m}_2=\begin{pmatrix}-1\\ 2\end{pmatrix}$$
Plot vectors m1, m2, (m1−m2), and w.
- Interpreting your answer of part (b), which makes more sense for a 1D feature space direction: (m1−m2) or w? Justify your answer.
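As an optional check of part (b) (not required), the sketch below uses the common textbook form w ∝ S_W⁻¹(m1 − m2) with within-class scatter S_W = S1 + S2; the scaling and S_W conventions vary, so verify against the form you derive in part (a). The values σ2² = ρ2² = 1 are arbitrary choices that merely respect the stated 4:1 ratios.

```python
import numpy as np
import matplotlib.pyplot as plt

# Part (b) numbers; sigma2^2 = rho2^2 = 1 is an arbitrary choice respecting the 4:1 ratios
S1 = np.diag([4.0, 1.0])          # sigma1^2 = 4 * sigma2^2
S2 = np.diag([4.0, 1.0])          # rho1^2  = 4 * rho2^2
m1 = np.array([2.0, 2.0])
m2 = np.array([-1.0, 2.0])

# Common textbook Fisher direction: w proportional to S_W^{-1} (m1 - m2),
# with within-class scatter S_W = S1 + S2 (check this against the lecture convention)
Sw = S1 + S2
w = np.linalg.solve(Sw, m1 - m2)

print("m1 - m2 =", m1 - m2)
print("w       =", w)

# Plot the four vectors from the origin, as asked in part (b)
origin = np.zeros(2)
for v, name in [(m1, "m1"), (m2, "m2"), (m1 - m2, "m1 - m2"), (w, "w")]:
    plt.annotate("", xy=v, xytext=origin, arrowprops=dict(arrowstyle="->"))
    plt.text(v[0], v[1], name)
plt.xlim(-2, 3)
plt.ylim(-1, 3)
plt.gca().set_aspect("equal")
plt.show()
```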



