Name: CS412-Homework 5 Solved

Question 1

Suppose we want to predict whether a restaurant is popular based on its price, delivery, and cuisine, using the training data shown in Table 1. Answer the following questions.

Table 1: Training dataset (P – popular, NP – not popular)

ID	Cuisine	Price	Delivery	Popularity
1	Thai	$$	Yes	P
2	Korean	$$$	No	NP
3	Thai	$$	Yes	P
4	American	$	Yes	P
5	American	$	No	P
6	Korean	$$	No	P
7	Thai	$	Yes	P
8	Korean	$$	Yes	P
9	American	$$$	No	NP
10	American	$	Yes	NP

1a. Based on the training data, we want to construct a Naive Bayes classifier. (No smoothing is required.) Please estimate the following terms:

1a(i). [4] Pr(Popularity = ‘P’)

1a(ii). [4] Pr(Popularity = ‘NP’)

1a(iii). [4] Pr(Price = ‘$’, Delivery = ‘Yes’, Cuisine = ‘Korean’ | Popularity = ‘P’)

1a(iv). [4] Pr(Price = ‘$’, Delivery = ‘Yes’, Cuisine = ‘Korean’ | Popularity = ‘NP’)

1b. [6] Suppose a restaurant has the values Price = ‘$’, Delivery = ‘Yes’, Cuisine = ‘Korean’. Based on the calculations in part 1a, is this restaurant classified as popular?
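As a rough check on parts 1a and 1b, the counts can be tallied directly from Table 1. The sketch below is illustrative only: the names `rows`, `prior`, and `likelihood` are assumptions introduced here, and the joint conditional probability is factored per attribute under the Naive Bayes independence assumption, with no smoothing.

```python
from fractions import Fraction

# Rows of Table 1 as (cuisine, price, delivery, popularity).
rows = [
    ("Thai", "$$", "Yes", "P"),
    ("Korean", "$$$", "No", "NP"),
    ("Thai", "$$", "Yes", "P"),
    ("American", "$", "Yes", "P"),
    ("American", "$", "No", "P"),
    ("Korean", "$$", "No", "P"),
    ("Thai", "$", "Yes", "P"),
    ("Korean", "$$", "Yes", "P"),
    ("American", "$$$", "No", "NP"),
    ("American", "$", "Yes", "NP"),
]

def prior(label):
    """Fraction of training rows carrying the given class label."""
    return Fraction(sum(r[3] == label for r in rows), len(rows))

def likelihood(query, label):
    """Product of per-attribute conditional frequencies (no smoothing)."""
    in_class = [r for r in rows if r[3] == label]
    prob = Fraction(1)
    for column, value in query.items():
        prob *= Fraction(sum(r[column] == value for r in in_class), len(in_class))
    return prob

# Query from part 1b, keyed by column index: cuisine, price, delivery.
query = {0: "Korean", 1: "$", 2: "Yes"}
scores = {label: prior(label) * likelihood(query, label) for label in ("P", "NP")}
prediction = max(scores, key=scores.get)

for label in ("P", "NP"):
    print(label, prior(label), likelihood(query, label), scores[label])
print("predicted:", prediction)
```

Exact fractions are used so the printed values can be compared with a hand calculation without rounding noise.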

1c. [4] Design an ensemble method for Naive Bayes to further improve the accuracy and briefly describe the steps.

1d. [4] Describe the metrics that can effectively evaluate the classification of data with rare positive examples.
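To make the rare-positive setting in 1d concrete, here is a small sketch of precision, recall, and F1 computed from label lists. The labels `y_true` and `y_pred` are made-up illustrative values, not data from the assignment, and the function name `precision_recall_f1` is an assumption. Returning 0.0 when a denominator is zero is also a convention chosen here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels: 2 positives among 10 examples.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

In this made-up example accuracy is 0.8 while precision and recall are both 0.5, which shows how accuracy alone can look good when positives are rare.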

Question 2

We have eight training points, which are plotted in Figure 1.

x1	x2	y
1	0.5	+1
2	1.2	+1
2.5	2	+1
3	2	+1
1.5	2	-1
2.3	3	-1
1.2	1.9	-1
0.8	1	-1

Table 2: Dataset

Figure 1: 2-D scatterplot of the dataset

Also, we have four test points with their true labels. Please answer the following questions.

Table 3: Test Dataset

x1	x2	y
2.7	2.7	+1
2.5	1	+1
1.5	2.5	-1
1.2	1	-1

2a. [10] Perform k-nearest neighbor classification with K = 1. What is the testing error? (Please use Euclidean distance. Show your reasoning.)

2b. [10] Do the same thing as question 2a with K = 2.

2c. [10] A linear classifier f(x) = a ∗ x₁ + b ∗ x₂ + c works as follows: if f(x) >= 0, predict x as +1; otherwise, predict x as −1. Design a reasonable linear classifier (i.e., choose proper a, b, c). What is the training error? What about the testing error? Show your reasoning. (Your design doesn’t need to be optimal.)
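One way to evaluate a candidate classifier for 2c is to code the decision rule and count mistakes on both tables. The coefficients below (a = 1, b = -1, c = 0, i.e. predict +1 when x₁ >= x₂) are just one illustrative choice made for this sketch, not the intended or optimal answer. Note that the test point (2.7, 2.7) lies exactly on this boundary and is labeled +1 by the f(x) >= 0 rule.

```python
# Training points from Table 2 and test points from Table 3 as ((x1, x2), y).
train = [
    ((1, 0.5), 1), ((2, 1.2), 1), ((2.5, 2), 1), ((3, 2), 1),
    ((1.5, 2), -1), ((2.3, 3), -1), ((1.2, 1.9), -1), ((0.8, 1), -1),
]
test = [((2.7, 2.7), 1), ((2.5, 1), 1), ((1.5, 2.5), -1), ((1.2, 1), -1)]

def predict(x, a, b, c):
    """Apply the rule from 2c: +1 if a*x1 + b*x2 + c >= 0, otherwise -1."""
    return 1 if a * x[0] + b * x[1] + c >= 0 else -1

def error_rate(data, a, b, c):
    """Fraction of points in data that the classifier mislabels."""
    return sum(predict(x, a, b, c) != y for x, y in data) / len(data)

# Illustrative coefficients (an assumption of this sketch).
a, b, c = 1.0, -1.0, 0.0
train_error = error_rate(train, a, b, c)
test_error = error_rate(test, a, b, c)
print(train_error, test_error)
```

Swapping in other values of `a`, `b`, and `c` gives a quick way to compare candidate boundaries.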

2d. [10] Compare KNN and a linear classification method (e.g., SVM). You may draw conclusions based on your experience with questions 2a-2c.

Question 3

Suppose we want to cluster the following 13 points:

index	x1	x2
1	1	3
2	1	2
3	2	1
4	2	2
5	2	3
6	3	2
7	5	3
8	4	3
9	4	5
10	5	4
11	5	5
12	6	4
13	6	5

Table 4: Dataset

Figure 2: 2-D scatterplot of the dataset

3a. [10] Cluster the above points using the K-means algorithm with k = 2. Please use (0, 3) and (6, 4) as the initial center points for the two clusters. Show your reasoning.
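The assign-then-update loop of K-means for 3a can be sketched in plain Python as below. It is a minimal sketch under stated assumptions: the function name `kmeans` is introduced here, iteration stops when the centers no longer change, and an empty cluster (which does not occur for this data) would simply keep its previous center.

```python
import math

# Points from Table 4, keyed by index.
points = {
    1: (1, 3), 2: (1, 2), 3: (2, 1), 4: (2, 2), 5: (2, 3), 6: (3, 2), 7: (5, 3),
    8: (4, 3), 9: (4, 5), 10: (5, 4), 11: (5, 5), 12: (6, 4), 13: (6, 5),
}

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and mean-update steps until the centers are stable."""
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for idx, p in points.items():
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(idx)
        # Update step: each center moves to the mean of its members.
        new_centers = [
            tuple(sum(points[i][d] for i in members) / len(members) for d in range(2))
            if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters, centers

clusters, centers = kmeans(points, [(0, 3), (6, 4)])
print(clusters)
print(centers)
```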

3b. [10] Now we want to use DBSCAN, a density-based algorithm, with MinPts = 2 and Eps = 1.5. Outline your clustering process.
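A compact DBSCAN sketch for 3b, run on the points of Table 4, is given below. It assumes the common convention that a point's Eps-neighborhood includes the point itself when compared against MinPts, and that "within Eps" means distance <= Eps; the assignment does not spell out either detail. The function names are assumptions of this sketch, and noise is marked with the label -1.

```python
import math

# Points from Table 4, keyed by index.
points = {
    1: (1, 3), 2: (1, 2), 3: (2, 1), 4: (2, 2), 5: (2, 3), 6: (3, 2), 7: (5, 3),
    8: (4, 3), 9: (4, 5), 10: (5, 4), 11: (5, 5), 12: (6, 4), 13: (6, 5),
}

def region_query(idx, eps):
    """Indices of all points within eps of point idx (including idx itself)."""
    return [j for j in points if math.dist(points[idx], points[j]) <= eps]

def dbscan(eps, min_pts):
    """Return a dict mapping point index to cluster id (-1 means noise)."""
    labels = {}
    cluster_id = 0
    for idx in points:
        if idx in labels:
            continue
        neighbors = region_query(idx, eps)
        if len(neighbors) < min_pts:
            labels[idx] = -1  # provisionally noise; may become a border point
            continue
        cluster_id += 1
        labels[idx] = cluster_id
        frontier = [j for j in neighbors if j != idx]
        while frontier:
            j = frontier.pop()
            if labels.get(j) == -1:
                labels[j] = cluster_id  # former noise becomes a border point
            if j in labels:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(j, eps)
            if len(j_neighbors) >= min_pts:
                frontier.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

labels = dbscan(eps=1.5, min_pts=2)
print(labels)
```

Printing `region_query(6, 1.5)` and `region_query(8, 1.5)` is a quick way to see which points link the left and right groups.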

3c. [10] Please perform AGNES, a hierarchical clustering algorithm, on the above points. Please use the single-link method and adopt Euclidean distance as the dissimilarity measure.
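The single-link merging in 3c can be traced with a naive agglomerative loop like the one below, which repeatedly merges the two clusters whose closest members are nearest. This is a sketch, not an efficient implementation, and the names `link_distance` and `agnes_single_link` are assumptions. Many pairs in Table 4 are at distance exactly 1, so the order of merges at that height depends on tie-breaking; this code takes the first minimum found, which is an arbitrary choice.

```python
import math
from itertools import combinations

# Points from Table 4, keyed by index.
points = {
    1: (1, 3), 2: (1, 2), 3: (2, 1), 4: (2, 2), 5: (2, 3), 6: (3, 2), 7: (5, 3),
    8: (4, 3), 9: (4, 5), 10: (5, 4), 11: (5, 5), 12: (6, 4), 13: (6, 5),
}

def link_distance(a, b):
    """Single-link dissimilarity: smallest distance between members of a and b."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)

def agnes_single_link():
    """Return the merge history as (cluster_a, cluster_b, height) tuples."""
    clusters = [frozenset([i]) for i in points]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters (first minimum wins on ties).
        a, b = min(combinations(clusters, 2), key=lambda pair: link_distance(*pair))
        merges.append((sorted(a), sorted(b), link_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

merges = agnes_single_link()
for a, b, height in merges:
    print(a, b, round(height, 3))
```

The recorded heights are what a dendrogram would plot on its vertical axis.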
