Description
Problem 1: Recall the naive Bayes model.
- For simplicity, we only consider binary features
- The generative model is: draw the label y, then draw each binary feature x_j independently from a Bernoulli distribution whose parameter depends on y, so that Pr[y, x_1, ..., x_d] = Pr[y] * prod_j Pr[x_j | y].
Here, a Bernoulli distribution parameterized by mu means that Pr[x = 1] = mu and Pr[x = 0] = 1 - mu. It is a special case of the categorical distribution in which only two outcomes are considered.
- Such a model can be used to represent a document in text classification. For example, the target label indicates Spam or NotSpam, and each feature indicates whether a particular word in the vocabulary occurs in the document.
Show that the decision boundary of naive Bayes is also linear.
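As a numeric sanity check (not the requested proof), the sketch below evaluates the log-odds log Pr[y=1|x] - log Pr[y=0|x] of a Bernoulli naive Bayes model and confirms it agrees with an affine function w.x + b, so the decision boundary {x : w.x + b = 0} is linear. All parameter values here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
mu1 = rng.uniform(0.1, 0.9, d)   # Pr[x_j = 1 | y = 1]
mu0 = rng.uniform(0.1, 0.9, d)   # Pr[x_j = 1 | y = 0]
prior1 = 0.3                     # Pr[y = 1]

def log_odds(x):
    """Log posterior ratio of the two classes under naive Bayes."""
    ll1 = np.log(prior1) + np.sum(x * np.log(mu1) + (1 - x) * np.log(1 - mu1))
    ll0 = np.log(1 - prior1) + np.sum(x * np.log(mu0) + (1 - x) * np.log(1 - mu0))
    return ll1 - ll0

# Affine coefficients implied by the factorized Bernoulli likelihood.
w = np.log(mu1 / mu0) - np.log((1 - mu1) / (1 - mu0))
b = np.log(prior1 / (1 - prior1)) + np.sum(np.log((1 - mu1) / (1 - mu0)))

# The log-odds matches w.x + b on every binary input we try.
for _ in range(100):
    x = rng.integers(0, 2, d).astype(float)
    assert np.isclose(log_odds(x), w @ x + b)
```

The proof the problem asks for is the algebraic version of this check: expand the log-odds and collect the terms that multiply each x_j.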
Problem 2: Give a set of parameters for the stack of logistic regression models below that accomplishes the non-linear classification.
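The forward pass of such a stack can be sketched as follows. The two-hidden-unit, one-output architecture and all parameter values below are placeholder assumptions for illustration; finding parameters that actually solve the non-linear classification is what the problem asks for.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stack_predict(x, W1, b1, w2, b2):
    # First layer: two logistic regression units on the raw features.
    h = sigmoid(W1 @ x + b1)
    # Second layer: one logistic regression unit on the first layer's outputs.
    return sigmoid(w2 @ h + b2)

# Placeholder parameters, NOT a solution to the problem.
W1 = np.array([[1.0, 1.0], [-1.0, -1.0]])
b1 = np.array([0.0, 0.0])
w2 = np.array([1.0, 1.0])
b2 = 0.0

print(stack_predict(np.array([1.0, 0.0]), W1, b1, w2, b2))
```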
Problem 3: Use the above example to show that the optimization of neural networks is non-convex.
Hint: As mentioned in class, you can rename w_1 as w_2 and w_2 as w_1. Then you will have two models. Interpolating them will give you a very bad model.
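The hint can be sketched numerically: permuting the hidden units of a one-hidden-layer network (renaming the two weight vectors) leaves the computed function unchanged, yet the midpoint of the two equivalent weight settings computes a different function, so the set of equally good solutions is not convex. The architecture and parameter values below are arbitrary illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net(x, W1, w2):
    # One hidden layer of logistic units feeding a logistic output unit.
    return sigmoid(w2 @ sigmoid(W1 @ x))

W1 = np.array([[1.0, -2.0], [3.0, 0.5]])
w2 = np.array([2.0, -1.0])

# Swap the two hidden units: rename w_1 as w_2 and w_2 as w_1.
P = np.array([[0.0, 1.0], [1.0, 0.0]])
W1p, w2p = P @ W1, w2 @ P.T   # permuted parameters compute the same function

x = np.array([0.7, -0.3])
assert np.isclose(net(x, W1, w2), net(x, W1p, w2p))

# Interpolating the two equivalent solutions changes the function.
Wm, wm = (W1 + W1p) / 2, (w2 + w2p) / 2
print(net(x, W1, w2), net(x, Wm, wm))  # generally different values
```

If the training loss were convex, the average of two minimizers would also be a minimizer; the mismatch above is the numeric counterpart of the argument the problem asks for.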



