Practice Problems for Midterm 2
CMSC 422, Fall 2017

1. Naive Bayes

(a) Suppose you have a classification problem in which each item has two features, and the features are not independent of each other. True or false: it cannot be the case that a naive Bayes classifier based on this data always does exactly the same thing as the Bayes optimal classifier. Explain your reasoning.

(b) Suppose I go to ten random pizzerias in New York and order a veggie supreme pizza. 9 of the pizzas are thin crust, and one is deep dish. 7 of the pizzas have mushrooms and three have spinach. Then I go to Chicago, visit ten random pizzerias, and order a veggie supreme pizza. 8 of the pizzas are deep dish, and two are thin crust. 8 of the pizzas have mushrooms, but only one has spinach. Two weeks later, I am kidnapped and find myself held captive. I don't know where I am, but from the sounds outside I can infer that I am either in New York or Chicago; either seems equally likely. My captors bring me a veggie supreme pizza. It is deep dish, with spinach and mushrooms. Using Naive Bayes, would I estimate that it is more likely that I am in New York or Chicago? What would I estimate is the probability that I am in Chicago?
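A minimal Python sketch of the Naive Bayes computation in 1(b), assuming the raw sample frequencies are used directly as the class-conditional probabilities (no smoothing) and a uniform prior over the two cities:

```python
# Naive Bayes check for 1(b): posterior over cities given a deep-dish
# veggie supreme with spinach and mushrooms.
p_ny = {"deep_dish": 1 / 10, "mushrooms": 7 / 10, "spinach": 3 / 10}
p_chi = {"deep_dish": 8 / 10, "mushrooms": 8 / 10, "spinach": 1 / 10}
prior = 0.5  # either city is equally likely

score_ny, score_chi = prior, prior
for f in ("deep_dish", "mushrooms", "spinach"):
    score_ny *= p_ny[f]     # multiply in each feature independently
    score_chi *= p_chi[f]   # (the naive Bayes assumption)

# Normalize the two unnormalized posteriors.
p_chicago = score_chi / (score_ny + score_chi)
print(f"P(Chicago | pizza) = {p_chicago:.3f}")
```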

2. Logistic Regression

Suppose we perform logistic regression for 1D data and wind up with parameters w = −1, b = 2. What probability will the logistic regression model assign to points with x = 1, 2, 3? For what value of x will the model say that the two classes are equally likely?
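A short sketch for problem 2, assuming the standard logistic model P(y = 1 | x) = sigma(wx + b):

```python
import math

def sigmoid(t: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))

w, b = -1.0, 2.0
for x in (1, 2, 3):
    # P(y = 1 | x) under the model sigma(w*x + b)
    print(f"x = {x}: P(y=1|x) = {sigmoid(w * x + b):.3f}")

# The classes are equally likely where sigma(w*x + b) = 0.5,
# i.e. where w*x + b = 0, so x = -b / w.
print("equal-probability point: x =", -b / w)
```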

3. Neural Networks

(a) Consider a neural network with one hidden layer, max pooling, two inputs, and no bias terms, for simplicity. There are two input units, two hidden units, and one output unit. Call their activations a11, a12, a21, a22, a3. We have four weights. The inputs are x1, x2. So we have:

a11 = x1
a12 = x2
a21 = w11 a11 + w12 a12
a22 = w21 a11 + w22 a12
a3 = max(a21, a22)

As a loss we use L = (y − a3)^2, where y is the label. If we initialize the weights as:

w11 = 1
w12 = 2
w21 = 3
w22 = 4

and have one training example with x1 = 1, x2 = 3, y = 18, what is the gradient of the loss?
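A sketch of the forward pass and gradient for 3(a), assuming the usual subgradient convention for max pooling, under which the gradient flows only through the winning hidden unit:

```python
# Forward pass with the given initialization and training example.
w11, w12, w21, w22 = 1.0, 2.0, 3.0, 4.0
x1, x2, y = 1.0, 3.0, 18.0

a11, a12 = x1, x2
a21 = w11 * a11 + w12 * a12   # 1*1 + 2*3 = 7
a22 = w21 * a11 + w22 * a12   # 3*1 + 4*3 = 15
a3 = max(a21, a22)            # max pooling picks a22 = 15

loss = (y - a3) ** 2          # (18 - 15)^2 = 9

# Backward pass: dL/da3 = -2(y - a3); the max routes it to a22 only,
# so the gradient with respect to w11 and w12 is zero.
dL_da3 = -2.0 * (y - a3)      # -6
grad = {
    "w11": 0.0,
    "w12": 0.0,
    "w21": dL_da3 * a11,      # -6 * 1 = -6
    "w22": dL_da3 * a12,      # -6 * 3 = -18
}
print(loss, grad)
```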

(b) What is the difference between a convolutional neural network and a regular neural network?

4. Linear Projection

(a) Projection doesn't have to be just for linear subspaces. You can also project points onto something that is not linear. Define a projection function f for 2D points (x, y) so that f(x, y) projects (x, y) onto the unit circle (the circle x^2 + y^2 = 1). Explain how such a projection should work, and give an expression for the function f.

(b) Give a matrix that will take a 2D point (x, y) and produce the closest point to (x, y) that lies on the line y = 2x.
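A sketch for problem 4, under the natural reading that projecting onto the unit circle means mapping a nonzero point to its nearest point on the circle, and using the standard rank-one projection matrix u u^T for part (b):

```python
import numpy as np

def project_to_unit_circle(p: np.ndarray) -> np.ndarray:
    """(a) Nearest point on x^2 + y^2 = 1: rescale p to unit length.
    Undefined at the origin, where all circle points are equally close."""
    return p / np.linalg.norm(p)

# (b) Orthogonal projection onto y = 2x: with unit direction
# u = (1, 2)/sqrt(5), the matrix is u u^T = (1/5)[[1, 2], [2, 4]].
u = np.array([1.0, 2.0]) / np.sqrt(5.0)
P = np.outer(u, u)

p = np.array([3.0, 1.0])              # an arbitrary test point
q = P @ p
print(project_to_unit_circle(p))      # lies on the unit circle
print(q, np.isclose(q[1], 2 * q[0]))  # q lies on the line y = 2x
```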

5. PCA

(a) Suppose we perform PCA on a set of 2D points in which all points have the form ai u, for a fixed vector u and for different values of ai. Show that when we perform PCA we will get u as the principal component. Do this two ways: look at the cost function that PCA minimizes, and use the scatter matrix of the data.

(b) Suppose we perform PCA with the points (2, 3), (−1, 2), (3, 0), (4, 1). What is the scatter matrix?

(c) Suppose we perform PCA with the points (1, 1), (1, 0), (−1, 0), (0, 1), (0, −1), (−1, −1). True or false: the principal component is in the direction (1/√2, 1/√2)? Explain your answer using detailed math. That is, don't just draw a picture and say that this answer looks right/wrong.
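A numerical check for 5(b), assuming the scatter matrix is the sum of outer products of the mean-centered points:

```python
import numpy as np

# Points from 5(b), one per row.
X = np.array([[2.0, 3.0], [-1.0, 2.0], [3.0, 0.0], [4.0, 1.0]])

Xc = X - X.mean(axis=0)   # center at the mean, here (2, 1.5)
S = Xc.T @ Xc             # scatter matrix: sum of outer products
print(S)

# The principal component is the eigenvector of S with the largest
# eigenvalue (np.linalg.eigh returns eigenvalues in ascending order).
vals, vecs = np.linalg.eigh(S)
print(vecs[:, -1])
```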


6. Kernels

(a) Suppose x = (x1, ..., xn) and z = (z1, z2, ..., zn). Consider the function K(x, z) = x1/z1 + x2/z2 + ... + xn/zn. Show that K cannot be used as a kernel. (A quick numerical check appears after problem 7.)

7. SVMs

(a) Suppose we have a point from class 1 with coordinates (4, 3) and a point from class −1 with coordinates (−2, 1). We have a linear separator described by the line y + x = 1. What is the margin of the separator?

(b) What is the max-margin separator for these two points?

(c) What is the margin of the max-margin separator?
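For 6(a), one place to start: a valid kernel must equal an inner product phi(x)·phi(z), so it must be symmetric in its arguments. A quick numerical sketch showing this K is not symmetric:

```python
def K(x, z):
    """Candidate kernel from 6(a): K(x, z) = x1/z1 + ... + xn/zn."""
    return sum(xi / zi for xi, zi in zip(x, z))

x, z = (1.0, 2.0), (2.0, 4.0)
print(K(x, z))  # 1/2 + 2/4 = 1.0
print(K(z, x))  # 2/1 + 4/2 = 4.0, so K(x, z) != K(z, x)
```

Since any inner product is symmetric, the asymmetry alone is enough to rule K out.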
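For problem 7, a sketch of the distance arithmetic, taking the margin of a separator to be the smaller of the two point-to-line distances and writing the separator as x + y − 1 = 0:

```python
import math

def dist_to_line(a: float, b: float, c: float, x: float, y: float) -> float:
    """Distance from the point (x, y) to the line a*x + b*y + c = 0."""
    return abs(a * x + b * y + c) / math.hypot(a, b)

pts = [(4.0, 3.0), (-2.0, 1.0)]   # the class 1 and class -1 points
dists = [dist_to_line(1.0, 1.0, -1.0, x, y) for x, y in pts]
print(dists, min(dists))          # margin of x + y = 1: the smaller distance

# For exactly two points, the max-margin separator is the perpendicular
# bisector of the segment joining them, and its margin is half the
# distance between the points.
(x1, y1), (x2, y2) = pts
print(math.hypot(x2 - x1, y2 - y1) / 2)
```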