Title | Computational Statistics & Data Mining - Tutorial 4
---|---
Course | Computational Statistics & Data Mining
Institution | University of Melbourne
MAST90083
Computational Statistics & Data Mining
Linear Regression
Tutorial & Practical 4: Model Selection

Question 1

In this question we are interested in deriving an algorithm for solving the Lasso. Consider the model $y = X\beta + \epsilon$, where $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times p}$ and $\epsilon \in \mathbb{R}^n \sim N(0, \sigma^2 I_n)$. Let $\hat{\beta}$ be the estimate of $\beta$ obtained by least-squares estimation.

1. For a scalar $y \in \mathbb{R}$, find the solution $u \in \mathbb{R}$ that minimizes $(y - u)^2 + \lambda |u|$.
2. Plot the solution as a function of $y$.
3. Use this solution to derive an algorithm for solving $\min_{\beta} \|y - X\beta\|^2 + \lambda \|\beta\|_1$.
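A sketch of where these parts lead (not the worked solution): part 1 yields the soft-thresholding operator, and cycling it over coordinates gives a coordinate-descent Lasso solver for part 3. The function names below are illustrative, and the threshold is $\lambda/2$ because the quadratic term here is not halved.

```python
import numpy as np

def soft_threshold(y, lam):
    """Minimizer of (y - u)^2 + lam*|u|.

    Setting the subgradient to zero gives u = sign(y) * max(|y| - lam/2, 0);
    the threshold is lam/2 since the squared term carries no 1/2 factor.
    """
    return np.sign(y) * np.maximum(np.abs(y) - lam / 2.0, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for min_beta ||y - X beta||^2 + lam * ||beta||_1.

    Each coordinate update is a scalar problem of the part-1 form, solved
    by soft thresholding the partial-residual correlation.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)  # ||x_j||^2 for each column
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta
```

With an orthonormal design the update decouples and one pass suffices; for general $X$ the cycle is repeated until the coefficients stabilize.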
Question 2

Let $Y = (y_1, \ldots, y_n)^\top$ be an $n \times q$ matrix of observations for which we postulate the parametric model $Y = XB + \varepsilon$, where $\mathrm{vec}(\varepsilon) \sim N(0, \Sigma \otimes I_n)$, $X$ is a known $n \times k$ design matrix of rank $k$, $B$ is a $k \times q$ matrix of unknown parameters, $\varepsilon$ is the $n \times q$ matrix of errors, and $\Sigma$ is the $q \times q$ error covariance matrix.

1. Give the expression of the log-likelihood.
2. Find the number of parameters involved in the model.
3. Derive the expressions of AIC and BIC.
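For a numerical check of the three parts, the standard answers can be computed directly: the maximized (profile) log-likelihood is $-\tfrac{n}{2}(q\log 2\pi + \log|\hat\Sigma| + q)$ with $\hat\Sigma = \hat\varepsilon^\top\hat\varepsilon/n$, the parameter count is $kq$ entries of $B$ plus $q(q+1)/2$ for the symmetric $\Sigma$, and AIC/BIC follow from their usual definitions. This is a sketch under those standard formulas, not the derivation itself:

```python
import numpy as np

def multivariate_aic_bic(Y, X):
    """AIC and BIC for Y = XB + eps, vec(eps) ~ N(0, Sigma kron I_n),
    evaluated at the maximum-likelihood estimates."""
    n, q = Y.shape
    k = X.shape[1]
    B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]   # MLE of B (least squares)
    E = Y - X @ B_hat
    Sigma_hat = E.T @ E / n                        # MLE of Sigma
    _, logdet = np.linalg.slogdet(Sigma_hat)
    # maximized log-likelihood: -(n/2) * (q*log(2*pi) + log|Sigma_hat| + q)
    loglik = -0.5 * n * (q * np.log(2 * np.pi) + logdet + q)
    d = k * q + q * (q + 1) // 2                   # B entries + symmetric Sigma
    aic = -2 * loglik + 2 * d
    bic = -2 * loglik + d * np.log(n)
    return loglik, d, aic, bic
```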
Question 3

Let $y \in \mathbb{R}^n$ be a vector of observations for which we postulate the linear model $y = X\beta + \epsilon$, where $X \in \mathbb{R}^{n \times k}$, $\beta \in \mathbb{R}^k$ and $\epsilon \in \mathbb{R}^n \sim N(0, \sigma^2 I_n)$. The dimension $k$ of $\beta$ is estimated using AIC.

1. Give the form of the AIC criterion.
2. Derive the expression of the probability of overfitting.
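Part 2 rests on a standard fact about nested Gaussian linear models: when the smaller model is true and the larger one has $m$ superfluous parameters, the drop in $-2\log$-likelihood is asymptotically $\chi^2_m$, while AIC's penalty gap is $2m$. AIC therefore overfits when $\chi^2_m > 2m$. A sketch of that tail probability (the helper name is illustrative):

```python
from scipy.stats import chi2

def p_overfit_aic(m=1):
    """P(AIC prefers a model with m superfluous parameters over the true one).

    Under the true (smaller) model, the likelihood-ratio statistic is
    asymptotically chi2(m); AIC picks the larger model exactly when that
    statistic exceeds the penalty difference 2m.
    """
    return chi2.sf(2 * m, df=m)
```

For a single extra parameter this gives $P(\chi^2_1 > 2) \approx 0.157$, which is why AIC is known not to be a consistent selector: the overfitting probability does not vanish as $n \to \infty$.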