STAT151A Code homework 4: Due March 8th

Author

Your name here

This coding assignment will use your work from homework 3 as a starting point. For the assignment, we’ll assume that

1 Variability in the training set

Fix $N = 500$, $P = 3$, and set $\beta$ to some values you choose. Set $\Sigma_X$ to have correlation 0.9 off the diagonal and 1.0 on the diagonal. Set $\sigma^2 = \beta^\top \Sigma_X \beta$.
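As a minimal sketch of this setup in R: the particular values in `beta` and the variable names `Sigma_X` and `sigma2` are illustrative assumptions, not part of the assignment.

```r
# Illustrative setup; the beta values are an arbitrary choice.
N <- 500
P <- 3
beta <- c(1, -2, 0.5)

# Covariance of the regressors: 0.9 off the diagonal, 1.0 on the diagonal.
Sigma_X <- matrix(0.9, nrow = P, ncol = P)
diag(Sigma_X) <- 1

# Residual variance sigma^2 = beta' Sigma_X beta.
sigma2 <- as.numeric(t(beta) %*% Sigma_X %*% beta)
```

Any other choice of `beta` works the same way; only `sigma2` changes with it.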

Take $x_{\mathrm{new}}$ to be a single fixed draw from the distribution of regressors, and draw a large number (> 5000) of $\varepsilon_{\mathrm{new},i}$, giving a large number of draws from $y_{\mathrm{new},i} \mid x_{\mathrm{new}}$. The $y_{\mathrm{new},i}$ should be normally distributed with mean $x_{\mathrm{new}}^\top \beta$ and variance $\sigma^2$.
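One way to generate these draws is sketched below. It assumes the regressors are zero-mean multivariate normal with covariance $\Sigma_X$ (adjust this if your homework 3 setup differs); the seed, the `beta` values, and the names `x_new`, `eps_new`, and `y_new` are illustrative assumptions.

```r
set.seed(151)                      # arbitrary seed, for reproducibility only
P <- 3
beta <- c(1, -2, 0.5)              # arbitrary illustrative values
Sigma_X <- matrix(0.9, P, P)
diag(Sigma_X) <- 1
sigma2 <- as.numeric(t(beta) %*% Sigma_X %*% beta)

# chol() returns an upper-triangular R with t(R) %*% R = Sigma_X, so
# t(R) %*% z has covariance Sigma_X when z ~ N(0, I).
# Assumption: regressors are N(0, Sigma_X).
x_new <- as.numeric(t(chol(Sigma_X)) %*% rnorm(P))

# Many draws of y_new,i | x_new with mean x_new' beta and variance sigma2.
n_new <- 10000
eps_new <- rnorm(n_new, mean = 0, sd = sqrt(sigma2))
y_new <- sum(x_new * beta) + eps_new
```

A quick sanity check is to compare `mean(y_new)` with `sum(x_new * beta)` and `var(y_new)` with `sigma2`.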

(a)

Draw a single training set $X, Y$, and use it to construct an 80% interval for $y_{\mathrm{new}}$. Find the proportion of the $y_{\mathrm{new},i}$ that lie in the interval.
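The sketch below shows one possible shape for this computation. It uses the standard prediction interval from `predict()` on an `lm` fit without an intercept, which is an assumption and may differ from the interval you built in homework 3; the seed, `beta` values, and variable names are likewise illustrative.

```r
set.seed(151)
N <- 500
P <- 3
beta <- c(1, -2, 0.5)                       # arbitrary illustrative values
Sigma_X <- matrix(0.9, P, P)
diag(Sigma_X) <- 1
sigma2 <- as.numeric(t(beta) %*% Sigma_X %*% beta)
R <- chol(Sigma_X)                          # t(R) %*% R = Sigma_X

# Fixed x_new and many draws of y_new | x_new (assumes N(0, Sigma_X) regressors).
x_new <- as.numeric(t(R) %*% rnorm(P))
y_new <- sum(x_new * beta) + rnorm(10000, sd = sqrt(sigma2))

# One training set: rows of X are N(0, Sigma_X), Y = X beta + eps.
X <- matrix(rnorm(N * P), N, P) %*% R
colnames(X) <- paste0("x", seq_len(P))
Y <- as.numeric(X %*% beta) + rnorm(N, sd = sqrt(sigma2))

# Fit without an intercept, since the mean of y is x' beta.
train <- data.frame(y = Y, X)
fit <- lm(y ~ 0 + ., data = train)

# 80% prediction interval at x_new.
new_df <- as.data.frame(matrix(x_new, nrow = 1,
                               dimnames = list(NULL, colnames(X))))
interval <- predict(fit, newdata = new_df,
                    interval = "prediction", level = 0.8)

# Proportion of the y_new draws that fall inside the interval.
coverage <- mean(y_new >= interval[, "lwr"] & y_new <= interval[, "upr"])
```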

(b)

Repeat (a), but with 10 different training sets. You can keep the $y_{\mathrm{new},i}$ the same. For each different training set, plot the corresponding intervals. Are they different from one another? By a lot or a little?
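A possible way to organize the repetition is sketched here. The helper `one_interval()` is hypothetical, and the sketch again assumes zero-mean normal regressors and the `lm` prediction interval; the plot uses base graphics.

```r
set.seed(151)
N <- 500
P <- 3
beta <- c(1, -2, 0.5)                       # arbitrary illustrative values
Sigma_X <- matrix(0.9, P, P)
diag(Sigma_X) <- 1
sigma2 <- as.numeric(t(beta) %*% Sigma_X %*% beta)
R <- chol(Sigma_X)
x_new <- as.numeric(t(R) %*% rnorm(P))

# Hypothetical helper: draw one training set of size n and return the
# fitted value and 80% prediction interval at x_new.
one_interval <- function(n) {
  X <- matrix(rnorm(n * P), n, P) %*% R
  colnames(X) <- paste0("x", seq_len(P))
  y <- as.numeric(X %*% beta) + rnorm(n, sd = sqrt(sigma2))
  fit <- lm(y ~ 0 + ., data = data.frame(y = y, X))
  new_df <- as.data.frame(matrix(x_new, nrow = 1,
                                 dimnames = list(NULL, colnames(X))))
  predict(fit, newdata = new_df, interval = "prediction", level = 0.8)[1, ]
}

# 10 x 3 matrix with columns fit, lwr, upr (one row per training set).
intervals <- t(replicate(10, one_interval(N)))

# One vertical segment per training set, with the point prediction marked.
plot(NULL, xlim = c(1, 10), ylim = range(intervals),
     xlab = "Training set", ylab = "80% interval for y_new")
segments(x0 = 1:10, y0 = intervals[, "lwr"], y1 = intervals[, "upr"])
points(1:10, intervals[, "fit"], pch = 19)
```

Because the helper takes the training-set size as an argument, the same code can be reused with a different `n`.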

(c)

Repeat (b), but with $N = 20$. You can keep the $y_{\mathrm{new},i}$ the same. How do the results compare to (b)? Why?

(d)

Repeat (b), but with $\sigma$ very small: specifically, set $\sigma^2 = 0.01 \beta^\top \Sigma_X \beta$. You will need to draw new $y_{\mathrm{new},i}$. How do the results compare to (b)?

(e)

Repeat (b), but now take $x_{\mathrm{new}}$ to be the eigenvector of $\Sigma_X$ with the smallest eigenvalue. (You can find this eigenvector using the R function `eigen`.) You will need to draw new $y_{\mathrm{new},i}$. How do the results compare to (b)?
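A short sketch of the `eigen` call follows; the names `eig` and `x_new` are illustrative. For a symmetric matrix, `eigen` returns the eigenvalues in decreasing order, so the last column of `$vectors` corresponds to the smallest eigenvalue.

```r
P <- 3
Sigma_X <- matrix(0.9, P, P)
diag(Sigma_X) <- 1

eig <- eigen(Sigma_X, symmetric = TRUE)

# Eigenvalues are sorted in decreasing order, so the last column of
# eig$vectors is a unit eigenvector for the smallest eigenvalue.
x_new <- eig$vectors[, P]
```

For this particular $\Sigma_X$ the smallest eigenvalue is repeated, so the eigenvector is not unique: any unit vector orthogonal to $(1, 1, 1)$ is a valid choice, and `eigen` returns one of them.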