$$
\newcommand{\mybold}[1]{\boldsymbol{#1}}
\newcommand{\trans}{\intercal}
\newcommand{\norm}[1]{\left\Vert#1\right\Vert}
\newcommand{\abs}[1]{\left|#1\right|}
\newcommand{\bbr}{\mathbb{R}}
\newcommand{\bbz}{\mathbb{Z}}
\newcommand{\bbc}{\mathbb{C}}
\newcommand{\gauss}[1]{\mathcal{N}\left(#1\right)}
\newcommand{\chisq}[1]{\chi^2_{#1}}
\newcommand{\studentt}[1]{\mathrm{StudentT}_{#1}}
\newcommand{\fdist}[2]{\mathrm{FDist}_{#1,#2}}
\newcommand{\argmin}[1]{\underset{#1}{\mathrm{argmin}}\,}
\newcommand{\projop}[1]{\underset{#1}{\mathrm{Proj}}\,}
\newcommand{\proj}[1]{\underset{#1}{\mybold{P}}}
\newcommand{\expect}[1]{\mathbb{E}\left[#1\right]}
\newcommand{\prob}[1]{\mathbb{P}\left(#1\right)}
\newcommand{\dens}[1]{\mathit{p}\left(#1\right)}
\newcommand{\var}[1]{\mathrm{Var}\left(#1\right)}
\newcommand{\cov}[1]{\mathrm{Cov}\left(#1\right)}
\newcommand{\sumn}{\sum_{n=1}^N}
\newcommand{\meann}{\frac{1}{N} \sumn}
\newcommand{\cltn}{\frac{1}{\sqrt{N}} \sumn}
\newcommand{\trace}[1]{\mathrm{trace}\left(#1\right)}
\newcommand{\diag}[1]{\mathrm{Diag}\left(#1\right)}
\newcommand{\grad}[2]{\nabla_{#1} \left. #2 \right.}
\newcommand{\gradat}[3]{\nabla_{#1} \left. #2 \right|_{#3}}
\newcommand{\fracat}[3]{\left. \frac{#1}{#2} \right|_{#3}}
\newcommand{\W}{\mybold{W}}
\newcommand{\w}{w}
\newcommand{\wbar}{\bar{w}}
\newcommand{\wv}{\mybold{w}}
\newcommand{\X}{\mybold{X}}
\newcommand{\x}{x}
\newcommand{\xbar}{\bar{x}}
\newcommand{\xv}{\mybold{x}}
\newcommand{\Xcov}{\Sigmam_{\X}}
\newcommand{\Xcovhat}{\hat{\Sigmam}_{\X}}
\newcommand{\Covsand}{\Sigmam_{\mathrm{sand}}}
\newcommand{\Covsandhat}{\hat{\Sigmam}_{\mathrm{sand}}}
\newcommand{\Z}{\mybold{Z}}
\newcommand{\z}{z}
\newcommand{\zv}{\mybold{z}}
\newcommand{\zbar}{\bar{z}}
\newcommand{\Y}{\mybold{Y}}
\newcommand{\Yhat}{\hat{\Y}}
\newcommand{\y}{y}
\newcommand{\yv}{\mybold{y}}
\newcommand{\yhat}{\hat{\y}}
\newcommand{\ybar}{\bar{y}}
\newcommand{\res}{\varepsilon}
\newcommand{\resv}{\mybold{\res}}
\newcommand{\resvhat}{\hat{\mybold{\res}}}
\newcommand{\reshat}{\hat{\res}}
\newcommand{\betav}{\mybold{\beta}}
\newcommand{\betavhat}{\hat{\betav}}
\newcommand{\betahat}{\hat{\beta}}
\newcommand{\betastar}{{\beta^{*}}}
\newcommand{\bv}{\mybold{b}}
\newcommand{\bvhat}{\hat{\bv}}
\newcommand{\alphav}{\mybold{\alpha}}
\newcommand{\alphavhat}{\hat{\alphav}}
\newcommand{\alphahat}{\hat{\alpha}}
\newcommand{\omegav}{\mybold{\omega}}
\newcommand{\gv}{\mybold{\gamma}}
\newcommand{\gvhat}{\hat{\gv}}
\newcommand{\ghat}{\hat{\gamma}}
\newcommand{\hv}{\mybold{h}}
\newcommand{\hvhat}{\hat{\hv}}
\newcommand{\hhat}{\hat{h}}
\newcommand{\gammav}{\mybold{\gamma}}
\newcommand{\gammavhat}{\hat{\gammav}}
\newcommand{\gammahat}{\hat{\gamma}}
\newcommand{\new}{\mathrm{new}}
\newcommand{\zerov}{\mybold{0}}
\newcommand{\onev}{\mybold{1}}
\newcommand{\id}{\mybold{I}}
\newcommand{\sigmahat}{\hat{\sigma}}
\newcommand{\etav}{\mybold{\eta}}
\newcommand{\muv}{\mybold{\mu}}
\newcommand{\Sigmam}{\mybold{\Sigma}}
\newcommand{\rdom}[1]{\mathbb{R}^{#1}}
\newcommand{\RV}[1]{\tilde{#1}}
\def\A{\mybold{A}}
\def\av{\mybold{a}}
\def\a{a}
\def\B{\mybold{B}}
\def\S{\mybold{S}}
\def\sv{\mybold{s}}
\def\s{s}
\def\R{\mybold{R}}
\def\rv{\mybold{r}}
\def\r{r}
\def\V{\mybold{V}}
\def\vv{\mybold{v}}
\def\v{v}
\def\U{\mybold{U}}
\def\uv{\mybold{u}}
\def\u{u}
\def\tv{\mybold{t}}
\def\t{t}
\def\Sc{\mathcal{S}}
\def\ev{\mybold{e}}
\def\Lammat{\mybold{\Lambda}}
$$
This coding assignment will use your work from homework 3 as a starting point. For the assignment, we’ll assume that
- \(\y_n = \betav^\trans \xv_n + \res_n\) for all \(n\), including new datapoints
- \(\res_n \sim \gauss{0,\sigma^2}\)
- The regressors \(\xv_n\) are also random with covariance matrix \(\Xcov\).
## Variability in the training set
Fix \(N = 500\), \(P = 3\), and set \(\betav\) to some values you choose. Set \(\Xcov\) to have correlation \(0.9\) off the diagonal and \(1.0\) on the diagonal. Set \(\sigma^2 = \betav^\trans \Xcov \betav\).
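As a concrete starting point, here is one way to set this up in R. This is a sketch under my own assumptions: the particular `beta` values, the zero mean for the regressors, and drawing Gaussian regressors with `MASS::mvrnorm` are choices, not requirements.

```r
library(MASS)  # for mvrnorm

set.seed(42)
N <- 500
P <- 3
beta <- c(1, -2, 0.5)  # any values you choose

# Regressor covariance: 1.0 on the diagonal, 0.9 off the diagonal
x_cov <- matrix(0.9, nrow = P, ncol = P)
diag(x_cov) <- 1.0

# Residual variance sigma^2 = beta' Sigma_X beta
sigma2 <- as.numeric(t(beta) %*% x_cov %*% beta)

# One training set from the assumed model (zero-mean Gaussian regressors assumed)
x <- mvrnorm(N, mu = rep(0, P), Sigma = x_cov)
y <- as.numeric(x %*% beta) + rnorm(N, sd = sqrt(sigma2))
```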
Take \(\xv_\new\) to be a single fixed draw from the distribution of the regressors, and draw a large number (> 5000) of residuals \(\res_{\new,i}\), giving a large number of draws \(\y_{\new,i} = \xv_\new^\trans \betav + \res_{\new,i}\) from the conditional distribution of \(\y_\new \mid \xv_\new\). The \(\y_{\new,i}\) should be normally distributed with mean \(\xv_\new^\trans \betav\) and variance \(\sigma^2\).
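Continuing the sketch above, the fixed \(\xv_\new\) and the conditional draws might look like the following; `n_draws = 10000` is an arbitrary choice above the 5000 minimum.

```r
n_draws <- 10000  # anything > 5000

# One fixed draw of the new regressor
x_new <- mvrnorm(1, mu = rep(0, P), Sigma = x_cov)

# Many draws of y_new | x_new: fixed mean x_new' beta, fresh residuals
y_new <- sum(x_new * beta) + rnorm(n_draws, sd = sqrt(sigma2))

# Sanity check: sample mean and variance should be near x_new' beta and sigma2
c(mean(y_new) - sum(x_new * beta), var(y_new) - sigma2)
```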
### (a)
Draw a single training set \(\X\), \(\Y\), and use it to construct an 80% prediction interval for \(\y_\new\). Find the proportion of the \(\y_{\new,i}\) that lie in the interval.
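Here is a minimal sketch of one way to do this, assuming the standard OLS prediction interval with no intercept (matching the model above); if your homework 3 interval differs, substitute it.

```r
# Fit OLS without an intercept, matching the assumed model
fit <- lm(y ~ x - 1)
sigma2_hat <- sum(resid(fit)^2) / (N - P)

# Predictive variance at x_new: estimation error plus residual noise
pred_var <- sigma2_hat * (1 + t(x_new) %*% solve(t(x) %*% x, x_new))
y_hat_new <- sum(x_new * coef(fit))

# An 80% interval leaves 10% in each tail
t_crit <- qt(0.9, df = N - P)
interval <- y_hat_new + c(-1, 1) * t_crit * sqrt(as.numeric(pred_var))

# Proportion of the y_new draws that fall inside the interval
mean(y_new >= interval[1] & y_new <= interval[2])
```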
### (b)
Repeat (a), but with 10 different training sets; you can keep the \(\y_{\new,i}\) the same. Plot the interval from each training set. Are the intervals different from one another? By a lot or a little?
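One way to organize this part, and to reuse it in (c) through (e), is to wrap the interval construction in a function of the quantities that change. The function name `interval_for_training_set` and the base-graphics plot are my own choices, not required:

```r
interval_for_training_set <- function(n_obs, sigma2, x_new) {
  # Draw a fresh training set (using beta, x_cov, P from the setup above)
  # and return the 80% prediction interval at x_new
  x <- mvrnorm(n_obs, mu = rep(0, P), Sigma = x_cov)
  y <- as.numeric(x %*% beta) + rnorm(n_obs, sd = sqrt(sigma2))
  fit <- lm(y ~ x - 1)
  sigma2_hat <- sum(resid(fit)^2) / (n_obs - P)
  pred_var <- sigma2_hat * (1 + t(x_new) %*% solve(t(x) %*% x, x_new))
  y_hat_new <- sum(x_new * coef(fit))
  y_hat_new + c(-1, 1) * qt(0.9, df = n_obs - P) * sqrt(as.numeric(pred_var))
}

# Ten training sets, keeping the same y_new draws
intervals <- t(replicate(10, interval_for_training_set(N, sigma2, x_new)))

# One horizontal segment per training set
plot(range(intervals), c(1, 10), type = "n",
     xlab = "y_new", ylab = "training set")
segments(intervals[, 1], 1:10, intervals[, 2], 1:10)
```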
### (c)
Repeat (b), but with \(N = 20\). You can keep the \(\y_{\new,i}\) the same. How do the results compare to (b)? Why?
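Assuming the hypothetical `interval_for_training_set` sketched in (b), this part is a single changed argument:

```r
# Same sigma2 and x_new, but only 20 training observations per set
intervals_n20 <- t(replicate(10, interval_for_training_set(20, sigma2, x_new)))
```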
### (d)
Repeat (b), but with \(\sigma\) very small: specifically, set \(\sigma^2 = 0.01 \betav^\trans \Xcov \betav\). You will need to draw new \(\y_{\new,i}\). How do the results compare to (b)?
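Again assuming the sketched function, only the residual variance and the \(\y_{\new,i}\) change:

```r
# Shrink the residual variance and redraw the conditional y_new
sigma2_small <- 0.01 * as.numeric(t(beta) %*% x_cov %*% beta)
y_new_small <- sum(x_new * beta) + rnorm(n_draws, sd = sqrt(sigma2_small))
intervals_small <- t(replicate(10, interval_for_training_set(N, sigma2_small, x_new)))
```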
### (e)
Repeat (b), but now take \(\xv_\new\) to be the eigenvector of \(\Xcov\) corresponding to its smallest eigenvalue. (You can find the eigenvalues and eigenvectors of \(\Xcov\) using the R function `eigen`.) You will need to draw new \(\y_{\new,i}\). How do the results compare to (b)?
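A sketch of this part: `eigen` returns eigenvalues in decreasing order, so the last column of `$vectors` corresponds to the smallest eigenvalue.

```r
eig <- eigen(x_cov)
x_new_min <- eig$vectors[, P]  # eigenvector of the smallest eigenvalue

# New conditional draws at this x_new, then the same ten-interval experiment
y_new_min <- sum(x_new_min * beta) + rnorm(n_draws, sd = sqrt(sigma2))
intervals_min <- t(replicate(10, interval_for_training_set(N, sigma2, x_new_min)))
```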