Multiple Choice Question Review
Stat 151A: Linear Models
Multiple choice questions
Below, you will find the multiple choice questions from the quizzes, but not the candidate responses.
I encourage you to try to think of correct (and incorrect) answers for yourself as a way of preparing for the final exam!
Questions
Quiz 0
- If a square, symmetric matrix \(\boldsymbol{A}\) is invertible, then…
- Let \(\boldsymbol{X}\) denote a full-rank \(N \times P\) matrix with \(N > P\). Then…
- Suppose that \(\boldsymbol{u}\) and \(\boldsymbol{v}\) are orthogonal \(N\)-vectors with \(N \ge 2\). Then…
- Let \(\boldsymbol{X}\) denote an \(N \times P\) matrix, with \(P < N\), \(\boldsymbol{\beta}\) a nonzero \(P\)-vector, and \(\boldsymbol{Y}\) an \(N\)-vector. Which of these matrix expressions is not well-defined?
- Suppose that \(x_n\) are IID Gaussian with mean \(\mathbb{E}\left[x_n\right] = 2\) and \(\mathrm{Var}\left(x_n\right) = 9\). Which of the following expressions diverges (goes to positive or negative infinity) as \(N \rightarrow \infty\)?
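The last question above is easy to probe numerically. A minimal simulation sketch (the sample size and variable names are my own, not from the quiz): draw IID \(\mathcal{N}(2, 9)\) variables and compare the sample mean, which the LLN pins near 2, with the raw sum, which grows on the order of \(2N\).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
x = rng.normal(loc=2.0, scale=3.0, size=N)  # mean 2, variance 9

# The sample mean converges to E[x_n] = 2 by the LLN...
print(np.mean(x))
# ...while the raw sum grows without bound, on the order of 2 * N.
print(np.sum(x))
# By the CLT, sqrt(N) * (mean - 2) stays stochastically bounded.
print(np.sqrt(N) * (np.mean(x) - 2.0))
```

Re-running with larger \(N\) makes the contrast sharper: the first and third quantities stabilize, while the second keeps growing.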
Quiz 1
- Suppose that \(\boldsymbol{x}_n = \boldsymbol{A}\boldsymbol{z}_n\) for all \(n\), where \(\boldsymbol{A}\) is invertible but not symmetric. For the regressions \(y\sim \boldsymbol{\beta}^\intercal\boldsymbol{x}\) and \(y\sim \boldsymbol{\gamma}^\intercal\boldsymbol{z}\), it is necessarily true that…
- Which of the following is never a well-justified reason to choose \(\hat{\boldsymbol{\beta}}\) by minimizing the mean of squared errors \(\frac{1}{N} \sum_{n=1}^N(y_n - \boldsymbol{x}_n^\intercal\boldsymbol{\beta})^2\)?
- Let \(\underset{\boldsymbol{X}}{\boldsymbol{P}} = \boldsymbol{X}(\boldsymbol{X}^\intercal\boldsymbol{X})^{-1} \boldsymbol{X}^\intercal\) denote the projection matrix onto the columns of \(\boldsymbol{X}\), where \(\boldsymbol{X}\) is full rank. Which of the following expressions is false?
- Suppose that \(x_{n1}\) and \(x_{n2}\) are mean zero and very highly (but not perfectly) positively correlated, and we regress \(y_n \sim \beta_1 x_{n1} + \beta_2 x_{n2}\). Then…
- Let \(\boldsymbol{S}_1, \ldots, \boldsymbol{S}_K\) denote a partition of the space of regressors, so that for each \(n\), \(\boldsymbol{x}_n \in \boldsymbol{S}_k\) for exactly one \(k\). Let \(z_{nk} = \mathbb{I}\left(\boldsymbol{x}_n \in \boldsymbol{S}_k\right)\), and consider regressing \(y_n \sim \beta_1 z_{n1} + \ldots + \beta_K z_{nK}\), without a constant. Assume that each \(\boldsymbol{S}_k\) has at least one observation in it. Then, conditionally on the regressors…
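The partition question can be checked directly. Here is a hedged sketch (the three-group design and data are invented for illustration): regressing \(y\) on a full set of partition indicators with no constant recovers the within-group sample means as the OLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
means = np.array([1.0, 5.0, -2.0])
groups = np.repeat([0, 1, 2], 50)          # which S_k each observation falls in
y = means[groups] + rng.normal(size=groups.size)

# Indicator regressors z_{nk} = 1(x_n in S_k); no constant column.
Z = np.eye(3)[groups]                      # 150 x 3 one-hot design matrix

# OLS: beta_hat = (Z'Z)^{-1} Z'y. With indicator regressors, Z'Z is
# diagonal (the group counts), so beta_hat is the vector of group means.
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)
group_means = np.array([y[groups == k].mean() for k in range(3)])
print(beta_hat)
print(group_means)
```

The two printed vectors agree up to floating-point error, which is one way to remember what this regression "does" conditionally on the regressors.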
Quiz 2
For the multiple choice questions, recall our four classes of assumptions:
- The normal assumption: \(y_n \vert \boldsymbol{x}_n \sim \mathcal{N}\left({\boldsymbol{\beta}^{*}}^\intercal\boldsymbol{x}_n, \sigma^2\right)\).
- The homoskedastic assumption: \(\varepsilon_n = y_n - {\boldsymbol{\beta}^{*}}^\intercal\boldsymbol{x}_n\), with \(\mathbb{E}\left[\varepsilon_n \vert \boldsymbol{x}_n\right] = 0\) and \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right) = \sigma^2\).
- The heteroskedastic assumption: Like (B), but \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right) = \sigma_n^2\), where \(\sigma_n\) is some function of \(\boldsymbol{x}_n\) that is different for different \(\boldsymbol{x}_n\).
- The machine learning assumption: \((\boldsymbol{x}_n, y_n)\) are IID.
For each assumption, also assume that the pairs \((\boldsymbol{x}_n, y_n)\) are IID, that \(\boldsymbol{X}\) is full rank, and that all conditions needed to apply the CLT and LLN hold, as in the lecture notes.
Recall that \(\hat{\sigma}^2 = \frac{1}{N - P} \sum_{n=1}^N(y_n - \hat{\boldsymbol{\beta}}^\intercal\boldsymbol{x}_n)^2\).
- Under the normal assumption (A), under the null, the standard t-statistic based on \(\hat{\sigma}\)…
- Under the homoskedastic assumption (B), under the null, the standard t-statistic based on \(\hat{\sigma}\)…
- Under the heteroskedastic assumption (C), under the null, the standard t-statistic based on the sandwich covariance…
- Under the heteroskedastic assumption (C), inference that is incorrectly based on the homoskedastic assumption (B) will, in general…
- Let \({\boldsymbol{\beta}^{*}}:= \mathbb{E}\left[\boldsymbol{x}\boldsymbol{x}^\intercal\right]^{-1} \mathbb{E}\left[\boldsymbol{x}y\right]\). Under the machine learning assumption (D)…
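To build intuition for these questions, here is a hedged simulation sketch (the heteroskedastic design is my own, not from the quiz): fit OLS on data where \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right)\) grows with \(\vert x_n \vert\), then compare the classical standard errors, based on \(\hat{\sigma}^2 (\boldsymbol{X}^\intercal\boldsymbol{X})^{-1}\), with sandwich (HC0) standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
# Heteroskedastic errors: the conditional sd grows with |x_n|.
eps = rng.normal(size=N) * (0.5 + np.abs(x))
y = 1.0 + 2.0 * x + eps

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Classical (homoskedastic) covariance: sigma_hat^2 (X'X)^{-1},
# with sigma_hat^2 = sum(resid^2) / (N - P).
P = X.shape[1]
sigma2_hat = resid @ resid / (N - P)
cov_classical = sigma2_hat * XtX_inv

# Sandwich (HC0) covariance: (X'X)^{-1} X' diag(resid^2) X (X'X)^{-1}.
meat = X.T @ (X * resid[:, None] ** 2)
cov_sandwich = XtX_inv @ meat @ XtX_inv

print(np.sqrt(np.diag(cov_classical)))   # classical SEs
print(np.sqrt(np.diag(cov_sandwich)))    # robust SEs; larger for the slope
```

Under this design the classical slope SE understates the sampling variability, so homoskedasticity-based inference here would reject too often; the sandwich SE corrects for that.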
Quiz 3
- In a machine learning problem with ridge regression, it is typically best practice to select the ridge parameter using…
- When running ridge regression, as you decrease the ridge penalty parameter…
- For machine learning, a good reason to use a highly expressive spline basis is…
- For machine learning, you don’t want your spline basis to be too expressive because…
- In Bayesian analysis, the ridge parameter corresponds to…
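A small numerical sketch of the ridge questions (the data and penalty grid are invented for illustration): using the closed form \(\hat{\boldsymbol{\beta}}_{\mathrm{ridge}} = (\boldsymbol{X}^\intercal\boldsymbol{X} + \lambda \boldsymbol{I})^{-1}\boldsymbol{X}^\intercal\boldsymbol{Y}\), the solution approaches OLS as \(\lambda \to 0\) and shrinks toward zero, the mean of the corresponding Gaussian prior, as \(\lambda \to \infty\).

```python
import numpy as np

rng = np.random.default_rng(3)
N, P = 200, 5
X = rng.normal(size=(N, P))
beta_true = np.array([3.0, -2.0, 0.0, 1.0, 0.5])
y = X @ beta_true + rng.normal(size=N)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam I)^{-1} X'y.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)   # lam = 0 recovers plain OLS
for lam in [1e-6, 1.0, 100.0, 1e6]:
    # The coefficient norm decreases monotonically in lam.
    print(lam, np.linalg.norm(ridge(X, y, lam)))
```

In practice \(\lambda\) would be chosen by cross-validation on held-out prediction error rather than from this path alone; the Bayesian reading is that a larger \(\lambda\) corresponds to a tighter mean-zero Gaussian prior on \(\boldsymbol{\beta}\).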