Multiple Choice Question Review
Stat 151A: Linear Models
Multiple choice questions
Below, you will find the multiple choice questions from the quizzes, but not the candidate responses.
I encourage you to try to think of correct (and incorrect) answers for yourself as a way of preparing for the final exam!
Questions
Quiz 0
- If a square, symmetric matrix \(\boldsymbol{A}\) is invertible, then…
- Let \(\boldsymbol{X}\) denote a full-rank \(N \times P\) matrix with \(N > P\). Then…
- Suppose that \(\boldsymbol{u}\) and \(\boldsymbol{v}\) are orthogonal \(N\)-vectors with \(N \ge 2\). Then…
- Let \(\boldsymbol{X}\) denote an \(N \times P\) matrix, with \(P < N\), \(\boldsymbol{\beta}\) a nonzero \(P\)-vector, and \(\boldsymbol{Y}\) an \(N\)-vector. Which of these matrix expressions is not well-defined?
- Suppose that \(x_n\) are IID Gaussian with mean \(\mathbb{E}\left[x_n\right] = 2\) and \(\mathrm{Var}\left(x_n\right) = 9\). Which of the following expressions diverges (goes to positive or negative infinity) as \(N \rightarrow \infty\)?
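The last question above is easy to probe numerically. A minimal simulation sketch (the sample size and variable names are my own, not from the quiz): draw IID \(\mathcal{N}(2, 9)\) variables and compare the sample mean, which the LLN pins near 2, with the raw sum, which grows on the order of \(2N\).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
x = rng.normal(loc=2.0, scale=3.0, size=N)  # mean 2, variance 9

# The sample mean converges to E[x_n] = 2 by the LLN...
print(np.mean(x))
# ...while the raw sum grows without bound, on the order of 2 * N.
print(np.sum(x))
# By the CLT, sqrt(N) * (mean - 2) stays stochastically bounded.
print(np.sqrt(N) * (np.mean(x) - 2.0))
```

Re-running with larger \(N\) makes the contrast sharper: the first and third quantities stabilize, while the second keeps growing.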
Quiz 1
- Suppose that \(\boldsymbol{x}_n = \boldsymbol{A}\boldsymbol{z}_n\) for all \(n\), where \(\boldsymbol{A}\) is invertible but not symmetric. For the regressions \(y\sim \boldsymbol{\beta}^\intercal\boldsymbol{x}\) and \(y\sim \boldsymbol{\gamma}^\intercal\boldsymbol{z}\), it is necessarily true that…
- Which of the following is never a well-justified reason to choose \(\hat{\boldsymbol{\beta}}\) by minimizing the mean of squared errors \(\frac{1}{N} \sum_{n=1}^N(y_n - \boldsymbol{x}_n^\intercal\boldsymbol{\beta})^2\)?
- Let \(\underset{\boldsymbol{X}}{\boldsymbol{P}} = \boldsymbol{X}(\boldsymbol{X}^\intercal\boldsymbol{X})^{-1} \boldsymbol{X}^\intercal\) denote the projection matrix onto the columns of \(\boldsymbol{X}\), where \(\boldsymbol{X}\) is full rank. Which of the following expressions is false?
- Suppose that \(x_{n1}\) and \(x_{n2}\) are mean zero and very highly (but not perfectly) positively correlated, and we regress \(y_n \sim \beta_1 x_{n1} + \beta_2 x_{n2}\). Then…
- Let \(\boldsymbol{S}_1, \ldots, \boldsymbol{S}_K\) denote a partition of the space of regressors, so that for each \(n\), \(\boldsymbol{x}_n \in \boldsymbol{S}_k\) for exactly one \(k\). Let \(z_{nk} = \mathbb{I}\left(\boldsymbol{x}_n \in \boldsymbol{S}_k\right)\), and consider regressing \(y_n \sim \beta_1 z_{n1} + \ldots + \beta_K z_{nK}\), without a constant. Assume that each \(\boldsymbol{S}_k\) has at least one observation in it. Then, conditionally on the regressors…
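The partition question can be checked directly. Here is a hedged sketch (the three-group design and data are invented for illustration): regressing \(y\) on a full set of partition indicators with no constant recovers the within-group sample means as the OLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
means = np.array([1.0, 5.0, -2.0])
groups = np.repeat([0, 1, 2], 50)          # which S_k each observation falls in
y = means[groups] + rng.normal(size=groups.size)

# Indicator regressors z_{nk} = 1(x_n in S_k); no constant column.
Z = np.eye(3)[groups]                      # 150 x 3 one-hot design matrix

# OLS: beta_hat = (Z'Z)^{-1} Z'y. With indicator regressors, Z'Z is
# diagonal (the group counts), so beta_hat is the vector of group means.
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)
group_means = np.array([y[groups == k].mean() for k in range(3)])
print(beta_hat)
print(group_means)
```

The two printed vectors agree up to floating-point error, which is one way to remember what this regression "does" conditionally on the regressors.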
Quiz 2
For the multiple choice questions, recall our four classes of assumptions:
- The normal assumption: \(y_n \vert \boldsymbol{x}_n \sim \mathcal{N}\left({\boldsymbol{\beta}^{*}}^\intercal\boldsymbol{x}_n, \sigma^2\right)\).
- The homoskedastic assumption: \(\varepsilon_n = y_n - {\boldsymbol{\beta}^{*}}^\intercal\boldsymbol{x}_n\), with \(\mathbb{E}\left[\varepsilon_n \vert \boldsymbol{x}_n\right] = 0\) and \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right) = \sigma^2\).
- The heteroskedastic assumption: Like (B), but \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right) = \sigma_n^2\), where \(\sigma_n\) is some function of \(\boldsymbol{x}_n\) that is different for different \(\boldsymbol{x}_n\).
- The machine learning assumption: \((\boldsymbol{x}_n, y_n)\) are IID.
For each assumption, also assume that the pairs \((\boldsymbol{x}_n, y_n)\) are IID, that \(\boldsymbol{X}\) is full rank, and that all conditions needed to apply the CLT and LLN hold, as in the lecture notes.
Recall that \(\hat{\sigma}^2 = \frac{1}{N - P} \sum_{n=1}^N(y_n - \hat{\boldsymbol{\beta}}^\intercal\boldsymbol{x}_n)^2\).
- Under the normal assumption (A), under the null, the standard t-statistic based on \(\hat{\sigma}\)…
- Under the homoskedastic assumption (B), under the null, the standard t-statistic based on \(\hat{\sigma}\)…
- Under the heteroskedastic assumption (C), under the null, the standard t-statistic based on the sandwich covariance…
- Under the heteroskedastic assumption (C), inference that is incorrectly based on the homoskedastic assumption (B) will, in general…
- Let \({\boldsymbol{\beta}^{*}}:= \mathbb{E}\left[\boldsymbol{x}\boldsymbol{x}^\intercal\right]^{-1} \mathbb{E}\left[\boldsymbol{x}y\right]\). Under the machine learning assumption (D)…
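To build intuition for these questions, here is a hedged simulation sketch (the heteroskedastic design is my own, not from the quiz): fit OLS on data where \(\mathrm{Var}\left(\varepsilon_n \vert \boldsymbol{x}_n\right)\) grows with \(\vert x_n \vert\), then compare the classical standard errors, based on \(\hat{\sigma}^2 (\boldsymbol{X}^\intercal\boldsymbol{X})^{-1}\), with sandwich (HC0) standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
# Heteroskedastic errors: the conditional sd grows with |x_n|.
eps = rng.normal(size=N) * (0.5 + np.abs(x))
y = 1.0 + 2.0 * x + eps

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Classical (homoskedastic) covariance: sigma_hat^2 (X'X)^{-1},
# with sigma_hat^2 = sum(resid^2) / (N - P).
P = X.shape[1]
sigma2_hat = resid @ resid / (N - P)
cov_classical = sigma2_hat * XtX_inv

# Sandwich (HC0) covariance: (X'X)^{-1} X' diag(resid^2) X (X'X)^{-1}.
meat = X.T @ (X * resid[:, None] ** 2)
cov_sandwich = XtX_inv @ meat @ XtX_inv

print(np.sqrt(np.diag(cov_classical)))   # classical SEs
print(np.sqrt(np.diag(cov_sandwich)))    # robust SEs; larger for the slope
```

Under this design the classical slope SE understates the sampling variability, so homoskedasticity-based inference here would reject too often; the sandwich SE corrects for that.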
Quiz 3
- In a machine learning problem with ridge regression, it is typically best practice to select the ridge parameter using…
- When running ridge regression, as you decrease the ridge penalty parameter…
- For machine learning, a good reason to use a highly expressive spline basis is…
- For machine learning, you don’t want your spline basis to be too expressive because…
- In Bayesian analysis, the ridge parameter corresponds to…
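A small numerical sketch of the ridge questions (the data and penalty grid are invented for illustration): using the closed form \(\hat{\boldsymbol{\beta}}_{\mathrm{ridge}} = (\boldsymbol{X}^\intercal\boldsymbol{X} + \lambda \boldsymbol{I})^{-1}\boldsymbol{X}^\intercal\boldsymbol{Y}\), the solution approaches OLS as \(\lambda \to 0\) and shrinks toward zero, the mean of the corresponding Gaussian prior, as \(\lambda \to \infty\).

```python
import numpy as np

rng = np.random.default_rng(3)
N, P = 200, 5
X = rng.normal(size=(N, P))
beta_true = np.array([3.0, -2.0, 0.0, 1.0, 0.5])
y = X @ beta_true + rng.normal(size=N)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam I)^{-1} X'y.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)   # lam = 0 recovers plain OLS
for lam in [1e-6, 1.0, 100.0, 1e6]:
    # The coefficient norm decreases monotonically in lam.
    print(lam, np.linalg.norm(ridge(X, y, lam)))
```

In practice \(\lambda\) would be chosen by cross-validation on held-out prediction error rather than from this path alone; the Bayesian reading is that a larger \(\lambda\) corresponds to a tighter mean-zero Gaussian prior on \(\boldsymbol{\beta}\).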