$$ \newcommand{\mybold}[1]{\boldsymbol{#1}} \newcommand{\trans}{\intercal} \newcommand{\norm}[1]{\left\Vert#1\right\Vert} \newcommand{\abs}[1]{\left|#1\right|} \newcommand{\bbr}{\mathbb{R}} \newcommand{\bbz}{\mathbb{Z}} \newcommand{\bbc}{\mathbb{C}} \newcommand{\gauss}[1]{\mathcal{N}\left(#1\right)} \newcommand{\chisq}[1]{\mathcal{\chi}^2_{#1}} \newcommand{\studentt}[1]{\mathrm{StudentT}_{#1}} \newcommand{\fdist}[2]{\mathrm{FDist}_{#1,#2}} \newcommand{\iid}{\overset{\mathrm{IID}}{\sim}} \newcommand{\argmin}[1]{\underset{#1}{\mathrm{argmin}}\,} \newcommand{\projop}[1]{\underset{#1}{\mathrm{Proj}}\,} \newcommand{\proj}[1]{\underset{#1}{\mybold{P}}} \newcommand{\expect}[1]{\mathbb{E}\left[#1\right]} \newcommand{\prob}[1]{\mathbb{P}\left(#1\right)} \newcommand{\dens}[1]{\mathit{p}\left(#1\right)} \newcommand{\var}[1]{\mathrm{Var}\left(#1\right)} \newcommand{\cov}[1]{\mathrm{Cov}\left(#1\right)} \newcommand{\sumn}{\sum_{n=1}^N} \newcommand{\meann}{\frac{1}{N} \sumn} \newcommand{\cltn}{\frac{1}{\sqrt{N}} \sumn} \newcommand{\trace}[1]{\mathrm{trace}\left(#1\right)} \newcommand{\diag}[1]{\mathrm{Diag}\left(#1\right)} \newcommand{\grad}[2]{\nabla_{#1} \left. #2 \right.} \newcommand{\gradat}[3]{\nabla_{#1} \left. #2 \right|_{#3}} \newcommand{\fracat}[3]{\left. 
\frac{#1}{#2} \right|_{#3}} \newcommand{\W}{\mybold{W}} \newcommand{\w}{w} \newcommand{\wbar}{\bar{w}} \newcommand{\wv}{\mybold{w}} \newcommand{\X}{\mybold{X}} \newcommand{\x}{x} \newcommand{\xbar}{\bar{x}} \newcommand{\xv}{\mybold{x}} \newcommand{\Xcov}{\mybold{M}_{\X}} \newcommand{\Xcovhat}{\hat{\mybold{M}}_{\X}} \newcommand{\Covsand}{\Sigmam_{\mathrm{sand}}} \newcommand{\Covsandhat}{\hat{\Sigmam}_{\mathrm{sand}}} \newcommand{\Z}{\mybold{Z}} \newcommand{\z}{z} \newcommand{\zv}{\mybold{z}} \newcommand{\zbar}{\bar{z}} \newcommand{\Y}{\mybold{Y}} \newcommand{\Yhat}{\hat{\Y}} \newcommand{\y}{y} \newcommand{\yv}{\mybold{y}} \newcommand{\yhat}{\hat{\y}} \newcommand{\ybar}{\bar{y}} \newcommand{\res}{\varepsilon} \newcommand{\resv}{\mybold{\res}} \newcommand{\resvhat}{\hat{\mybold{\res}}} \newcommand{\reshat}{\hat{\res}} \newcommand{\betav}{\mybold{\beta}} \newcommand{\betavhat}{\hat{\betav}} \newcommand{\betahat}{\hat{\beta}} \newcommand{\betastar}{{\beta^{*}}} \newcommand{\betavstar}{{\betav^{*}}} \newcommand{\loss}{\mathscr{L}} \newcommand{\losshat}{\hat{\loss}} \newcommand{\f}{f} \newcommand{\fhat}{\hat{f}} \newcommand{\bv}{\mybold{\b}} \newcommand{\bvhat}{\hat{\bv}} \newcommand{\alphav}{\mybold{\alpha}} \newcommand{\alphavhat}{\hat{\av}} \newcommand{\alphahat}{\hat{\alpha}} \newcommand{\omegav}{\mybold{\omega}} \newcommand{\gv}{\mybold{\gamma}} \newcommand{\gvhat}{\hat{\gv}} \newcommand{\ghat}{\hat{\gamma}} \newcommand{\hv}{\mybold{\h}} \newcommand{\hvhat}{\hat{\hv}} \newcommand{\hhat}{\hat{\h}} \newcommand{\gammav}{\mybold{\gamma}} \newcommand{\gammavhat}{\hat{\gammav}} \newcommand{\gammahat}{\hat{\gamma}} \newcommand{\new}{\mathrm{new}} \newcommand{\zerov}{\mybold{0}} \newcommand{\onev}{\mybold{1}} \newcommand{\id}{\mybold{I}} \newcommand{\sigmahat}{\hat{\sigma}} \newcommand{\etav}{\mybold{\eta}} \newcommand{\muv}{\mybold{\mu}} \newcommand{\Sigmam}{\mybold{\Sigma}} \newcommand{\rdom}[1]{\mathbb{R}^{#1}} \newcommand{\RV}[1]{{#1}} \def\A{\mybold{A}} 
\def\A{\mybold{A}} \def\av{\mybold{a}} \def\a{a} \def\B{\mybold{B}} \def\b{b} \def\S{\mybold{S}} \def\sv{\mybold{s}} \def\s{s} \def\R{\mybold{R}} \def\rv{\mybold{r}} \def\r{r} \def\V{\mybold{V}} \def\vv{\mybold{v}} \def\v{v} \def\vhat{\hat{v}} \def\U{\mybold{U}} \def\uv{\mybold{u}} \def\u{u} \def\W{\mybold{W}} \def\wv{\mybold{w}} \def\w{w} \def\tv{\mybold{t}} \def\t{t} \def\Sc{\mathcal{S}} \def\ev{\mybold{e}} \def\Lammat{\mybold{\Lambda}} \def\Q{\mybold{Q}} \def\eps{\varepsilon} $$

Transforming responses.

Goals

Discuss transformations of responses
- Effect of re-scaling and the units of the coefficients
- Example: Kleiber’s scaling law
- The log transformation
- Interpreting transformed regressions

Animal metabolic data

Let’s look at data taken from Table 2 of Kleiber 1947, “Body Size and Metabolic Rate.”

The data consists of weight (in kg) and metabolic rate (in kcal per day) for the following animals:

Mouse, Rat, Guinea pig, Rabbit, Cat, Macaque, Dog, Goat, Chimpanzee, Sheep, Human Woman, Cow, Beef heifers, Shrew, Swiss mice, Dwarf mouse, Rat (giant), Rat (growth hormone), Swine, Steer calves, Elephant, Porpoise, Whale.

Kleiber’s question is: is there a systematic relationship between weight and metabolic rate? We are interested in this question because it can shed light on fundamental features of animal biology, not because we need to know the metabolic rate of some animal that we managed to get onto a scale.

Question: Is this an inference problem or a prediction problem?

Animal metabolic data

Let’s look at a simple linear regression of metabolism on weight.

lm_base_fit <- lm(Metabol_kcal_per_day ~ 1 + Weight_kg, kleiber_df)
print(summary(lm_base_fit))


Call:
lm(formula = Metabol_kcal_per_day ~ 1 + Weight_kg, data = kleiber_df)

Residuals:
    Min      1Q  Median      3Q     Max 
-1636.2 -1527.6  -974.0  -297.7  6462.5 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1637.1946   417.0693   3.925 0.000401 ***
Weight_kg      0.1524     0.0357   4.269 0.000149 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2461 on 34 degrees of freedom
Multiple R-squared:  0.349, Adjusted R-squared:  0.3298 
F-statistic: 18.22 on 1 and 34 DF,  p-value: 0.0001488

There is a relationship between weight and metabolism, and it’s statistically significant!

Class dismissed?

Outliers

Let’s plot the data. What do you think of the fit now?

Moral: Always plot your data.

Note that a few points were flagged by Kleiber as not having comparable metabolic rates, including the whale. Let’s do the same analysis without them.
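To see how a single very heavy animal can dominate a straight-line fit, here is a minimal sketch on synthetic data. The numbers, the power-law generating curve, and the names `sim_df`, `fit_all`, and `fit_drop` are all assumptions made for illustration; none of this is Kleiber’s data.

```r
# Synthetic illustration only: these numbers are NOT Kleiber's data.
set.seed(42)

# 25 "ordinary" animals plus one extremely heavy one (row 26).
weight <- c(runif(25, min = 0.01, max = 700), 70000)

# Assumed generating curve: a power law with multiplicative noise.
metabol <- 100 * weight^0.7 * exp(rnorm(26, sd = 0.1))
sim_df <- data.frame(weight = weight, metabol = metabol)

# Straight-line fits with and without the heavy animal.
fit_all  <- lm(metabol ~ 1 + weight, data = sim_df)
fit_drop <- lm(metabol ~ 1 + weight, data = sim_df[-26, ])

# Leverage (hat value) measures how strongly each point pulls the fit.
leverage <- hatvalues(fit_all)
most_influential <- unname(which.max(leverage))
print(most_influential)

# Compare the coefficients of the two fits.
print(rbind(all = coef(fit_all), dropped = coef(fit_drop)))
```

Calling `plot(sim_df$weight, sim_df$metabol)` followed by `abline(fit_all)` would show the same thing visually: the line is pinned to the heaviest point and tells us little about the rest.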

Now without outliers

lm_lin_fit <- lm(Metabol_kcal_per_day ~ 1 + Weight_kg, kleiber_comp_df)
print(summary(lm_lin_fit)$coefficients)

             Estimate  Std. Error   t value     Pr(>|t|)
(Intercept) 273.85249 101.7051223  2.692613 1.272021e-02
Weight_kg    14.76166   0.5497794 26.850158 2.046497e-19

What do you think of this fit?

Unit change

Question: We’re mixing non-SI units (kcal) with SI units (kg). What would happen to the regression if we instead used Metabol_joule_per_day as the response, using the conversion rate of 4184 Joules per kcal?

lm_joule_fit <- lm(Metabol_joule_per_day ~ 1 + Weight_kg, kleiber_comp_df)

coefficients(lm_joule_fit)

(Intercept)   Weight_kg 
  1145798.8     61762.8

coefficients(lm_lin_fit)

(Intercept)   Weight_kg 
  273.85249    14.76166

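The ratio of the new to the old coefficients is the unit change, as expected from the fact that $\betavhat = (\X^\trans \X)^{-1} \X^\trans \Y$ is linear in $\Y$: multiplying $\Y$ by a constant multiplies every coefficient by that constant.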

coefficients(lm_joule_fit) / coefficients(lm_lin_fit)

(Intercept)   Weight_kg 
       4184        4184

joule_per_kcal

[1] 4184
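The same behavior can be checked on any data set. The following is a small sketch with synthetic data (the variables `x`, `y`, and `unit_factor` are made up for illustration). It also shows that the t-statistics do not change, since the standard errors are rescaled by the same factor as the estimates.

```r
# Synthetic data; not Kleiber's measurements.
set.seed(1)
x <- runif(30, min = 1, max = 100)
y <- 270 + 15 * x + rnorm(30, sd = 50)

unit_factor <- 4184  # e.g. Joules per kcal

fit_orig   <- lm(y ~ 1 + x)
fit_scaled <- lm(I(unit_factor * y) ~ 1 + x)

# Every coefficient is multiplied by the unit factor.
ratio <- coef(fit_scaled) / coef(fit_orig)
print(ratio)

# The t-statistics are unchanged by the change of units.
t_orig   <- summary(fit_orig)$coefficients[, "t value"]
t_scaled <- summary(fit_scaled)$coefficients[, "t value"]
print(all.equal(t_orig, t_scaled))
```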

More careful thought

Here’s a chain of reasoning that leads to a different regression.

Different animals’ density is approximately constant
- $\Rightarrow$ Animal weight in kg $\propto$ Animal volume in $m^3$
Different animals’ shape is approximately the same
- $\Rightarrow$ Animal surface area in $m^2$ $\propto$ Animal volume in $(m^3)^{2/3}$ $\propto$ Animal kg$^{2/3}$
Metabolic rate is proportional to surface area
- First law of thermodynamics
- All generated heat must be radiated
- Rate of radiation is proportional to surface area
- $\Rightarrow$ Animal metabolic rate in Joules per day $\propto$ Animal surface area in $m^2$

Testable hypothesis: Animal metabolic rate in Joules per day $\propto$ Animal kg$^{2/3}$.
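The chain of reasoning above can be written out as a short derivation. This is only a sketch under the stated assumptions. The symbols $\ell$ (a characteristic body length), $\rho$ (density), $V$ (volume), $S$ (surface area), $W$ (weight), and $M$ (metabolic rate) are introduced here for illustration.

\[ \begin{aligned} W ={}& \rho V \propto \ell^3 \quad\Rightarrow\quad \ell \propto W^{1/3}, \\ S \propto{}& \ell^2 \propto W^{2/3}, \\ M \propto{}& S \propto W^{2/3}. \end{aligned} \]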

Testing the hypothesis

Testable hypothesis: Animal metabolic rate in Joules per day $\propto$ Animal kg$^{2/3}$.

Question: How can we test this with regression?

What’s wrong with doing a regressor transform and regressing $\y_n \sim \x_n^{2/3}$, where $\y_n$ is metabolism and $\x_n$ is weight?

Testing the hypothesis

Testable hypothesis: Animal metabolic rate in Joules per day $\propto$ Animal kg$^{2/3}$.

A better idea:

\[ \begin{aligned} \yhat_n ={} \beta_0 \x_n^{\beta_1} \quad\quad\Leftrightarrow\quad\quad \log \yhat_n = \log \beta_0 + \beta_1 \log \x_n. \end{aligned} \]

Let’s regress $\log \y_n \sim \log \x_n$, and see whether the coefficient is $\betahat_1 \approx 2/3$.

Note that the errors we’re trying to minimize mean something different! Compare

\[ \begin{aligned} \y_n =& \gamma_0 + \gamma_1 \x_n^{2/3} + \eta_n \end{aligned} \]

versus

\[ \begin{aligned} \log \y_n ={}& \beta_0 + \beta_1 \log \x_n + \res_n \quad \Rightarrow\\ \y_n ={}& \exp(\beta_0) \x_n^{\beta_1} \exp(\res_n). \end{aligned} \]

Minimizing $\sumn \eta_n^2$ is very different from minimizing $\sumn \res_n^2$.
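One way to see the difference, sketched here with made-up numbers: suppose a candidate curve misses every animal by the same 20%. On the raw scale the squared errors are dominated by the heaviest animals, while on the log scale every animal contributes equally. The weights and the curve `100 * weight_kg^0.7` below are hypothetical and chosen only for illustration.

```r
# Hypothetical weights spanning several orders of magnitude.
weight_kg <- c(0.02, 0.2, 2, 20, 200, 2000)

# A hypothetical fitted curve, and observations that are all 20% above it.
pred <- 100 * weight_kg^0.7
obs  <- 1.2 * pred

# Residuals on the raw scale and on the log scale.
eta <- obs - pred
eps <- log(obs) - log(pred)

# Share of the total squared error contributed by each animal.
share_raw <- eta^2 / sum(eta^2)
share_log <- eps^2 / sum(eps^2)

print(round(cbind(weight_kg, share_raw, share_log), 4))
```

So a least-squares fit on the raw scale is driven almost entirely by the largest animals, whereas the log-scale fit treats equal percentage errors as equally bad.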

Log fit

lm_log_fit <- lm(log10(Metabol_joule_per_day) ~ 1 + log10(Weight_kg), kleiber_comp_df)
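The fit above uses base-10 logarithms. The choice of base does not affect the slope, only the intercept. Here is a minimal sketch on synthetic power-law data; the exponent `true_exponent`, the constant 500, and the data frame `sim_df` are assumptions for illustration and are not Kleiber’s data.

```r
# Synthetic power-law data; not Kleiber's measurements.
set.seed(3)
true_exponent <- 0.7
weight  <- 10^runif(50, min = -2, max = 3)
metabol <- 500 * weight^true_exponent * exp(rnorm(50, sd = 0.1))
sim_df  <- data.frame(weight = weight, metabol = metabol)

# The same log-log regression with two different bases.
fit_log10 <- lm(log10(metabol) ~ 1 + log10(weight), data = sim_df)
fit_ln    <- lm(log(metabol) ~ 1 + log(weight), data = sim_df)

slope_log10 <- unname(coef(fit_log10)[2])
slope_ln    <- unname(coef(fit_ln)[2])

# The slopes agree, and both estimate the exponent of the power law.
print(c(log10 = slope_log10, ln = slope_ln, truth = true_exponent))

# The intercepts differ by the constant factor log(10).
intercept_ratio <- unname(coef(fit_ln)[1] / coef(fit_log10)[1])
print(c(ratio = intercept_ratio, log10_factor = log(10)))
```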

Compare the two fits on their respective scales

What about our hypothesis?

summary(lm_log_fit)$coefficients

                  Estimate  Std. Error   t value     Pr(>|t|)
(Intercept)      5.4466814 0.013652364 398.95519 2.222861e-47
log10(Weight_kg) 0.7564294 0.009055443  83.53312 4.246511e-31

Note that

\[ 0.7564294 \ne 2 / 3 \approx 0.667. \]

We haven’t talked about standard errors yet, but note that

\[ 0.7564294 - 2/3 \approx 0.0897627, \]

which is very large relative to the reported “standard error” of about 0.009: the gap is roughly ten times as large.

(We will revisit this example later to investigate the assumptions behind the standard error, and find that they’re not likely to apply in this case.)
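For reference, the comparison can be done directly from the numbers in the coefficient table. This sketch takes the reported standard error at face value, which, as noted, may not be justified here; the variable names are made up for illustration.

```r
# Values copied from the coefficient table of the log-log fit.
beta1_hat  <- 0.7564294
se_beta1   <- 0.009055443

# Exponent predicted by the surface-area argument.
beta1_null <- 2 / 3

# Gap between estimate and hypothesis, in raw units and in standard errors.
gap       <- beta1_hat - beta1_null
gap_in_se <- gap / se_beta1

print(gap)
print(gap_in_se)
```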

In fact, “Kleiber’s law” refers to the relationship

\[ \textrm{Metabolism} \propto \textrm{Weight}^{3/4}, \]

which appears roughly consistent with what we’ve found here. The reason for this scaling has been the subject of a lot of research since Kleiber’s work and is beyond the scope of the present lecture.
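To interpret the fitted log-log coefficients on the original scale, one can undo the transformation. The sketch below uses the estimates from the coefficient table. The helper `predict_metabolism` is hypothetical, it ignores the residual term, and the standard error is again taken at face value.

```r
# Estimates copied from the log10-log10 coefficient table.
intercept_log10 <- 5.4466814
slope           <- 0.7564294
se_slope        <- 0.009055443

# Hypothetical helper: fitted metabolic rate (Joules per day) at a given weight.
predict_metabolism <- function(weight_kg) {
  10^intercept_log10 * weight_kg^slope
}

# The slope acts multiplicatively: doubling the weight multiplies the
# fitted metabolic rate by 2^slope, whatever the starting weight.
doubling_factor <- 2^slope
print(doubling_factor)
print(predict_metabolism(2) / predict_metabolism(1))

# Distance between the estimated exponent and 3/4, in standard errors.
gap_in_se <- (slope - 3 / 4) / se_slope
print(gap_in_se)
```

Under this reading the intercept sets the fitted metabolic rate of a 1 kg animal, and the slope is the percentage-style effect: each doubling of weight raises the fitted metabolic rate by the same factor.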