Introduction

Column

Log Transformations

Log transformations can be useful when a variable is very right-skewed, or multiplicative effects are desired over additive. However, interpretation can be challenging.

Note that we are always discussing the natural log, ln, that is log base e.

Column

Multiplicative vs Percent Change

Note that multiplicative changes can be expressed as percent changes and vice-versa.

If we multiply \(X\) by 1.1, the resultant \(1.1X\) is 10% larger than \(X\). E.g. 16.5 is 10% larger than 15.

If we multiply \(X\) by .7, the resultant \(.7X\) is 30% lower than \(X\). E.g. 7 is 30% smaller than 10.

Log-transformed Response

Column

Discussion

tldr

A 1 unit change in a predictor is associated with a \(\textrm{exp}(\hat{\beta})\) mulitplicative change in \(Y\), or a \(100(1 - \textrm{exp}(\hat{\beta})\%\) change in \(Y\).

Examples:

  • If \(\hat{\beta}\) is .2, a 1 unit increase in \(X\) is associated with a \(\textrm{exp}(.2)\ \approx 1.22\) multiplicative change in \(Y\), or a 22% increase.
  • If \(\hat{\beta}\) is -.4, a 1 unit increase in \(X\) is associated with a a \(\textrm{exp}(-.4)\ \approx .67\) multiplicative change in \(Y\), or a 33% decrease.

Theory

Assume our regression equation is

\[ E(Y|X = x) = \beta_0 + \beta_1x. \]

If we regress on the log of \(Y\) instead,

\[ E(\log(Y)|X = x) = \beta_0 + \beta_1x. \]

By Taylor expansion,

\[ \log(E(X)) \approx E(\log(X)). \]

Therefore we can write \[\begin{align*} E(Y|X = x + 1) & = \textrm{exp}\left(\beta_0 + \beta_1(x + 1)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x + \beta_1\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x\right)\textrm{exp}(\beta_1) \\ & = E(Y|X = x)\textrm{exp}(\beta_1) \end{align*}\]

Column

Example

data(mtcars)
(m <- lm(log(disp) ~ drat, data = mtcars))

Call:
lm(formula = log(disp) ~ drat, data = mtcars)

Coefficients:
(Intercept)         drat  
     8.2782      -0.8323  

Therefore a 1-unit increase in disp is associated with a \(\textrm{exp}(-0.832) = 0.435\) multiplicative change in drat.

To test this, we predict the ratio in predicted outcome with some values of disp, and that value increased by 1. Note: We exponentiate the predicted values to get them on the outcome scale.

exp(predict(m, newdata = data.frame(drat = 5)))/exp(predict(m, newdata = data.frame(drat = 4)))
        1 
0.4350567 
exp(predict(m, newdata = data.frame(drat = 30)))/exp(predict(m, newdata = data.frame(drat = 29)))
        1 
0.4350567 

Log-transformed Predictor

Column

Discussion

tldr

A \(k\%\) change in a predictor is associated with \(\hat{\beta}\log\left(1 + \frac{k}{100}\right)\) change in the outcome.

Examples:

  • If \(\hat{\beta}\) is 2, a \(10\%\) increase in \(X\) is associated with a \(2\log\left(1 + \frac{10}{100}\right) = 2\log(1.1) \approx 0.19\) increaes in \(Y\).
  • If \(\hat{\beta}\) is -1.5, a \(20\%\) decrease in \(X\) is associated with a \(-1.5\log\left(1 + \frac{-20}{100}\right) = -1.5\log(.8) \approx 0.15\) decrease in \(Y\).

Theory

Assume our regression equation is

\[ E(Y|X = x) = \beta_0 + \beta_1x. \]

If we include \(\log(X)\) instead, we have

\[ E(Y|X = x) = \beta_0 + \beta_1\log(x). \]

Consider when \(X = cX\) where \(c\) is some constant (e.g. 2 for a doubling of \(X\) or 1.3 for a 30% increase in \(X\)).

\[ E(Y|X = cx) = \beta_0 + \beta_1\log(cx). \]

Therefore if we look at the difference in expectation,

\[ E(Y|X = cx) - E(Y|X = x) = \beta_1(\log(cx) - \log(x)) = \beta_1\log(c). \]

Approximation

If your percent change is small (e.g. a few percent) then you can approximate the change. This is because

\[ log(1 + x) \approx x, \]

when x is close to 0. So to approximate what effect a 1% change in X would have, simply multiple \(\hat{\beta}\) by that value; \(0.1\hat{\beta}\). This works reliably well up to \(\pm3\%\), moderately up to \(\pm5\%\) and gets much worse beyond that.

Column

Example

data(mtcars)
(m <- lm(drat ~ log(disp), data = mtcars))

Call:
lm(formula = drat ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)  
     7.2301      -0.6875  

Therefore a 25% increase in disp is associated with a \(-0.688\log(1.25) = -0.153\) change in drat.

To test this, we predict the difference in predicted outcome with some values of disp, and that value increaed by 25%.
predict(m, newdata = data.frame(disp = 5)) - predict(m, newdata = data.frame(disp = 5*1.25))
        1 
0.1534182 
predict(m, newdata = data.frame(disp = 30)) - predict(m, newdata = data.frame(disp = 30*1.25))
        1 
0.1534182