Log Transform Interpretation
Introduction
Log transformations can be useful when a variable is heavily right-skewed, or when multiplicative effects are preferred over additive ones. However, interpreting the resulting coefficients can be challenging.
Throughout, we use the natural log (\(\ln\)), i.e. \(\log_e()\).
Multiplicative vs Percent Change
Note that multiplicative changes can be expressed as percent changes and vice versa. For example, multiplying a value by \(1.3\) is equivalent to increasing it by \(30\%\); conversely, decreasing a value by \(15\%\) is equivalent to multiplying it by \(.85\).
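In R (the language used in the examples below), this conversion is simple arithmetic:

```r
# A multiplicative change of 1.3 is a +30% change
100 * (1.3 - 1)   # 30

# A -15% change is a multiplicative change of .85
1 + (-15/100)     # 0.85
```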
Logged Outcome
A \(1\)-unit change in a predictor is associated with an \(\exp\left(\hat{\beta}\right)\) multiplicative change in \(Y\), or a \(100\left(\exp\left(\hat{\beta}\right) - 1\right)\%\) change in \(Y\).
Examples:
- If \(\hat{\beta}\) is \(.2\), a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(.2\right) \approx 1.22\) multiplicative change in \(Y\), or a \(22\%\) increase.
- If \(\hat{\beta}\) is \(-.4\), a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(-.4\right) \approx .67\) multiplicative change in \(Y\), or a \(33\%\) decrease.
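These conversions can be checked directly in R:

```r
beta <- 0.2
exp(beta)               # multiplicative change, ~1.22
100 * (exp(beta) - 1)   # percent change, ~22

beta <- -0.4
exp(beta)               # ~0.67
100 * (exp(beta) - 1)   # ~ -33
```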
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]If we regress on the log of \(Y\) instead,
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1x. \]By Taylor expansion,
\[ \log(E(Y)) \approx E(\log(Y)), \]so \(E(Y|X = x) \approx \textrm{exp}\left(\beta_0 + \beta_1x\right)\). Therefore we can write
\begin{align*} E(Y|X = x + 1) & \approx \textrm{exp}\left(\beta_0 + \beta_1(x + 1)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x + \beta_1\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x\right)\textrm{exp}(\beta_1) \\ & \approx E(Y|X = x)\textrm{exp}(\beta_1). \end{align*}
Example
data(mtcars)
(m <- lm(log(disp) ~ drat, data = mtcars))

Call:
lm(formula = log(disp) ~ drat, data = mtcars)

Coefficients:
(Intercept)         drat
     8.2782      -0.8323
Therefore a \(1\)-unit increase in drat is associated with an \(\exp(-.8323) \approx .435\) multiplicative change in disp, corresponding to a \(56.5\%\) decrease.
To test this, we take the ratio of the predicted outcome at some value of drat increased by \(1\) to the predicted outcome at the original value. Note: We exponentiate the predicted values to get them back on the outcome scale.
exp(predict(m, newdata = data.frame(drat = 5)))/exp(predict(m, newdata = data.frame(drat = 4)))

        1
0.4350567
Repeat with different values of drat to show that all that matters is the change in the predictor, not its starting value.
exp(predict(m, newdata = data.frame(drat = 18)))/exp(predict(m, newdata = data.frame(drat = 17)))

        1
0.4350567
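Equivalently, we can exponentiate the fitted coefficient directly (using the model m fit above):

```r
exp(coef(m)["drat"])
#      drat
# 0.4350567
```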
Logged Predictor
A \(k\%\) change in a predictor is associated with a \(\hat{\beta}\log\left(1 + \frac{k}{100}\right)\) change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(2\log\left(1 + \frac{10}{100}\right) = 2\log(1.1) \approx 0.19\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(5\%\) increase in \(X\) is associated with a \(-1.5\log\left(1 + \frac{5}{100}\right) = -1.5\log(1.05) \approx 0.07\) decrease in \(Y\).
- If \(\hat{\beta}\) is \(.75\), a \(15\%\) decrease in \(X\) is associated with a \(.75\log\left(1 + \frac{-15}{100}\right) = .75\log(.85) \approx 0.12\) decrease in \(Y\).
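The same arithmetic in R, wrapped in a small helper (the function name is just for illustration):

```r
log_predictor_effect <- function(beta, k) {
  # Expected change in Y for a k% change in X
  beta * log(1 + k/100)
}
log_predictor_effect(2, 10)      # ~ 0.19
log_predictor_effect(-1.5, 5)    # ~ -0.07
log_predictor_effect(.75, -15)   # ~ -0.12
```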
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]If we include \(\log(X)\) instead, we have
\[ E(Y|X = x) = \beta_0 + \beta_1\log(x). \]Consider when \(X = cx\), where \(c\) is some constant (e.g. \(2\) for a doubling of \(X\), or \(1.3\) for a \(30\%\) increase in \(X\)).
\[ E(Y|X = cx) = \beta_0 + \beta_1\log(cx). \]Therefore if we look at the difference in expectation,
\begin{align*} E(Y|X = cx) - E(Y|X = x) & = \beta_1(\log(cx) - \log(x)) \\ & = \beta_1\log(c). \end{align*}Approximation
If the percent change is small (a few percent), you can approximate the change. This is because \(\log(1 + x) \approx x\) when \(x\) is close to \(0\). So to approximate the effect of a \(k\%\) change in \(X\), simply multiply \(\hat{\beta}\) by \(\frac{k}{100}\); e.g. for a \(1\%\) increase, \(0.01\hat{\beta}\). This works reliably well up to about \(\pm3\%\), moderately well up to \(\pm5\%\), and gets much worse beyond that.
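A quick check of the approximation quality at several percent changes (with \(\hat{\beta}\) set to \(1\), so the values are just the log terms):

```r
k <- c(1, 3, 5, 10, 25)
exact <- log(1 + k/100)   # the true multiplier of beta-hat
approx <- k/100           # the small-change approximation
round(cbind(k, exact, approx, error = approx - exact), 4)
```

The error is negligible at \(1{-}3\%\) and grows quickly past \(10\%\).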
Example
data(mtcars)
(m <- lm(drat ~ log(disp), data = mtcars))

Call:
lm(formula = drat ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     7.2301      -0.6875
Therefore a \(25\%\) increase in disp is associated with a \(-0.688\log(1.25) \approx -0.153\) change in drat.
To test this, we take the difference between the predicted outcome at some value of disp and the predicted outcome at that value increased by \(25\%\) (note the order of subtraction flips the sign relative to \(-0.153\)).
predict(m, newdata = data.frame(disp = 5)) - predict(m, newdata = data.frame(disp = 5*1.25))

        1
0.1534182
predict(m, newdata = data.frame(disp = 11)) - predict(m, newdata = data.frame(disp = 11*1.25))

        1
0.1534182
Both Logged
A \(k\%\) change in a predictor is associated with a \(\left(1 + \frac{k}{100}\right)^{\hat{\beta}}\) multiplicative change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(\left(1 + \frac{10}{100}\right)^2 = 1.1^2 \approx 1.21\) multiplicative change, i.e. a \(21\%\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(20\%\) decrease in \(X\) is associated with a \(\left(1 + \frac{-20}{100}\right)^{-1.5} = .8^{-1.5} \approx 1.40\) multiplicative change, i.e. a \(40\%\) increase in \(Y\).
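As before, these can be verified in R (the helper name is just for illustration):

```r
both_logged_effect <- function(beta, k) {
  # Multiplicative change in Y for a k% change in X
  (1 + k/100)^beta
}
both_logged_effect(2, 10)       # 1.21
both_logged_effect(-1.5, -20)   # ~ 1.40
```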
Theory
Assume our regression equation has both a logged outcome and a logged predictor,
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1\log(x). \]Combining the two previous arguments, for \(X = cx\),
\begin{align*} E(Y|X = cx) & \approx \textrm{exp}\left(\beta_0 + \beta_1\log(cx)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1\log(x)\right)\textrm{exp}\left(\beta_1\log(c)\right) \\ & \approx E(Y|X = x)c^{\beta_1}. \end{align*}
Example
data(mtcars)
(m <- lm(log(drat) ~ log(disp), data = mtcars))

Call:
lm(formula = log(drat) ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     2.2763      -0.1905
Therefore a \(25\%\) increase in disp is associated with a \(1.25^{-.1905} \approx 0.958\) multiplicative change in drat, corresponding to a \(4.2\%\) decrease.
To test this, we take the difference between the predicted outcome at some value of disp and the predicted outcome at that value increased by \(25\%\). Note these predictions are on the log scale, so the difference should be \(-\hat{\beta}\log(1.25) \approx 0.0425\).
predict(m, newdata = data.frame(disp = 5)) - predict(m, newdata = data.frame(disp = 5*1.25))

         1
0.04251857
predict(m, newdata = data.frame(disp = 8)) - predict(m, newdata = data.frame(disp = 8*1.25))

         1
0.04251857
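Because the outcome is logged, we can also exponentiate the predictions and take the ratio directly; using the model m fit above, this recovers the multiplicative change:

```r
exp(predict(m, newdata = data.frame(disp = 5*1.25))) /
  exp(predict(m, newdata = data.frame(disp = 5)))
# ~ 0.958
```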