Log Transform Interpretation
Introduction
Log transformations can be useful when a variable is heavily right-skewed, or when multiplicative rather than additive effects are desired. However, interpretation can be challenging.
We are always discussing the natural log (\(\ln\)), i.e. \(\log_e()\).
Multiplicative vs Percent Change
Note that multiplicative changes can be expressed as percent changes and vice-versa. For example, multiplying a value by \(1.3\) is equivalent to increasing the value by \(30\%\), or conversely, decreasing a value by \(15\%\) is equivalent to multiplying it by \(.85\).
Logged Outcome
A \(1\)-unit change in a predictor is associated with a \(\exp\left(\hat{\beta}\right)\) multiplicative change in \(Y\), or a \(100\left(\exp\left(\hat{\beta}\right) - 1\right)\%\) change in \(Y\).
Examples:
- If \(\hat{\beta}\) is \(.2\), a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(.2\right) \approx 1.22\) multiplicative change in \(Y\), or a \(22\%\) increase.
- If \(\hat{\beta}\) is \(-.4\) a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(-.4\right) \approx .67\) multiplicative change in \(Y\), or a \(33\%\) decrease.
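These conversions are easy to check directly in R. A quick sketch, using the coefficient values from the bullets above:

```r
# Convert logged-outcome coefficients to multiplicative and percent changes
b <- c(0.2, -0.4)
exp(b)              # multiplicative change in Y per 1-unit increase in X
100 * (exp(b) - 1)  # equivalent percent change in Y
```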
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]
If we regress on the log of \(Y\) instead,
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1x. \]
By a first-order Taylor expansion,
\[ \log(E(Y)) \approx E(\log(Y)), \]
so \(E(Y|X = x) \approx \exp\left(\beta_0 + \beta_1x\right)\). Therefore we can write
\begin{align*} E(Y|X = x + 1) & \approx \textrm{exp}\left(\beta_0 + \beta_1(x + 1)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x + \beta_1\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x\right)\textrm{exp}(\beta_1) \\ & \approx E(Y|X = x)\textrm{exp}(\beta_1). \end{align*}
Example
data(mtcars)
(mod1 <- lm(log(disp) ~ drat, data = mtcars))
Call:
lm(formula = log(disp) ~ drat, data = mtcars)

Coefficients:
(Intercept)         drat
     8.2782      -0.8323
Therefore a \(1\)-unit increase in drat is associated with an \(\exp(-.8323) \approx .435\) multiplicative change in disp, corresponding to a \(56.5\%\) decrease.
To test this, we compute the ratio of predicted outcomes at some value of drat and at that value increased by \(1\). Note: we exponentiate the predicted values to get them on the outcome scale.
exp(predict(mod1, newdata = data.frame(drat = 5)))/exp(predict(mod1, newdata = data.frame(drat = 4)))
1
0.4350567
Repeat with different values of drat to show that all that matters is the change in the predictor, not its starting value.
exp(predict(mod1, newdata = data.frame(drat = 8)))/exp(predict(mod1, newdata = data.frame(drat = 7)))
1
0.4350567
Visualization
We can visualize this relationship with basic plotting commands. First, generate predicted values: we create an artificial grid of the predictor (drat) spaced over its range, obtain predicted values, and exponentiate them.
new_drat <- seq(min(mtcars$drat),
max(mtcars$drat),
length.out = 100)
yhat1 <- exp(predict(mod1, newdata = data.frame(drat = new_drat)))
Take note of the call to exp() in the line defining yhat1. Next, we can plot the best-fit curve, overlaid on the observed values.
plot(yhat1 ~ new_drat, type = "l")
with(mtcars, points(disp ~ drat))
Logged Predictor
A \(k\%\) change in a predictor is associated with a \(\hat{\beta}\log\left(1 + \frac{k}{100}\right)\) change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(2\log\left(1 + \frac{10}{100}\right) = 2\log(1.1) \approx 0.19\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(5\%\) increase in \(X\) is associated with a \(-1.5\log\left(1 + \frac{5}{100}\right) = -1.5\log(1.05) \approx -0.07\), i.e. a \(0.07\) decrease in \(Y\).
- If \(\hat{\beta}\) is \(.75\), a \(15\%\) decrease in \(X\) is associated with a \(.75\log\left(1 + \frac{-15}{100}\right) = .75\log(.85) \approx -0.12\), i.e. a \(0.12\) decrease in \(Y\).
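As a sketch, the bullet computations above can be reproduced in R with a small helper function (the name is ours, purely illustrative):

```r
# Change in Y for a k% change in X when the predictor is logged
effect_of_pct_change <- function(b, k) b * log(1 + k/100)

effect_of_pct_change(2, 10)      # beta = 2,    X up 10%
effect_of_pct_change(-1.5, 5)    # beta = -1.5, X up 5%
effect_of_pct_change(0.75, -15)  # beta = .75,  X down 15%
```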
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]
If we include \(\log(X)\) instead, we have
\[ E(Y|X = x) = \beta_0 + \beta_1\log(x). \]
Consider scaling \(X\) by some constant \(c\) (e.g. \(c = 2\) for a doubling of \(X\), or \(c = 1.3\) for a \(30\%\) increase in \(X\)):
\[ E(Y|X = cx) = \beta_0 + \beta_1\log(cx). \]
Therefore if we look at the difference in expectations,
\begin{align*} E(Y|X = cx) - E(Y|X = x) & = \beta_1(\log(cx) - \log(x)) \\ & = \beta_1\log(c). \end{align*}
Approximation
If the percent change is small (a few percent), you can approximate the change, because \(\log(1 + x) \approx x\) when \(x\) is close to \(0\). So to approximate the effect of a \(k\%\) change in \(X\), simply multiply \(\hat{\beta}\) by \(\frac{k}{100}\); for example, a \(1\%\) increase in \(X\) corresponds to roughly a \(0.01\hat{\beta}\) change in \(Y\). This works reliably up to about \(\pm3\%\), moderately well up to \(\pm5\%\), and gets much worse beyond that.
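To get a feel for how quickly the approximation degrades, we can compare \(\log(1 + k/100)\) against \(k/100\) over a range of percent changes. This is a model-free sketch; multiplying either column by \(\hat{\beta}\) gives the exact or approximate effect.

```r
# Exact vs. approximate per-beta effect of a k% change in a logged predictor
k <- c(1, 3, 5, 10, 25)
exact  <- log(1 + k/100)
approx <- k/100
round(cbind(k, exact, approx, rel_error = (approx - exact)/exact), 4)
```

The relative error grows with \(|k|\), which is why the approximation is only recommended for small percent changes.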
Example
data(mtcars)
(mod2 <- lm(drat ~ log(disp), data = mtcars))
Call:
lm(formula = drat ~ log(disp), data = mtcars)
Coefficients:
(Intercept) log(disp)
7.2301 -0.6875
Therefore a \(25\%\) increase in disp is associated with a \(-0.688\log(1.25) \approx -0.153\) change in drat.
To test this, we compute the difference in predicted outcomes at some value of disp and at that value increased by \(25\%\).
predict(mod2, newdata = data.frame(disp = 5*1.25)) - predict(mod2, newdata = data.frame(disp = 5))
1
-0.1534182
predict(mod2, newdata = data.frame(disp = 11*1.25)) - predict(mod2, newdata = data.frame(disp = 11))
1
-0.1534182
Visualization
We'll do a similar plot, except here R handles the logging for us. Since mod2 predicts drat from disp, we build a prediction grid over disp rather than re-using new_drat.
new_disp <- seq(min(mtcars$disp),
                max(mtcars$disp),
                length.out = 100)
yhat2 <- predict(mod2, newdata = data.frame(disp = new_disp))
plot(yhat2 ~ new_disp, type = "l")
with(mtcars, points(drat ~ disp))
Both Logged
A \(k\%\) change in a predictor is associated with a \(\left(1 + \frac{k}{100}\right)^{\hat{\beta}}\) multiplicative change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(\left(1 + \frac{10}{100}\right)^2 = 1.1^2 \approx 1.21 = 21\%\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(20\%\) decrease in \(X\) is associated with a \(\left(1 + \frac{-20}{100}\right)^{-1.5} = .8^{-1.5} \approx 1.40 = 40\%\) increase in \(Y\).
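Again, the bullet arithmetic is easy to verify in R:

```r
# Multiplicative and percent change in Y for a k% change in X (log-log model)
b <- c(2, -1.5)
k <- c(10, -20)
mult <- (1 + k/100)^b
mult              # multiplicative change in Y
100 * (mult - 1)  # equivalent percent change in Y
```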
Theory
Assume our regression equation is
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1\log(x). \]
Scaling \(X\) by a constant \(c\), as in the logged-predictor case,
\begin{align*} E(\log(Y)|X = cx) - E(\log(Y)|X = x) & = \beta_1(\log(cx) - \log(x)) \\ & = \beta_1\log(c). \end{align*}
Using \(\log(E(Y)) \approx E(\log(Y))\) as before and exponentiating both sides,
\[ E(Y|X = cx) \approx E(Y|X = x)\,e^{\beta_1\log(c)} = E(Y|X = x)\,c^{\beta_1}, \]
a multiplicative change of \(c^{\beta_1}\).
Example
data(mtcars)
(mod3 <- lm(log(drat) ~ log(disp), data = mtcars))
Call:
lm(formula = log(drat) ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     2.2763      -0.1905
Therefore a \(25\%\) increase in disp is associated with a \(1.25^{-.1905} \approx 0.958\) multiplicative change in drat, corresponding to a \(4.2\%\) decrease.
To test this, we compute the difference in predicted log outcomes at some value of disp and at that value increased by \(25\%\); exponentiating that difference recovers the multiplicative change.
predict(mod3, newdata = data.frame(disp = 5*1.25)) - predict(mod3, newdata = data.frame(disp = 5))
1
-0.04251857
predict(mod3, newdata = data.frame(disp = 8*1.25)) - predict(mod3, newdata = data.frame(disp = 8))
1
-0.04251857
Note that \(\exp(-0.0425) \approx 0.958\), as claimed.
Visualization
Again, a similar plot, letting R handle the log of the predictor while we manually exponentiate the predicted outcome. As with mod2, the predictor is disp, so we again predict over a grid of disp values.
new_disp <- seq(min(mtcars$disp), max(mtcars$disp), length.out = 100)
yhat3 <- exp(predict(mod3, newdata = data.frame(disp = new_disp)))
plot(yhat3 ~ new_disp, type = "l")
with(mtcars, points(drat ~ disp))
Compare Visualizations
The three fitted curves can look very similar; plotting them together shows they are in fact three different curves. Note that mod1 models disp as a function of drat, while mod2 and mod3 model drat as a function of disp, so for the latter two we swap coordinates when overlaying: predicted drat goes on the x-axis and the disp grid on the y-axis.
with(mtcars, plot(disp ~ drat))
lines(yhat1 ~ new_drat, col = "red")
lines(new_disp ~ yhat2, col = "blue")
lines(new_disp ~ yhat3, col = "green")