Log Transform Interpretation
Introduction
Log transformations can be useful when a variable is heavily right-skewed, or when multiplicative effects are preferred over additive ones. However, interpreting the resulting coefficients can be challenging.
Throughout, we use the natural log (\(\ln\)), i.e. \(\log_e()\).
Multiplicative vs Percent Change
Note that multiplicative changes can be expressed as percent changes and vice versa. For example, multiplying a value by \(1.3\) is equivalent to increasing it by \(30\%\); conversely, decreasing a value by \(15\%\) is equivalent to multiplying it by \(.85\).
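In R (the language used in the examples below), this conversion is simple arithmetic:

```r
# A multiplicative change of 1.3 is a +30% change
100 * (1.3 - 1)   # 30

# A -15% change is a multiplicative change of .85
1 + (-15/100)     # 0.85
```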
Logged Outcome
A \(1\)-unit change in a predictor is associated with an \(\exp\left(\hat{\beta}\right)\) multiplicative change in \(Y\), or a \(100\left(\exp\left(\hat{\beta}\right) - 1\right)\%\) change in \(Y\).
Examples:
- If \(\hat{\beta}\) is \(.2\), a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(.2\right) \approx 1.22\) multiplicative change in \(Y\), or a \(22\%\) increase.
- If \(\hat{\beta}\) is \(-.4\), a \(1\)-unit increase in \(X\) is associated with an \(\exp\left(-.4\right) \approx .67\) multiplicative change in \(Y\), or a \(33\%\) decrease.
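These conversions can be checked directly in R:

```r
beta <- 0.2
exp(beta)               # multiplicative change, ~1.22
100 * (exp(beta) - 1)   # percent change, ~22

beta <- -0.4
exp(beta)               # ~0.67
100 * (exp(beta) - 1)   # ~ -33
```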
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]If we regress on the log of \(Y\) instead,
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1x. \]By Taylor expansion,
\[ \log(E(Y)) \approx E(\log(Y)), \]so \(E(Y|X = x) \approx \textrm{exp}\left(\beta_0 + \beta_1x\right)\). Therefore we can write
\begin{align*} E(Y|X = x + 1) & \approx \textrm{exp}\left(\beta_0 + \beta_1(x + 1)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x + \beta_1\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1x\right)\textrm{exp}(\beta_1) \\ & \approx E(Y|X = x)\textrm{exp}(\beta_1). \end{align*}
Example
data(mtcars)
(m <- lm(log(disp) ~ drat, data = mtcars))

Call:
lm(formula = log(disp) ~ drat, data = mtcars)

Coefficients:
(Intercept)         drat
     8.2782      -0.8323
Therefore a \(1\)-unit increase in drat is associated with an \(\exp(-.8323) \approx .435\) multiplicative change in disp, corresponding to a \(56.5\%\) decrease.
To test this, we take the ratio of the predicted outcome at some value of drat increased by \(1\) to the predicted outcome at the original value. Note: We exponentiate the predicted values to get them back on the outcome scale.
exp(predict(m, newdata = data.frame(drat = 5)))/exp(predict(m, newdata = data.frame(drat = 4)))

        1
0.4350567
Repeat with different values of drat to show that all that matters is the change in the predictor, not its starting value.
exp(predict(m, newdata = data.frame(drat = 18)))/exp(predict(m, newdata = data.frame(drat = 17)))

        1
0.4350567
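Equivalently, we can exponentiate the fitted coefficient directly (using the model m fit above):

```r
exp(coef(m)["drat"])
#      drat
# 0.4350567
```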
Logged Predictor
A \(k\%\) change in a predictor is associated with a \(\hat{\beta}\log\left(1 + \frac{k}{100}\right)\) change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(2\log\left(1 + \frac{10}{100}\right) = 2\log(1.1) \approx 0.19\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(5\%\) increase in \(X\) is associated with a \(-1.5\log\left(1 + \frac{5}{100}\right) = -1.5\log(1.05) \approx 0.07\) decrease in \(Y\).
- If \(\hat{\beta}\) is \(.75\), a \(15\%\) decrease in \(X\) is associated with a \(.75\log\left(1 + \frac{-15}{100}\right) = .75\log(.85) \approx 0.12\) decrease in \(Y\).
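The same arithmetic in R, wrapped in a small helper (the function name is just for illustration):

```r
log_predictor_effect <- function(beta, k) {
  # Expected change in Y for a k% change in X
  beta * log(1 + k/100)
}
log_predictor_effect(2, 10)      # ~ 0.19
log_predictor_effect(-1.5, 5)    # ~ -0.07
log_predictor_effect(.75, -15)   # ~ -0.12
```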
Theory
Assume our regression equation is
\[ E(Y|X = x) = \beta_0 + \beta_1x. \]If we include \(\log(X)\) instead, we have
\[ E(Y|X = x) = \beta_0 + \beta_1\log(x). \]Consider when \(X = cx\), where \(c\) is some constant (e.g. \(2\) for a doubling of \(X\), or \(1.3\) for a \(30\%\) increase in \(X\)).
\[ E(Y|X = cx) = \beta_0 + \beta_1\log(cx). \]Therefore if we look at the difference in expectation,
\begin{align*} E(Y|X = cx) - E(Y|X = x) & = \beta_1(\log(cx) - \log(x)) \\ & = \beta_1\log(c). \end{align*}Approximation
If the percent change is small (a few percent), you can approximate the change. This is because \(\log(1 + x) \approx x\) when \(x\) is close to \(0\). So to approximate the effect of a \(k\%\) change in \(X\), simply multiply \(\hat{\beta}\) by \(\frac{k}{100}\); e.g. for a \(1\%\) increase, \(0.01\hat{\beta}\). This works reliably well up to about \(\pm3\%\), moderately well up to \(\pm5\%\), and gets much worse beyond that.
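A quick check of the approximation quality at several percent changes (with \(\hat{\beta}\) set to \(1\), so the values are just the log terms):

```r
k <- c(1, 3, 5, 10, 25)
exact <- log(1 + k/100)   # the true multiplier of beta-hat
approx <- k/100           # the small-change approximation
round(cbind(k, exact, approx, error = approx - exact), 4)
```

The error is negligible at \(1{-}3\%\) and grows quickly past \(10\%\).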
Example
data(mtcars)
(m <- lm(drat ~ log(disp), data = mtcars))

Call:
lm(formula = drat ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     7.2301      -0.6875
Therefore a \(25\%\) increase in disp is associated with a \(-0.688\log(1.25) \approx -0.153\) change in drat.
To test this, we take the difference between the predicted outcome at some value of disp and the predicted outcome at that value increased by \(25\%\) (note the order of subtraction flips the sign relative to \(-0.153\)).
predict(m, newdata = data.frame(disp = 5)) - predict(m, newdata = data.frame(disp = 5*1.25))

        1
0.1534182
predict(m, newdata = data.frame(disp = 11)) - predict(m, newdata = data.frame(disp = 11*1.25))

        1
0.1534182
Both Logged
A \(k\%\) change in a predictor is associated with a \(\left(1 + \frac{k}{100}\right)^{\hat{\beta}}\) multiplicative change in the outcome.
Examples:
- If \(\hat{\beta}\) is \(2\), a \(10\%\) increase in \(X\) is associated with a \(\left(1 + \frac{10}{100}\right)^2 = 1.1^2 \approx 1.21\) multiplicative change, i.e. a \(21\%\) increase in \(Y\).
- If \(\hat{\beta}\) is \(-1.5\), a \(20\%\) decrease in \(X\) is associated with a \(\left(1 + \frac{-20}{100}\right)^{-1.5} = .8^{-1.5} \approx 1.40\) multiplicative change, i.e. a \(40\%\) increase in \(Y\).
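As before, these can be verified in R (the helper name is just for illustration):

```r
both_logged_effect <- function(beta, k) {
  # Multiplicative change in Y for a k% change in X
  (1 + k/100)^beta
}
both_logged_effect(2, 10)       # 1.21
both_logged_effect(-1.5, -20)   # ~ 1.40
```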
Theory
Assume our regression equation has both a logged outcome and a logged predictor,
\[ E(\log(Y)|X = x) = \beta_0 + \beta_1\log(x). \]Combining the two previous arguments, for \(X = cx\),
\begin{align*} E(Y|X = cx) & \approx \textrm{exp}\left(\beta_0 + \beta_1\log(cx)\right) \\ & = \textrm{exp}\left(\beta_0 + \beta_1\log(x)\right)\textrm{exp}\left(\beta_1\log(c)\right) \\ & \approx E(Y|X = x)c^{\beta_1}. \end{align*}
Example
data(mtcars)
(m <- lm(log(drat) ~ log(disp), data = mtcars))

Call:
lm(formula = log(drat) ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     2.2763      -0.1905
Therefore a \(25\%\) increase in disp is associated with a \(1.25^{-.1905} \approx 0.958\) multiplicative change in drat, corresponding to a \(4.2\%\) decrease.
To test this, we take the difference between the predicted outcome at some value of disp and the predicted outcome at that value increased by \(25\%\). Note these predictions are on the log scale, so the difference should be \(-\hat{\beta}\log(1.25) \approx 0.0425\).
predict(m, newdata = data.frame(disp = 5)) - predict(m, newdata = data.frame(disp = 5*1.25))

         1
0.04251857
predict(m, newdata = data.frame(disp = 8)) - predict(m, newdata = data.frame(disp = 8*1.25))

         1
0.04251857
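Because the outcome is logged, we can also exponentiate the predictions and take the ratio directly; using the model m fit above, this recovers the multiplicative change:

```r
exp(predict(m, newdata = data.frame(disp = 5*1.25))) /
  exp(predict(m, newdata = data.frame(disp = 5)))
# ~ 0.958
```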