# Log Transform Interpretation

## Introduction

Log transformations can be useful when a variable is very right-skewed, or when multiplicative effects are more natural than additive ones. However, interpreting the resulting coefficients can be challenging.

Throughout, "log" refers to the natural log (ln), i.e. $\log_e$.

### Multiplicative vs Percent Change

Note that multiplicative changes can be expressed as percent changes and vice-versa. For example, multiplying a value by 1.3 is equivalent to increasing the value by 30%, or conversely, decreasing a value by 15% is equivalent to multiplying it by .85.
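As a quick sanity check of this equivalence, here is a minimal R sketch. The helper names `mult_to_pct` and `pct_to_mult` are made up for illustration and are not part of any package.

```r
# Hypothetical helpers for converting between the two descriptions of a change.
mult_to_pct <- function(m) 100 * (m - 1)   # multiplier -> percent change
pct_to_mult <- function(p) 1 + p / 100     # percent change -> multiplier

mult_to_pct(1.3)   # multiplying by 1.3 is a 30% increase
pct_to_mult(-15)   # a 15% decrease is multiplying by .85
```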

## Logged Outcome

A 1-unit change in a predictor is associated with an $\exp(\hat{\beta})$ multiplicative change in Y, or a $100\left(\exp(\hat{\beta})-1\right)$% change in Y.

Examples:

• If $\hat{\beta}$ is .2, a 1-unit increase in X is associated with an $\exp(.2)\approx 1.22$ multiplicative change in Y, or a 22% increase.
• If $\hat{\beta}$ is -.4, a 1-unit increase in X is associated with an $\exp(-.4)\approx .67$ multiplicative change in Y, or a 33% decrease.
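The arithmetic in these examples can be reproduced with a short R sketch. The variable names (`beta_hat`, `mult`, `pct`) are chosen here for illustration only.

```r
beta_hat <- c(0.2, -0.4)   # slopes from the two examples above
mult <- exp(beta_hat)      # multiplicative change in Y per 1-unit increase in X
pct  <- 100 * (mult - 1)   # the same change expressed as a percent

round(mult, 2)   # 1.22 0.67
round(pct)       # 22 -33
```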

### Theory

Assume our regression equation is

$E(Y \mid X = x) = \beta_0 + \beta_1 x.$

If we regress on the log of Y instead,

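$E(\log(Y) \mid X = x) = \beta_0 + \beta_1 x.$

To get back to the scale of Y, we rely on the approximation

$\log(E(X)) \approx E(\log(X)),$

where X here stands for a generic positive random variable rather than the predictor. Applied to Y given X = x, it gives $E(Y \mid X = x) \approx \exp(\beta_0 + \beta_1 x)$. Therefore we can write

$E(Y \mid X = x + 1) \approx \exp(\beta_0 + \beta_1 (x + 1)) = \exp(\beta_0 + \beta_1 x + \beta_1) = \exp(\beta_0 + \beta_1 x)\exp(\beta_1) \approx E(Y \mid X = x)\exp(\beta_1).$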

### Example

```
data(mtcars)
(mod1 <- lm(log(disp) ~ drat, data = mtcars))
```
```
Call:
lm(formula = log(disp) ~ drat, data = mtcars)

Coefficients:
(Intercept)         drat
     8.2782      -0.8323
```

Therefore a 1-unit increase in `drat` is associated with an $\mathrm{exp}\left(-.8323\right)\approx .435$ multiplicative change in `disp`, corresponding to a 56.5% decrease.

To test this, we compute the ratio of the predicted outcomes at some value of `drat` and at that value increased by 1. Note: We exponentiate the predicted values to get them on the outcome scale.

```
exp(predict(mod1, newdata = data.frame(drat = 5)))/exp(predict(mod1, newdata = data.frame(drat = 4)))
```
```
        1
0.4350567
```

Repeat with different values of `drat` to show that all that matters is the change in the predictor, not its starting value.

```
exp(predict(mod1, newdata = data.frame(drat = 3)))/exp(predict(mod1, newdata = data.frame(drat = 2)))
```
```
        1
0.4350567
```
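The same check can be run over several starting values at once. The following is a minimal sketch that refits the model so it is self-contained; `ratio_at` and `starts` are hypothetical names introduced only for this illustration.

```r
# Refit the logged-outcome model from the example above.
mod1 <- lm(log(disp) ~ drat, data = mtcars)

# Hypothetical helper: ratio of back-transformed predictions for a 1-unit increase.
ratio_at <- function(start) {
  preds <- predict(mod1, newdata = data.frame(drat = c(start, start + 1)))
  unname(exp(preds[2] - preds[1]))
}

starts <- c(2, 3, 4)              # arbitrary starting values of drat
ratios <- sapply(starts, ratio_at)

# Each ratio should equal exp() of the slope, whatever the starting value.
all.equal(ratios, rep(unname(exp(coef(mod1)["drat"])), length(starts)))
```

Working on the log scale first (subtracting, then exponentiating) is equivalent to dividing the exponentiated predictions, as done in the text.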

### Visualization

We can visualize this relationship with basic plotting commands. First, generate predicted values. We will do this by generating an artificial X variable (`drat`) spaced over its range, obtaining predicted values, and then exponentiating them.

```
new_drat <- seq(min(mtcars$drat),
                max(mtcars$drat),
                length.out = 100)
yhat1 <- exp(predict(mod1, newdata = data.frame(drat = new_drat)))
```

Take note of the call to `exp()` in the line defining `yhat1`.

Next, we plot the fitted curve and overlay the observed values.

```
plot(yhat1 ~ new_drat, type = "l")
with(mtcars, points(disp ~ drat))
```

## Logged Predictor

A k% change in a predictor is associated with a $\hat{\beta}\log\left(1+\frac{k}{100}\right)$ change in the outcome.

Examples:

• If $\hat{\beta}$ is 2, a 10% increase in X is associated with a $2\,\log\left(1+\frac{10}{100}\right)=2\,\log(1.1)\approx 0.19$ increase in Y.
• If $\hat{\beta}$ is -1.5, a 5% increase in X is associated with a $-1.5\,\log\left(1+\frac{5}{100}\right)=-1.5\,\log(1.05)\approx -0.07$ change in Y, i.e. a 0.07 decrease.
• If $\hat{\beta}$ is .75, a 15% decrease in X is associated with a $.75\,\log\left(1+\frac{-15}{100}\right)=.75\,\log(.85)\approx -0.12$ change in Y, i.e. a 0.12 decrease.
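To evaluate the formula $\hat{\beta}\log\left(1+\frac{k}{100}\right)$ for other slopes and percent changes, a small R sketch may help. The function name `change_for_pct` is hypothetical and used only for illustration.

```r
# Hypothetical helper: additive change in Y for a k% change in X.
change_for_pct <- function(beta_hat, k) beta_hat * log(1 + k / 100)

change_for_pct(2, 10)      # approx  0.19
change_for_pct(-1.5, 5)    # approx -0.07
change_for_pct(0.75, -15)  # approx -0.12
```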

### Theory

Assume our regression equation is

$E(Y \mid X = x) = \beta_0 + \beta_1 x.$

If we include $\log(x)$ instead, we have

$E(Y \mid X = x) = \beta_0 + \beta_1 \log(x).$

Consider when $X=cx$ where $c$ is some constant (e.g. 2 for a doubling of X or 1.3 for a 30% increase in X).

$E(Y \mid X = cx) = \beta_0 + \beta_1 \log(cx).$

Therefore if we look at the difference in expectation,

$E(Y \mid X = cx) - E(Y \mid X = x) = \beta_1 \left(\log(cx) - \log(x)\right) = \beta_1 \log(c).$

#### Approximation

If your percent change is small (a few percent), then you can approximate the change. This is because $\log(1+x)\approx x$ when $x$ is close to 0. So to approximate the effect of a 1% change in X, simply multiply $\hat{\beta}$ by that change expressed as a proportion: $0.01\hat{\beta}$. This works reliably well up to $\pm 3\%$, moderately well up to $\pm 5\%$, and gets much worse beyond that.
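To see how quickly the approximation degrades, the exact and approximate multipliers on $\hat{\beta}$ can be tabulated. This is only an illustrative sketch; the percent values and variable names are arbitrary choices.

```r
pct         <- c(1, 3, 5, 10, 25)       # percent changes in X
exact_mult  <- log(1 + pct / 100)       # exact multiplier on beta-hat
approx_mult <- pct / 100                # small-change approximation
rel_err     <- (approx_mult - exact_mult) / exact_mult  # relative error

# Relative error grows steadily with the size of the percent change.
round(data.frame(pct, exact_mult, approx_mult, rel_err), 4)
```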

### Example

```
data(mtcars)
(mod2 <- lm(drat ~ log(disp), data = mtcars))
```
```
Call:
lm(formula = drat ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     7.2301      -0.6875
```

Therefore a 25% increase in `disp` is associated with a $-0.688\,\log(1.25)\approx -0.153$ change in `drat`.

To test this, we compute the difference in predicted outcome between some value of `disp` and that value increased by 25%. Because the code subtracts the prediction at the larger value from the prediction at the original value, the result is positive; its magnitude matches the 0.153 decrease.

```
predict(mod2, newdata = data.frame(disp = 5)) - predict(mod2, newdata = data.frame(disp = 5*1.25))
```
```
        1
0.1534182
```
```
predict(mod2, newdata = data.frame(disp = 11)) - predict(mod2, newdata = data.frame(disp = 11*1.25))
```
```
        1
0.1534182
```

### Visualization

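We'll do a similar plot, except we'll let R handle the logging of the predictor. To stay on the same axes as the earlier plot, we fit the model with `disp` as the outcome and `drat` as the predictor (`mod2_plot`), and re-use `new_drat` created in the earlier example.

```
mod2_plot <- lm(disp ~ log(drat), data = mtcars)
yhat2 <- predict(mod2_plot, newdata = data.frame(drat = new_drat))
```
```
plot(yhat2 ~ new_drat, type = "l")
with(mtcars, points(disp ~ drat))
```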

## Both Logged

A k% change in a predictor is associated with a $\left(1+\frac{k}{100}\right)^{\hat{\beta}}$ multiplicative change in the outcome.

Examples:

• If $\hat{\beta}$ is 2, a 10% increase in X is associated with a $\left(1+\frac{10}{100}\right)^{2}=1.1^{2}=1.21$ multiplicative change in Y, or a 21% increase.
• If $\hat{\beta}$ is -1.5, a 20% decrease in X is associated with a $\left(1+\frac{-20}{100}\right)^{-1.5}=.8^{-1.5}\approx 1.40$ multiplicative change in Y, or a 40% increase.
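The formula above can be motivated by combining the two earlier derivations. As a sketch, assuming the same approximation used for the logged outcome (that the log of an expectation is close to the expectation of the log), start from

$E(\log(Y) \mid X = x) = \beta_0 + \beta_1 \log(x),$

so that $E(Y \mid X = x) \approx \exp(\beta_0 + \beta_1 \log(x))$. Multiplying $x$ by a constant $c$ gives

$E(Y \mid X = cx) \approx \exp(\beta_0 + \beta_1 \log(x) + \beta_1 \log(c)) = \exp(\beta_0 + \beta_1 \log(x))\, c^{\beta_1} \approx E(Y \mid X = x)\, c^{\beta_1}.$

Setting $c = 1 + \frac{k}{100}$ recovers the multiplicative change stated above.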

To-do.

### Example

```
data(mtcars)
(mod3 <- lm(log(drat) ~ log(disp), data = mtcars))
```
```
Call:
lm(formula = log(drat) ~ log(disp), data = mtcars)

Coefficients:
(Intercept)    log(disp)
     2.2763      -0.1905
```

Therefore a 25% increase in `disp` is associated with a $1.25^{-.1905}\approx 0.958$ multiplicative change in `drat`, corresponding to a 4.2% decrease.

To test this, we compute the difference in predicted outcome between some value of `disp` and that value increased by 25%. The predictions are on the log scale, so this difference equals $-\hat{\beta}\log(1.25)\approx 0.0425$, and $\exp(-0.0425)\approx 0.958$ recovers the multiplicative change.

```
predict(mod3, newdata = data.frame(disp = 5)) - predict(mod3, newdata = data.frame(disp = 5*1.25))
```
```
         1
0.04251857
```
```
predict(mod3, newdata = data.frame(disp = 8)) - predict(mod3, newdata = data.frame(disp = 8*1.25))
```
```
         1
0.04251857
```
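To see the multiplicative change directly on the scale of `drat`, the predictions can be exponentiated before taking their ratio. The sketch below refits the model shown in the printed call so that it is self-contained; `ratio_25pct` and the starting values are hypothetical choices for illustration.

```r
# Refit the model with both outcome and predictor logged.
mod3 <- lm(log(drat) ~ log(disp), data = mtcars)

# Hypothetical helper: ratio of back-transformed predictions for a 25% increase in disp.
ratio_25pct <- function(start) {
  preds <- exp(predict(mod3, newdata = data.frame(disp = c(start, start * 1.25))))
  unname(preds[2] / preds[1])
}

ratios3  <- sapply(c(5, 8, 100), ratio_25pct)      # same ratio at every starting value
expected <- unname(1.25 ^ coef(mod3)["log(disp)"]) # 1.25 raised to the slope

all.equal(ratios3, rep(expected, 3))
```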

### Visualization

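Again, a similar plot, letting R handle the log in the predictor while we manually exponentiate the outcome. To stay on the same axes as the earlier plots, we fit the model with `disp` as the outcome and `drat` as the predictor (`mod3_plot`), and again re-use `new_drat` created in the earlier example.

```
mod3_plot <- lm(log(disp) ~ log(drat), data = mtcars)
yhat3 <- exp(predict(mod3_plot, newdata = data.frame(drat = new_drat)))
```
```
plot(yhat3 ~ new_drat, type = "l")
with(mtcars, points(disp ~ drat))
```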

## Compare Visualizations

The three plots can look very similar, so let's plot them together to show that they are in fact three different curves.

```
with(mtcars, plot(disp ~ drat))
lines(yhat1 ~ new_drat, col = "red")
lines(yhat2 ~ new_drat, col = "blue")
lines(yhat3 ~ new_drat, col = "green")
```