3 Random slopes
So far all we’ve talked about are random intercepts, which are by far the most common form of mixed effects regression model. Recall that we set up the theory by allowing each group to have its own intercept, which we don’t estimate. We can also allow each group to have its own slope, which we likewise don’t estimate. Just as random intercepts are akin to including the grouping variable as a fixed effect, giving each group its own intercept, random slopes are akin to interacting a predictor with the grouping variable, allowing each group to have its own relationship between that predictor and the outcome.
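To make the distinction concrete, here is one standard way to write the two models (this notation — \(y_{ij}\) for observation \(i\) in group \(j\), \(u_j\) and \(v_j\) for the group-level deviations — is introduced here purely for illustration):

Random intercept only:

\[ y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \epsilon_{ij} \]

Random intercept and random slope:

\[ y_{ij} = (\beta_0 + u_j) + (\beta_1 + v_j)\, x_{ij} + \epsilon_{ij} \]

The individual \(u_j\) and \(v_j\) are not estimated; the model estimates only their variances. The fixed coefficients \(\beta_0\) and \(\beta_1\) are the averages around which the group-specific intercepts and slopes vary.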
We would include a random slope in the model if, rather than the relationship between a predictor and the outcome when controlling for group membership, we were interested in the average relationship between the predictor and the outcome across groups. For example, take our basic class example:
mixed gpa familyincome || class:
In this model, the coefficient on familyincome
would estimate the relationship between family income and GPA, removing any additional class-level differences. (One such class-level difference: a class that happens to have a lower average income might also have a better teacher.)
If we include a random slope, we add the variable after the :
in the second equation.
mixed gpa familyincome || class: familyincome
Now the model would allow each classroom to have its own relationship between family income and GPA (which, just like the random intercepts, is not actually estimated) and the coefficient on familyincome
would represent the average of those relationships.
Do you need a random slope? It depends on your theory. If your grouping variable is a nuisance and you’re simply controlling for it (as in most cases), you probably don’t need a random slope. If, on the other hand, you suspect there are substantial differences between groups and you’re really interested in the average of those group-specific relationships, then you should include one.
For reference, I’d say conservatively 95% of the mixed models I fit are in situations where a random slope is not needed, and 75% of the time when people ask me if they need a random slope, the answer is no. However, my general bias is towards simpler models.1
3.1 Fitting a random slope
Let’s add a random slope for gender.
. mixed qol age agebelow52 ageabove82 i.socialclass female || household: female
Performing EM optimization ...

Performing gradient-based optimization: 

Iteration 0: Log likelihood = -18119.073  (not concave)
Iteration 1: Log likelihood = -17948.342
Iteration 2: Log likelihood =  -17943.72
Iteration 3: Log likelihood = -17943.715
Iteration 4: Log likelihood = -17943.715

Computing standard errors ...

Mixed-effects ML regression                     Number of obs     =      5,179
Group variable: household                       Number of groups  =      3,995
                                                Obs per group:
                                                              min =          1
                                                              avg =        1.3
                                                              max =          3
                                                Wald chi2(9)      =     129.85
Log likelihood = -17943.715                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
qol | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
age | -.0091432 .0139003 -0.66 0.511 -.0363872 .0181008
agebelow52 | -.3693919 .5323218 -0.69 0.488 -1.412723 .6739396
ageabove82 | -1.205023 .9809622 -1.23 0.219 -3.127673 .7176278
|
socialclass |
Manageria.. | .2311012 .5075454 0.46 0.649 -.7636694 1.225872
Non-Manual | -1.107475 .5090861 -2.18 0.030 -2.105265 -.1096846
     Skilled |  -1.807631   .5325279    -3.39   0.001    -2.851366    -.763895
 Semi-skil~d |  -2.980058   .5455297    -5.46   0.000    -4.049276   -1.910839
   Unskilled |  -3.576474   .7302271    -4.90   0.000    -5.007693   -2.145255
             |
      female |   .5655497   .2081084     2.72   0.007     .1576647    .9734348
       _cons |   44.72468   1.018609    43.91   0.000     42.72825    46.72112
------------------------------------------------------------------------------
------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
household: Independent       |
                 var(female) |   4.050416   2.356897      1.294774    12.67084
                  var(_cons) |    22.7137   1.789205      19.46421    26.50567
-----------------------------+------------------------------------------------
               var(Residual) |   36.96956   1.960264      33.32041    41.01835
------------------------------------------------------------------------------
LR test vs. linear model: chi2(2) = 149.11                Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.
Most of the output seems very familiar. The only addition is the “var(female)” in the Random-effects parameters table which, just like with random intercepts, estimates the variance across all the random slopes. Here it is clearly non-zero, so allowing the slopes to vary improves model fit. However, none of the fixed effects really change. The only difference is in the interpretation of the coefficient on female:
In the random intercepts model, the coefficient on female represented how much higher, on average, females scored than males, regardless of age, social class, or inter-household variance.
In this model with the random slope as well, the coefficient on female represents the average, across all households, of each household’s female–male difference, regardless of age or social class.
3.2 Do you need to include the fixed slope if you have the random slope?
Yes.
In almost every model.
This is very similar to excluding the intercept (\(\beta_0\)) from a model, which forces the regression line to pass through (0,0). In some situations that might be appropriate, but they are extremely rare.
Excluding the fixed slope when including random slopes forces the average of all random slopes to be 0. If the true random slope is far from zero, this will have catastrophic effects, including reversing the signs on a good number of the random slopes.
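For concreteness, the problematic specification would look like this — female appears in the random part but has been dropped from the fixed part (shown only as a cautionary example, not something to run in practice):

. mixed qol age agebelow52 ageabove82 i.socialclass || household: female

Because there is no fixed coefficient on female, the household-specific female effects are forced to center on zero rather than on their true average.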
If you’re curious, my rationale is that I’d rather fit a simpler model that misses a nuanced complexity than fit a more complicated model that takes a substantial power hit and is potentially drastically further from the “truth”.↩︎