After running a regression model, the most common way of interpreting the relationships between the predictors and the outcome is to interpret the regression coefficients. In many situations these interpretations are straightforward; in other settings, however, interpretation can be tricky, or the coefficients may not be directly interpretable at all.
These complications arise most commonly when a model involves an interaction or moderation. In these sorts of models, interpreting the coefficients requires special care due to the relationships among the interacted predictors, and the information obtainable from the coefficients alone is often limited without resorting to significant algebra.
Marginal effects offer an improvement over simple coefficient interpretation. Marginal means allow you to estimate the average response under a set of conditions; e.g. the average response for each racial group, or the average response when age and weight are at certain values.
Marginal slopes estimate the slope between a predictor and the outcome when other variables are at some specific value; e.g. the average slope on age within each gender, or the average slope on age when weight is at certain values.
In either case, in addition to estimating these marginal effects, we can easily test hypotheses of equality between them via pairwise comparisons. For example, is the average response different by ethnicity, or is the slope on age different between genders?
If you are familiar with interpreting regression coefficients, and specifically interactions, you may have some idea of how to start addressing all the above examples. However, completely answering these research questions would require more than simply a table of regression coefficients. With marginal effects, one additional set of results can address all these questions.
Finally, marginal effect estimation is a step towards creating informative visualizations of relationships such as interaction plots.
The webuse command can load the data directly.
. webuse margex
(Artificial data for margins)
The marginaleffects package in R doesn’t like a variable being named “group”, so we rename it.
. rename group class
. summarize, sep(0)
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
y | 3,000 69.73357 21.53986 0 146.3
outcome | 3,000 .1696667 .3754023 0 1
sex | 3,000 .5006667 .5000829 0 1
class | 3,000 1.828 .7732714 1 3
age | 3,000 39.799 11.54174 20 60
distance | 3,000 58.58566 181.3987 .25 748.8927
ycn | 3,000 74.47263 22.01264 0 153.8
yc | 3,000 .2413333 .4279633 0 1
treatment | 3,000 .5006667 .5000829 0 1
agegroup | 3,000 2.263667 .8316913 1 3
arm | 3,000 1.970667 .7899457 1 3
The haven package is used to read the data from Stata into R.
library(haven)
m <- read_dta("http://www.stata-press.com/data/r18/margex.dta")
m <- data.frame(m)
m$sex <- as_factor(m$sex)
names(m)[names(m) == "group"] <- "class"
m$class <- as_factor(m$class)
head(m)
y outcome sex class age distance ycn yc treatment agegroup arm
1 102.6 1 Female 1 55 11.75 97.3 1 1 3 1
2 77.2 0 Female 1 60 30.75 78.6 0 1 3 1
3 80.5 0 Female 1 27 14.25 54.5 0 1 1 1
4 82.5 1 Female 1 60 29.25 98.4 1 1 3 1
5 71.6 1 Female 1 52 19.00 105.3 1 1 3 1
6 83.2 1 Female 1 49 2.50 89.7 0 1 3 1
The marginaleffects package doesn’t like a variable being named “group”, so we rename it.
There are several packages which can be used to estimate marginal means. So far, this document implements emmeans and ggeffects, and I’ve begun writing up marginaleffects.
emmeans is based on an older package, lsmeans, and is more flexible and powerful. However, its syntax is slightly strange, and there are often several ways to obtain similar results, which may or may not agree. Additionally, while its immediate output is clear, accessing the raw data underlying that output takes some doing.
ggeffects is a package designed to tie into the dplyr/ggplot universe. Its results are always a clean data frame which can be manipulated or passed straight into ggplot. However, ggeffects can only estimate marginal means, not marginal slopes or pairwise comparisons. (If I’m wrong about that last statement, please let me know.)
marginaleffects is a newer package which offers a much more robust set of estimates. It focuses on observation-level predictions which can then be averaged, rather than estimating marginal effects directly. There are often multiple ways to obtain the same result, which can be very frustrating. It is still at an early version (0.18.0 at the time of last updating this document), so it may see breaking changes.
The marginaleffects website has a good write-up comparing these packages (as well as Stata). It is, naturally, written with a pro-marginaleffects bias, so keep that in mind while reading it.
In addition, the interactions package is used to plot interaction effects.
The ggeffects package requires several other packages to work fully. However, it does not declare them as dependencies and install them for you; rather, it requires you to ensure those packages are installed yourself.
If you do not have those packages loaded, ggeffects will load them on demand (rather than at the time ggeffects itself is loaded). If a required package is not installed, ggeffects will instead produce a warning, so keep an eye out for those and install packages as needed.
This document was last compiled using Stata/SE 18.0 and R 4.3.3.
R package versions at last compilation:
| Package | Version |
|---|---|
| emmeans | 1.10.0 |
| ggeffects | 1.5.1 |
| effects | 4.2.2 |
| datawizard | 0.10.0 |
| marginaleffects | 0.18.0 |
| ggplot2 | 3.5.0 |
| interactions | 1.1.5 |
. regress y i.class
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 15.48
Model | 14229.0349 2 7114.51743 Prob > F = 0.0000
Residual | 1377203.97 2,997 459.527518 R-squared = 0.0102
-------------+---------------------------------- Adj R-squared = 0.0096
Total | 1391433.01 2,999 463.965657 Root MSE = 21.437
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | .4084991 .8912269 0.46 0.647 -1.338979 2.155977
3 | 5.373138 1.027651 5.23 0.000 3.358165 7.38811
|
_cons | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
------------------------------------------------------------------------------
. margins class
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
1 | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
2 | 68.76655 .6411134 107.26 0.000 67.50948 70.02361
3 | 73.73119 .8202484 89.89 0.000 72.12288 75.33949
------------------------------------------------------------------------------
Call:
lm(formula = y ~ class, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
class2 0.4085 0.8912 0.458 0.647
class3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
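The model-fitting and emmeans calls are not shown in this rendering; a minimal sketch that reproduces the summary above and the marginal means below (the name mod1 matches later references in this document):

library(emmeans)

mod1 <- lm(y ~ class, data = m)
summary(mod1)           # the model summary shown above
emmeans(mod1, "class")  # estimated marginal means by class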
class emmean SE df lower.CL upper.CL
1 68.4 0.619 2997 67.1 69.6
2 68.8 0.641 2997 67.5 70.0
3 73.7 0.820 2997 72.1 75.3
Confidence level used: 0.95
Note that by default, emmeans returns confidence intervals. If you prefer p-values, you can wrap the emmeans call in test:
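A sketch of that wrapped call:

test(emmeans(mod1, "class"))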
class emmean SE df t.ratio p.value
1 68.4 0.619 2997 110.419 <.0001
2 68.8 0.641 2997 107.261 <.0001
3 73.7 0.820 2997 89.889 <.0001
Call:
lm(formula = y ~ class, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
class2 0.4085 0.8912 0.458 0.647
class3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
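The ggeffects call is likewise not shown in this rendering; per the note below, the character form of the terms argument was used. A sketch:

library(ggeffects)

ggeffect(mod1, "class")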
# Predicted values of y
class | Predicted | 95% CI
--------------------------------
1 | 68.36 | 67.14, 69.57
2 | 68.77 | 67.51, 70.02
3 | 73.73 | 72.12, 75.34
Note that we could have run that as ggeffect(mod1, ~ class), with a formula. However, when we later want to specify certain values of the predictors, we need to use the character version, so I stick with it here.
Call:
lm(formula = y ~ class, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
class2 0.4085 0.8912 0.458 0.647
class3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
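The predictions() call for the output below is not shown in this rendering; a minimal sketch, using the class = unique form discussed just after:

library(marginaleffects)

predictions(mod1, newdata = datagrid(class = unique))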
class Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
1 68.4 0.619 110.4 <0.001 Inf 67.1 69.6
2 68.8 0.641 107.3 <0.001 Inf 67.5 70.0
3 73.7 0.820 89.9 <0.001 Inf 72.1 75.3
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class
Type: response
The datagrid() function can also take values explicitly.
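For example, a sketch of such a call (the original is not shown in this rendering):

predictions(mod1, newdata = datagrid(class = c(1, 2, 3)))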
class Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
1 68.4 0.619 110.4 <0.001 Inf 67.1 69.6
2 68.8 0.641 107.3 <0.001 Inf 67.5 70.0
3 73.7 0.820 89.9 <0.001 Inf 72.1 75.3
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class
Type: response
Passing the values explicitly like this is an alternative to class = unique for specifying the levels. (NOTE: These are the “levels”, not the values - it just happens that class has the same. See below for what happens when they differ.) We can specify a subset of levels as well.
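A sketch restricting to the first two classes:

predictions(mod1, newdata = datagrid(class = c(1, 2)))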
class Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
1 68.4 0.619 110 <0.001 Inf 67.1 69.6
2 68.8 0.641 107 <0.001 Inf 67.5 70.0
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class
Type: response
If your factor has labels which are not just the values, you need to refer to those.
Factor w/ 2 levels "Male","Female": 2 2 2 2 2 2 2 2 2 2 ...
- attr(*, "label")= chr "Sex"
Error: The "sex" element of the `at` list corresponds to a factor variable. The values entered in the `at` list must be one of the factor levels: "Male", "Female".
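A sketch of the failing and corrected calls (the model here is an assumption, not shown in this rendering):

mod_sex <- lm(y ~ sex, data = m)
predictions(mod_sex, newdata = datagrid(sex = c(0, 1)))             # presumably triggers the error above
predictions(mod_sex, newdata = datagrid(sex = c("Male", "Female"))) # refer to the factor levels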
sex Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
Male 64.6 0.54 119 <0.001 Inf 63.5 65.6
Female 74.9 0.54 139 <0.001 Inf 73.8 75.9
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex
Type: response
Assume a linear regression set up with a single categorical variable, \(G\), with three groups. We fit
\[ E(Y|G) = \beta_0 + \beta_1g_2 + \beta_2g_3, \]
where \(g_2\) and \(g_3\) are dummy variables representing membership in groups 2 and 3 respectively (\(g_{2i} = 1\) if observation \(i\) has \(G = 2\), and equal to 0 otherwise.) Since \(g_1\) is not found in the model, it is the reference category.
Therefore, \(\beta_0\) represents the average response among the reference category \(G = 1\). \(\beta_1\) represents the difference in the average response between groups \(G = 1\) and \(G = 2\). Therefore, \(\beta_0 + \beta_1\) is the average response in group \(G = 2\). A similar argument can be made about \(\beta_2\) and group 3.
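As a quick check, the marginal means can be recovered by hand from the coefficients of the class-only model above (assumed stored as mod1):

b <- coef(mod1)
b[["(Intercept)"]]                  # class 1 mean: 68.36
b[["(Intercept)"]] + b[["class2"]]  # class 2 mean: 68.77
b[["(Intercept)"]] + b[["class3"]]  # class 3 mean: 73.73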
. regress y i.sex##i.class
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 92.91
Model | 186898.199 5 37379.6398 Prob > F = 0.0000
Residual | 1204534.81 2,994 402.316235 R-squared = 0.1343
-------------+---------------------------------- Adj R-squared = 0.1329
Total | 1391433.01 2,999 463.965657 Root MSE = 20.058
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex |
Female | 21.62507 1.509999 14.32 0.000 18.66433 24.58581
|
class |
2 | 11.37444 1.573314 7.23 0.000 8.289552 14.45932
3 | 21.6188 1.588487 13.61 0.000 18.50416 24.73344
|
sex#class |
Female#2 | -4.851582 1.942744 -2.50 0.013 -8.66083 -1.042333
Female#3 | -6.084875 3.004638 -2.03 0.043 -11.97624 -.1935115
|
_cons | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
------------------------------------------------------------------------------
. margins sex#class
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex#class |
Male#1 | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
Male#2 | 61.98514 .7772248 79.75 0.000 60.46119 63.50908
Male#3 | 72.2295 .8074975 89.45 0.000 70.64619 73.8128
Female#1 | 72.23577 .63942 112.97 0.000 70.98203 73.48952
Female#2 | 78.75863 .9434406 83.48 0.000 76.90877 80.60849
Female#3 | 87.7697 2.468947 35.55 0.000 82.92869 92.6107
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * class, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
sexFemale 21.625 1.510 14.321 < 2e-16 ***
class2 11.374 1.573 7.230 6.12e-13 ***
class3 21.619 1.588 13.610 < 2e-16 ***
sexFemale:class2 -4.852 1.943 -2.497 0.0126 *
sexFemale:class3 -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
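A sketch of the emmeans calls for this interaction model (again assuming the fitted object is named mod1):

mod1 <- lm(y ~ sex * class, data = m)
emmeans(mod1, c("class", "sex"))        # marginal means with CIs
test(emmeans(mod1, c("class", "sex")))  # the same, with p-values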
class sex emmean SE df lower.CL upper.CL
1 Male 50.6 1.368 2994 47.9 53.3
2 Male 62.0 0.777 2994 60.5 63.5
3 Male 72.2 0.807 2994 70.6 73.8
1 Female 72.2 0.639 2994 71.0 73.5
2 Female 78.8 0.943 2994 76.9 80.6
3 Female 87.8 2.469 2994 82.9 92.6
Confidence level used: 0.95
class sex emmean SE df t.ratio p.value
1 Male 50.6 1.368 2994 36.998 <.0001
2 Male 62.0 0.777 2994 79.752 <.0001
3 Male 72.2 0.807 2994 89.449 <.0001
1 Female 72.2 0.639 2994 112.971 <.0001
2 Female 78.8 0.943 2994 83.480 <.0001
3 Female 87.8 2.469 2994 35.549 <.0001
Call:
lm(formula = y ~ sex * class, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
sexFemale 21.625 1.510 14.321 < 2e-16 ***
class2 11.374 1.573 7.230 6.12e-13 ***
class3 21.619 1.588 13.610 < 2e-16 ***
sexFemale:class2 -4.852 1.943 -2.497 0.0126 *
sexFemale:class3 -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
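The corresponding ggeffects sketch (the call is not shown in this rendering):

ggeffect(mod1, c("class", "sex"))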
# Predicted values of y
sex: Male
class | Predicted | 95% CI
--------------------------------
1 | 50.61 | 47.93, 53.29
2 | 61.99 | 60.46, 63.51
3 | 72.23 | 70.65, 73.81
sex: Female
class | Predicted | 95% CI
--------------------------------
1 | 72.24 | 70.98, 73.49
2 | 78.76 | 76.91, 80.61
3 | 87.77 | 82.93, 92.61
Call:
lm(formula = y ~ sex * class, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
sexFemale 21.625 1.510 14.321 < 2e-16 ***
class2 11.374 1.573 7.230 6.12e-13 ***
class3 21.619 1.588 13.610 < 2e-16 ***
sexFemale:class2 -4.852 1.943 -2.497 0.0126 *
sexFemale:class3 -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
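The corresponding marginaleffects sketch:

predictions(mod1, newdata = datagrid(sex = unique, class = unique))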
sex class Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
Male 1 50.6 1.368 37.0 <0.001 993.0 47.9 53.3
Male 2 62.0 0.777 79.8 <0.001 Inf 60.5 63.5
Male 3 72.2 0.807 89.4 <0.001 Inf 70.6 73.8
Female 1 72.2 0.639 113.0 <0.001 Inf 71.0 73.5
Female 2 78.8 0.943 83.5 <0.001 Inf 76.9 80.6
Female 3 87.8 2.469 35.5 <0.001 917.1 82.9 92.6
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex, class
Type: response
Again, we can specify specific levels in the datagrid:
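A sketch of such a call:

predictions(mod1, newdata = datagrid(sex = "Male", class = c(1, 3)))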
sex class Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
Male 1 50.6 1.368 37.0 <0.001 993.0 47.9 53.3
Male 3 72.2 0.807 89.4 <0.001 Inf 70.6 73.8
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex, class
Type: response
Consider the simplest case, with two binary variables \(Z\) and \(K\). We fit the model
\[ E(Y|Z,K) = \beta_0 + \beta_1Z + \beta_2K + \beta_3ZK, \]
where \(ZK = 1\) only if both \(Z = 1\) and \(K = 1\).
This is functionally equivalent to defining a new variable \(L\),
\[ L = \begin{cases} 0, & K = 0\ \&\ Z = 0 \\ 1, & K = 0\ \&\ Z = 1 \\ 2, & K = 1\ \&\ Z = 0 \\ 3, & K = 1\ \&\ Z = 1, \end{cases} \]
and fitting the model
\[ E(Y|L) = \gamma_0 + \gamma_1l_1 + \gamma_2l_2 + \gamma_3l_3, \]
where the \(\gamma\)'s are combinations of the \(\beta\)'s from the first model (here \(\gamma_1 = \beta_1\), \(\gamma_2 = \beta_2\), and \(\gamma_3 = \beta_1 + \beta_2 + \beta_3\)).
. regress y i.sex c.age c.distance
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 143.27
Model | 174576.269 3 58192.0896 Prob > F = 0.0000
Residual | 1216856.74 2,996 406.16046 R-squared = 0.1255
-------------+---------------------------------- Adj R-squared = 0.1246
Total | 1391433.01 2,999 463.965657 Root MSE = 20.153
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex |
Female | 13.98753 .7768541 18.01 0.000 12.46431 15.51075
age | -.5038264 .0336538 -14.97 0.000 -.5698133 -.4378394
distance | -.0060725 .0020291 -2.99 0.003 -.0100511 -.0020939
_cons | 83.13802 1.327228 62.64 0.000 80.53565 85.74039
------------------------------------------------------------------------------
. margins, at(age = (30 40))
Predictive margins Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 74.67056 .4941029 151.12 0.000 73.70175 75.63938
2 | 69.6323 .3680117 189.21 0.000 68.91072 70.35388
------------------------------------------------------------------------------
. margins, at(age = (30 40) distance = (10 100))
Predictive margins Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 30
distance = 10
2._at: age = 30
distance = 100
3._at: age = 40
distance = 10
4._at: age = 40
distance = 100
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 74.9656 .5034959 148.89 0.000 73.97837 75.95283
2 | 74.41907 .5014946 148.39 0.000 73.43576 75.40238
3 | 69.92733 .3809974 183.54 0.000 69.18029 70.67438
4 | 69.38081 .3774763 183.80 0.000 68.64067 70.12095
------------------------------------------------------------------------------
. margins sex, at(age = (30 40))
Predictive margins Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at#sex |
1#Male | 67.66747 .5597763 120.88 0.000 66.56989 68.76506
1#Female | 81.655 .6902601 118.30 0.000 80.30157 83.00843
2#Male | 62.62921 .5370234 116.62 0.000 61.57624 63.68218
2#Female | 76.61674 .5331296 143.71 0.000 75.5714 77.66208
------------------------------------------------------------------------------
Note that any variable specified in the at option must also appear in the formula. Any variables you place in the formula but not in at will be examined at each unique level (so don’t put a continuous variable there!).
Call:
lm(formula = y ~ sex + age + distance, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.356 -13.792 -0.275 13.624 74.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 83.138023 1.327228 62.640 < 2e-16 ***
sexFemale 13.987531 0.776854 18.005 < 2e-16 ***
age -0.503826 0.033654 -14.971 < 2e-16 ***
distance -0.006073 0.002029 -2.993 0.00279 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.15 on 2996 degrees of freedom
Multiple R-squared: 0.1255, Adjusted R-squared: 0.1246
F-statistic: 143.3 on 3 and 2996 DF, p-value: < 2.2e-16
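Sketches of the emmeans calls producing the next several outputs (model assumed stored as mod1):

mod1 <- lm(y ~ sex + age + distance, data = m)
emmeans(mod1, "age", at = list(age = c(30, 40)))  # CIs; wrap in test() for p-values
emmeans(mod1, c("age", "distance"),
        at = list(age = c(30, 40), distance = c(10, 100)))
emmeans(mod1, c("age", "sex"), at = list(age = c(30, 40)))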
age emmean SE df lower.CL upper.CL
30 74.7 0.494 2996 73.7 75.6
40 69.6 0.368 2996 68.9 70.3
Results are averaged over the levels of: sex
Confidence level used: 0.95
age emmean SE df t.ratio p.value
30 74.7 0.494 2996 151.138 <.0001
40 69.6 0.368 2996 189.185 <.0001
Results are averaged over the levels of: sex
age distance emmean SE df lower.CL upper.CL
30 10 75.0 0.503 2996 74.0 75.9
40 10 69.9 0.381 2996 69.2 70.7
30 100 74.4 0.501 2996 73.4 75.4
40 100 69.4 0.377 2996 68.6 70.1
Results are averaged over the levels of: sex
Confidence level used: 0.95
age sex emmean SE df lower.CL upper.CL
30 Male 67.7 0.560 2996 66.6 68.8
40 Male 62.6 0.537 2996 61.6 63.7
30 Female 81.7 0.690 2996 80.3 83.0
40 Female 76.6 0.533 2996 75.6 77.7
Confidence level used: 0.95
Note the particular syntax for specifying particular values - a quoted string containing the name of the variable, followed by a list in square brackets of comma-separated values (a single value would be just [4]). Whitespace is ignored.
Call:
lm(formula = y ~ sex + age + distance, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.356 -13.792 -0.275 13.624 74.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 83.138023 1.327228 62.640 < 2e-16 ***
sexFemale 13.987531 0.776854 18.005 < 2e-16 ***
age -0.503826 0.033654 -14.971 < 2e-16 ***
distance -0.006073 0.002029 -2.993 0.00279 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.15 on 2996 degrees of freedom
Multiple R-squared: 0.1255, Adjusted R-squared: 0.1246
F-statistic: 143.3 on 3 and 2996 DF, p-value: < 2.2e-16
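Sketches of the ggeffects calls, using that bracketed-value syntax:

ggeffect(mod1, "age [30,40]")
ggeffect(mod1, c("age [30,40]", "distance [10,100]"))
ggeffect(mod1, c("sex", "age [30,40]"))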
# Predicted values of y
age | Predicted | 95% CI
------------------------------
30 | 74.67 | 73.70, 75.64
40 | 69.63 | 68.91, 70.35
# Predicted values of y
distance: 10
age | Predicted | 95% CI
------------------------------
30 | 74.97 | 73.98, 75.95
40 | 69.93 | 69.18, 70.67
distance: 100
age | Predicted | 95% CI
------------------------------
30 | 74.42 | 73.44, 75.40
40 | 69.38 | 68.64, 70.12
# Predicted values of y
age: 30
sex | Predicted | 95% CI
---------------------------------
Male | 67.67 | 66.57, 68.77
Female | 81.66 | 80.30, 83.01
age: 40
sex | Predicted | 95% CI
---------------------------------
Male | 62.63 | 61.58, 63.68
Female | 76.62 | 75.57, 77.66
Call:
lm(formula = y ~ sex + age + distance, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.356 -13.792 -0.275 13.624 74.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 83.138023 1.327228 62.640 < 2e-16 ***
sexFemale 13.987531 0.776854 18.005 < 2e-16 ***
age -0.503826 0.033654 -14.971 < 2e-16 ***
distance -0.006073 0.002029 -2.993 0.00279 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.15 on 2996 degrees of freedom
Multiple R-squared: 0.1255, Adjusted R-squared: 0.1246
F-statistic: 143.3 on 3 and 2996 DF, p-value: < 2.2e-16
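The call producing the output below is not shown in this rendering; per the note that follows it, it is along these lines:

predictions(mod1, by = "age", newdata = datagridcf(age = c(30, 40)))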
age Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
30 74.7 0.494 151 <0.001 Inf 73.7 75.6
40 69.6 0.368 189 <0.001 Inf 68.9 70.4
Columns: age, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Type: response
Note the inclusion of age twice, once in the by= argument and once inside the newdata= argument. Excluding either of those will produce unexpected results. Also, the datagridcf() function produces a counterfactual set of data similar to how Stata works. Using datagrid() here would not work.
predictions(mod1, by = c("age", "distance"),
newdata = datagridcf(age = c(30, 40),
distance = c(10, 100)))
age distance Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
30 10 75.0 0.504 149 <0.001 Inf 74.0 76.0
30 100 74.4 0.501 148 <0.001 Inf 73.4 75.4
40 10 69.9 0.381 184 <0.001 Inf 69.2 70.7
40 100 69.4 0.377 184 <0.001 Inf 68.6 70.1
Columns: age, distance, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Type: response
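The final output in this set instead holds distance at its mean rather than using counterfactual data; a sketch of such a call:

predictions(mod1, newdata = datagrid(age = c(30, 40), sex = unique))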
age sex Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 % distance
30 Male 67.7 0.560 121 <0.001 Inf 66.6 68.8 58.6
30 Female 81.7 0.690 118 <0.001 Inf 80.3 83.0 58.6
40 Male 62.6 0.537 117 <0.001 Inf 61.6 63.7 58.6
40 Female 76.6 0.533 144 <0.001 Inf 75.6 77.7 58.6
Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, distance, age, sex
Type: response
Consider a model with a single continuous predictor, plus a control variable (which may be continuous or categorical).
\[ E(Y|X, Z) = \beta_0 + \beta_1X + \beta_2Z. \]
For any given value of \(X\), it is easy to compute the marginal mean for a given individual. For example,
\[ E(Y|X = 2, Z) = \beta_0 + 2\beta_1 + \beta_2Z. \]
Therefore we can easily compute \(E(y_i|x = 2, z = z_i)\) for each individual (the predicted outcome if each individual had a value of \(X = 2\), with their other values at the observed value) and average them.
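A brute-force sketch of this averaging, using the y ~ sex + age + distance model from above (assumed stored as mod1):

m30 <- m
m30$age <- 30                       # give every observation age 30
mean(predict(mod1, newdata = m30))  # ~74.67, matching the margins above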
. regress y i.sex##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 139.91
Model | 170983.675 3 56994.5583 Prob > F = 0.0000
Residual | 1220449.33 2,996 407.35959 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1220
Total | 1391433.01 2,999 463.965657 Root MSE = 20.183
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex |
Female | 14.92308 2.789012 5.35 0.000 9.454508 20.39165
age | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
|
sex#c.age |
Female | -.0224116 .0674167 -0.33 0.740 -.1545994 .1097762
|
_cons | 82.36936 1.812958 45.43 0.000 78.8146 85.92413
------------------------------------------------------------------------------
. margins, at(age = (30 40))
Predictive margins Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 74.71541 .5089298 146.81 0.000 73.71752 75.71329
2 | 69.67359 .3890273 179.10 0.000 68.9108 70.43638
------------------------------------------------------------------------------
. margins sex, at(age = (30 40))
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at#sex |
1#Male | 67.58054 .5984013 112.94 0.000 66.40722 68.75386
1#Female | 81.83127 .8228617 99.45 0.000 80.21784 83.4447
2#Male | 62.65093 .5541361 113.06 0.000 61.56441 63.73746
2#Female | 76.67755 .5461908 140.39 0.000 75.6066 77.74849
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexFemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexFemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16
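Sketches of the emmeans calls for this interaction model (assumed stored as mod1):

mod1 <- lm(y ~ sex * age, data = m)
emmeans(mod1, "age", at = list(age = c(30, 40)))        # averaged over sex
test(emmeans(mod1, "age", at = list(age = c(30, 40))))  # with p-values
test(emmeans(mod1, c("age", "sex"), at = list(age = c(30, 40))))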
age emmean SE df lower.CL upper.CL
30 74.7 0.509 2996 73.7 75.7
40 69.7 0.389 2996 68.9 70.4
Results are averaged over the levels of: sex
Confidence level used: 0.95
age emmean SE df t.ratio p.value
30 74.7 0.509 2996 146.851 <.0001
40 69.7 0.389 2996 179.070 <.0001
Results are averaged over the levels of: sex
age sex emmean SE df t.ratio p.value
30 Male 67.6 0.598 2996 112.935 <.0001
40 Male 62.7 0.554 2996 113.061 <.0001
30 Female 81.8 0.823 2996 99.447 <.0001
40 Female 76.7 0.546 2996 140.386 <.0001
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexFemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexFemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16
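The corresponding ggeffects sketches:

ggeffect(mod1, "age [30,40]")
ggeffect(mod1, c("sex", "age [30,40]"))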
# Predicted values of y
age | Predicted | 95% CI
------------------------------
30 | 74.72 | 73.72, 75.71
40 | 69.67 | 68.91, 70.44
# Predicted values of y
age: 30
sex | Predicted | 95% CI
---------------------------------
Male | 67.58 | 66.41, 68.75
Female | 81.83 | 80.22, 83.44
age: 40
sex | Predicted | 95% CI
---------------------------------
Male | 62.65 | 61.56, 63.74
Female | 76.68 | 75.61, 77.75
. regress y i.sex c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 209.88
Model | 170938.657 2 85469.3283 Prob > F = 0.0000
Residual | 1220494.35 2,997 407.238688 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1223
Total | 1391433.01 2,999 463.965657 Root MSE = 20.18
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex |
Female | 14.03271 .7777377 18.04 0.000 12.50775 15.55766
age | -.5043666 .033698 -14.97 0.000 -.5704402 -.4382931
_cons | 82.78115 1.323613 62.54 0.000 80.18586 85.37643
------------------------------------------------------------------------------
. margins, dydx(age)
Average marginal effects Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
dy/dx wrt: age
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age | -.5043666 .033698 -14.97 0.000 -.5704402 -.4382931
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex + age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.99 -13.88 -0.15 13.76 75.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.7811 1.3236 62.54 <2e-16 ***
sexFemale 14.0327 0.7777 18.04 <2e-16 ***
age -0.5044 0.0337 -14.97 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2997 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.1223
F-statistic: 209.9 on 2 and 2997 DF, p-value: < 2.2e-16
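The marginal slope below comes from emtrends(); a sketch (model assumed stored as mod1):

mod1 <- lm(y ~ sex + age, data = m)
emtrends(mod1, ~1, var = "age")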
1 age.trend SE df lower.CL upper.CL
overall -0.504 0.0337 2997 -0.57 -0.438
Results are averaged over the levels of: sex
Confidence level used: 0.95
To attach a p-value, wrap the call in the test function:
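That is, a sketch:

test(emtrends(mod1, ~1, var = "age"))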
1 age.trend SE df t.ratio p.value
overall -0.504 0.0337 2997 -14.967 <.0001
Results are averaged over the levels of: sex
The Stata margins command to obtain marginal means includes the “dydx” option, whose name points to a derivative - and indeed, this is exactly what is computed. If we have a basic regression model,
\[ E(Y|X) = \beta_0 + \beta_1X \]
taking the derivative with respect to \(X\) will obtain \(\beta_1\), which is the estimated coefficient.
If \(X\) enters the model in a more complex way, say a polynomial term:
\[ E(Y|X) = \beta_0 + \beta_1X + \beta_2X^2 \]
now the derivative is \(\beta_1 + 2\beta_2X\). Similar to marginal means, where the predicted outcome was estimated for each individual then those outcomes averaged, here the derivative is estimated plugging in each observation’s value of \(X\), then averaged.
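A sketch of that plug-in-and-average computation, using a hypothetical quadratic model (no such model is fit elsewhere in this document):

mq <- lm(y ~ age + I(age^2), data = m)
b  <- coef(mq)
# the derivative beta1 + 2*beta2*x at each observation's age, then averaged:
mean(b[["age"]] + 2 * b[["I(age^2)"]] * m$age)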
. regress y i.sex##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 139.91
Model | 170983.675 3 56994.5583 Prob > F = 0.0000
Residual | 1220449.33 2,996 407.35959 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1220
Total | 1391433.01 2,999 463.965657 Root MSE = 20.183
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
sex |
Female | 14.92308 2.789012 5.35 0.000 9.454508 20.39165
age | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
|
sex#c.age |
Female | -.0224116 .0674167 -0.33 0.740 -.1545994 .1097762
|
_cons | 82.36936 1.812958 45.43 0.000 78.8146 85.92413
------------------------------------------------------------------------------
. margins sex, dydx(age)
Average marginal effects Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
dy/dx wrt: age
------------------------------------------------------------------------------
| Delta-method
| dy/dx std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age |
sex |
Male | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
Female | -.5153724 .0472435 -10.91 0.000 -.6080054 -.4227395
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexFemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexFemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16
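A sketch of the emtrends() calls giving the within-sex slopes below (model assumed stored as mod1):

mod1 <- lm(y ~ sex * age, data = m)
emtrends(mod1, "sex", var = "age")        # slope on age within each sex, with CIs
test(emtrends(mod1, "sex", var = "age"))  # the same, with p-values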
sex age.trend SE df lower.CL upper.CL
Male -0.493 0.0481 2996 -0.587 -0.399
Female -0.515 0.0472 2996 -0.608 -0.423
Confidence level used: 0.95
sex age.trend SE df t.ratio p.value
Male -0.493 0.0481 2996 -10.250 <.0001
Female -0.515 0.0472 2996 -10.909 <.0001
Let’s assume we have a binary variable, \(Z\), interacted with a continuous variable, \(X\).
\[ E(Y|X,Z) = \beta_0 + \beta_1X + \beta_2Z + \beta_3XZ \]
Here we’re combining the math from marginal effects with marginal slopes. First, we generate two equations, one for \(Z = 0\) and one for \(Z = 1\):
\[\begin{align*} E(Y|X, Z = 0) & = \beta_0 + \beta_1X \\ E(Y|X, Z = 1) & = \beta_0 + \beta_1X + \beta_2 + \beta_3X = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)X. \end{align*}\]
Then when we differentiate each with respect to \(X\), we obtain \(\beta_1\) and \((\beta_1 + \beta_3)\) respectively.
. regress y i.class
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 15.48
Model | 14229.0349 2 7114.51743 Prob > F = 0.0000
Residual | 1377203.97 2,997 459.527518 R-squared = 0.0102
-------------+---------------------------------- Adj R-squared = 0.0096
Total | 1391433.01 2,999 463.965657 Root MSE = 21.437
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | .4084991 .8912269 0.46 0.647 -1.338979 2.155977
3 | 5.373138 1.027651 5.23 0.000 3.358165 7.38811
|
_cons | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
------------------------------------------------------------------------------
. margins class, pwcompare(pv)
Pairwise comparisons of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
-----------------------------------------------------
| Delta-method Unadjusted
| Contrast std. err. t P>|t|
-------------+---------------------------------------
class |
2 vs 1 | .4084991 .8912269 0.46 0.647
3 vs 1 | 5.373138 1.027651 5.23 0.000
3 vs 2 | 4.964638 1.041073 4.77 0.000
-----------------------------------------------------
Call:
lm(formula = y ~ class, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
class2 0.4085 0.8912 0.458 0.647
class3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
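A sketch of the pairwise-comparison call (the contrast labels in the output suggest something along these lines):

pairs(emmeans(mod1, "class"), adjust = "none")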
contrast estimate SE df t.ratio p.value
class1 - class2 -0.408 0.891 2997 -0.458 0.6467
class1 - class3 -5.373 1.028 2997 -5.229 <.0001
class2 - class3 -4.965 1.041 2997 -4.769 <.0001
The adjust = "none"
argument skips any multiple
comparisons correction. This is done to obtain the same results as the
regression coefficients. If you’d prefer a more ANOVA post-hoc approach,
there are other options for adjust
, see
?emmeans::contrast
for details.
. regress y i.class##i.sex
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 92.91
Model | 186898.199 5 37379.6398 Prob > F = 0.0000
Residual | 1204534.81 2,994 402.316235 R-squared = 0.1343
-------------+---------------------------------- Adj R-squared = 0.1329
Total | 1391433.01 2,999 463.965657 Root MSE = 20.058
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | 11.37444 1.573314 7.23 0.000 8.289552 14.45932
3 | 21.6188 1.588487 13.61 0.000 18.50416 24.73344
|
sex |
Female | 21.62507 1.509999 14.32 0.000 18.66433 24.58581
|
class#sex |
2#Female | -4.851582 1.942744 -2.50 0.013 -8.66083 -1.042333
3#Female | -6.084875 3.004638 -2.03 0.043 -11.97624 -.1935115
|
_cons | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
------------------------------------------------------------------------------
. margins class#sex, pwcompare(pv)
Pairwise comparisons of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------------
| Delta-method Unadjusted
| Contrast std. err. t P>|t|
--------------------------+---------------------------------------
class#sex |
(1#Female) vs (1#Male) | 21.62507 1.509999 14.32 0.000
(2#Male) vs (1#Male) | 11.37444 1.573314 7.23 0.000
(2#Female) vs (1#Male) | 28.14793 1.661722 16.94 0.000
(3#Male) vs (1#Male) | 21.6188 1.588487 13.61 0.000
(3#Female) vs (1#Male) | 37.159 2.822577 13.16 0.000
(2#Male) vs (1#Female) | -10.25064 1.006447 -10.18 0.000
(2#Female) vs (1#Female) | 6.522856 1.13971 5.72 0.000
(3#Male) vs (1#Female) | -.0062748 1.030005 -0.01 0.995
(3#Female) vs (1#Female) | 15.53392 2.550404 6.09 0.000
(2#Female) vs (2#Male) | 16.77349 1.222358 13.72 0.000
(3#Male) vs (2#Male) | 10.24436 1.120772 9.14 0.000
(3#Female) vs (2#Male) | 25.78456 2.588393 9.96 0.000
(3#Male) vs (2#Female) | -6.529131 1.241826 -5.26 0.000
(3#Female) vs (2#Female) | 9.011069 2.643063 3.41 0.001
(3#Female) vs (3#Male) | 15.5402 2.597644 5.98 0.000
------------------------------------------------------------------
This produces a lot of comparisons; if we only cared about the comparisons one way (e.g. gender differences within each class), we can restrict our output to these results.
. margins sex@class, contrast(pv nowald)
Contrasts of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------
| Delta-method
| Contrast std. err. t P>|t|
--------------------+---------------------------------------
sex@class |
(Female vs base) 1 | 21.62507 1.509999 14.32 0.000
(Female vs base) 2 | 16.77349 1.222358 13.72 0.000
(Female vs base) 3 | 15.5402 2.597644 5.98 0.000
------------------------------------------------------------
Note that the results are identical; this is just nicer output.
The pv option requests p-values instead of confidence intervals; the nowald option suppresses an omnibus test of any difference between classes (e.g. ANOVA-style output).
Call:
lm(formula = y ~ class * sex, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
class2 11.374 1.573 7.230 6.12e-13 ***
class3 21.619 1.588 13.610 < 2e-16 ***
sexFemale 21.625 1.510 14.321 < 2e-16 ***
class2:sexFemale -4.852 1.943 -2.497 0.0126 *
class3:sexFemale -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
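A sketch of the all-pairs comparison for the interaction model (assumed stored as mod1):

mod1 <- lm(y ~ class * sex, data = m)
pairs(emmeans(mod1, c("sex", "class")), adjust = "none")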
contrast estimate SE df t.ratio p.value
Male class1 - Female class1 -21.62507 1.51 2994 -14.321 <.0001
Male class1 - Male class2 -11.37444 1.57 2994 -7.230 <.0001
Male class1 - Female class2 -28.14793 1.66 2994 -16.939 <.0001
Male class1 - Male class3 -21.61880 1.59 2994 -13.610 <.0001
Male class1 - Female class3 -37.15900 2.82 2994 -13.165 <.0001
Female class1 - Male class2 10.25064 1.01 2994 10.185 <.0001
Female class1 - Female class2 -6.52286 1.14 2994 -5.723 <.0001
Female class1 - Male class3 0.00627 1.03 2994 0.006 0.9951
Female class1 - Female class3 -15.53392 2.55 2994 -6.091 <.0001
Male class2 - Female class2 -16.77349 1.22 2994 -13.722 <.0001
Male class2 - Male class3 -10.24436 1.12 2994 -9.140 <.0001
Male class2 - Female class3 -25.78456 2.59 2994 -9.962 <.0001
Female class2 - Male class3 6.52913 1.24 2994 5.258 <.0001
Female class2 - Female class3 -9.01107 2.64 2994 -3.409 0.0007
Male class3 - Female class3 -15.54020 2.60 2994 -5.982 <.0001
The adjust = "none"
argument skips any multiple
comparisons correction. This is done to obtain the same results as the
regression coefficients. If you’d prefer a more ANOVA post-hoc approach,
there are other options for adjust
, see
?emmeans::contrast
for details.
This produces a lot of comparisons; if we only cared about the comparisons one way (e.g. gender differences within each class), we can restrict our output to these results.
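A sketch of restricting to within-class sex comparisons:

pairs(emmeans(mod1, ~ sex | class), adjust = "none")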
class = 1:
contrast estimate SE df t.ratio p.value
Male - Female -21.6 1.51 2994 -14.321 <.0001
class = 2:
contrast estimate SE df t.ratio p.value
Male - Female -16.8 1.22 2994 -13.722 <.0001
class = 3:
contrast estimate SE df t.ratio p.value
Male - Female -15.5 2.60 2994 -5.982 <.0001
Note that the results are identical; this is just nicer output.
. regress y i.class##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 21.15
Model | 47475.1592 5 9495.03183 Prob > F = 0.0000
Residual | 1343957.85 2,994 448.883716 R-squared = 0.0341
-------------+---------------------------------- Adj R-squared = 0.0325
Total | 1391433.01 2,999 463.965657 Root MSE = 21.187
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | -13.19784 3.880448 -3.40 0.001 -20.80646 -5.589225
3 | -11.14087 4.189539 -2.66 0.008 -19.35553 -2.926199
|
age | -.4751707 .0630779 -7.53 0.000 -.5988511 -.3514903
|
class#c.age |
2 | .2484541 .0887425 2.80 0.005 .0744516 .4224565
3 | .2903105 .1107395 2.62 0.009 .0731774 .5074437
|
_cons | 90.55752 3.009782 30.09 0.000 84.65606 96.45897
------------------------------------------------------------------------------
. margins class, dydx(age) pwcompare(pv)
Pairwise comparisons of average marginal effects
Model VCE: OLS Number of obs = 3,000
Expression: Linear prediction, predict()
dy/dx wrt: age
-----------------------------------------------------
| Contrast Delta-method Unadjusted
| dy/dx std. err. t P>|t|
-------------+---------------------------------------
age |
class |
2 vs 1 | .2484541 .0887425 2.80 0.005
3 vs 1 | .2903105 .1107395 2.62 0.009
3 vs 2 | .0418565 .1103668 0.38 0.705
-----------------------------------------------------
Call:
lm(formula = y ~ class * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-75.335 -14.841 -0.036 14.469 73.169
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 90.55752 3.00978 30.088 < 2e-16 ***
class2 -13.19784 3.88045 -3.401 0.00068 ***
class3 -11.14087 4.18954 -2.659 0.00787 **
age -0.47517 0.06308 -7.533 6.52e-14 ***
class2:age 0.24845 0.08874 2.800 0.00515 **
class3:age 0.29031 0.11074 2.622 0.00880 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.19 on 2994 degrees of freedom
Multiple R-squared: 0.03412, Adjusted R-squared: 0.03251
F-statistic: 21.15 on 5 and 2994 DF, p-value: < 2.2e-16
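A sketch comparing the within-class slopes on age (model assumed stored as mod1):

mod1 <- lm(y ~ class * age, data = m)
pairs(emtrends(mod1, "class", var = "age"), adjust = "none")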
contrast estimate SE df t.ratio p.value
class1 - class2 -0.2485 0.0887 2994 -2.800 0.0051
class1 - class3 -0.2903 0.1107 2994 -2.622 0.0088
class2 - class3 -0.0419 0.1104 2994 -0.379 0.7045
. regress y i.class##i.sex
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 92.91
Model | 186898.199 5 37379.6398 Prob > F = 0.0000
Residual | 1204534.81 2,994 402.316235 R-squared = 0.1343
-------------+---------------------------------- Adj R-squared = 0.1329
Total | 1391433.01 2,999 463.965657 Root MSE = 20.058
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | 11.37444 1.573314 7.23 0.000 8.289552 14.45932
3 | 21.6188 1.588487 13.61 0.000 18.50416 24.73344
|
sex |
Female | 21.62507 1.509999 14.32 0.000 18.66433 24.58581
|
class#sex |
2#Female | -4.851582 1.942744 -2.50 0.013 -8.66083 -1.042333
3#Female | -6.084875 3.004638 -2.03 0.043 -11.97624 -.1935115
|
_cons | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
------------------------------------------------------------------------------
. margins sex@class, contrast(pv nowald)
Contrasts of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
------------------------------------------------------------
| Delta-method
| Contrast std. err. t P>|t|
--------------------+---------------------------------------
sex@class |
(Female vs base) 1 | 21.62507 1.509999 14.32 0.000
(Female vs base) 2 | 16.77349 1.222358 13.72 0.000
(Female vs base) 3 | 15.5402 2.597644 5.98 0.000
------------------------------------------------------------
. margins class, dydx(sex) pwcompare(pv)
Pairwise comparisons of conditional marginal effects
Model VCE: OLS Number of obs = 3,000
Expression: Linear prediction, predict()
dy/dx wrt: 1.sex
-----------------------------------------------------
| Contrast Delta-method Unadjusted
| dy/dx std. err. t P>|t|
-------------+---------------------------------------
0.sex | (base outcome)
-------------+---------------------------------------
1.sex |
class |
2 vs 1 | -4.851582 1.942744 -2.50 0.013
3 vs 1 | -6.084875 3.004638 -2.03 0.043
3 vs 2 | -1.233294 2.870873 -0.43 0.668
-----------------------------------------------------
Note: dy/dx for factor levels is the discrete change
from the base level.
Note that this works because sex is binary, thus its coefficient (obtained via dydx) is for a 1-unit change, from 0 to 1. If you had a more complicated situation (e.g. there was a third gender category but you only wanted to compare male versus female), you could instead manually define your contrast.
. margins {sex +1 -1}#{class +1 -1 0}, contrast(pv nowald)
Contrasts of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
-----------------------------------------------------
| Delta-method
| Contrast std. err. t P>|t|
-------------+---------------------------------------
sex#class |
(1) (1) | -4.851582 1.942744 -2.50 0.013
-----------------------------------------------------
. margins {sex +1 -1}#{class +1 -1 0} {sex +1 -1}#{class +1 0 -1} {sex +1 -1}#{
> class 0 +1 -1}, contrast(pv nowald)
Contrasts of adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
-----------------------------------------------------
| Delta-method
| Contrast std. err. t P>|t|
-------------+---------------------------------------
sex#class |
(1) (1) | -4.851582 1.942744 -2.50 0.013
(1) (2) | -6.084875 3.004638 -2.03 0.043
(1) (3) | -1.233294 2.870873 -0.43 0.668
-----------------------------------------------------
Here the {variable # # #} notation defines a contrast - the values correspond to a linear equation over the levels. E.g. if we wanted to test male = female, or male - female = 0, we could write this as (+1)male + (-1)female = 0, therefore the contrast is +1 -1 (in sex, male is the lower level, female the higher level). For class, we want to compare each pair of classes, so class1 = class2 is equivalent to (+1)class1 + (-1)class2 + (0)class3 = 0, whose contrast is +1 -1 0.
If you wanted a diff-in-diff-in-diff estimate (e.g. say you’ve got males and females, and some undergo each of treatment 0 and treatment 1, and you want to test whether the difference in pre-post changes amongst men is significantly different than the difference in pre-post changes amongst women), you can run something like margins sex#treatment, dydx(prepost) contrast(pv nowald).
Call:
lm(formula = y ~ class * sex, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
class2 11.374 1.573 7.230 6.12e-13 ***
class3 21.619 1.588 13.610 < 2e-16 ***
sexFemale 21.625 1.510 14.321 < 2e-16 ***
class2:sexFemale -4.852 1.943 -2.497 0.0126 *
class3:sexFemale -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
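Sketches of the emmeans calls producing the next two outputs: within-class sex differences, then comparisons of those differences across classes (model assumed stored as mod1):

mod1 <- lm(y ~ class * sex, data = m)
emm <- emmeans(mod1, ~ sex | class)
pairs(emm, adjust = "none")                    # sex difference within each class
pairs(pairs(emm), by = NULL, adjust = "none")  # differences of those differences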
class = 1:
contrast estimate SE df t.ratio p.value
Male - Female -21.6 1.51 2994 -14.321 <.0001
class = 2:
contrast estimate SE df t.ratio p.value
Male - Female -16.8 1.22 2994 -13.722 <.0001
class = 3:
contrast estimate SE df t.ratio p.value
Male - Female -15.5 2.60 2994 -5.982 <.0001
contrast                                        estimate   SE   df t.ratio p.value
(Male - Female class1) - (Male - Female class2)    -4.85 1.94 2994  -2.497  0.0126
(Male - Female class1) - (Male - Female class3)    -6.08 3.00 2994  -2.025  0.0429
(Male - Female class2) - (Male - Female class3)    -1.23 2.87 2994  -0.430  0.6675
. regress y i.class##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 21.15
Model | 47475.1592 5 9495.03183 Prob > F = 0.0000
Residual | 1343957.85 2,994 448.883716 R-squared = 0.0341
-------------+---------------------------------- Adj R-squared = 0.0325
Total | 1391433.01 2,999 463.965657 Root MSE = 21.187
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | -13.19784 3.880448 -3.40 0.001 -20.80646 -5.589225
3 | -11.14087 4.189539 -2.66 0.008 -19.35553 -2.926199
|
age | -.4751707 .0630779 -7.53 0.000 -.5988511 -.3514903
|
class#c.age |
2 | .2484541 .0887425 2.80 0.005 .0744516 .4224565
3 | .2903105 .1107395 2.62 0.009 .0731774 .5074437
|
_cons | 90.55752 3.009782 30.09 0.000 84.65606 96.45897
------------------------------------------------------------------------------
. margins class, at(age = (20 60))
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 20
2._at: age = 60
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at#class |
1 1 | 81.0541 1.793005 45.21 0.000 77.53845 84.56975
1 2 | 72.82534 1.284642 56.69 0.000 70.30647 75.34421
1 3 | 75.71945 1.27105 59.57 0.000 73.22723 78.21167
2 1 | 62.04727 1.037397 59.81 0.000 60.01319 64.08136
2 2 | 63.75668 1.517933 42.00 0.000 60.78038 66.73298
2 3 | 68.32504 2.782515 24.56 0.000 62.86921 73.78088
------------------------------------------------------------------------------
. marginsplot, recastci(rarea) ciopts(fcolor(%25) lcolor(%25))
Variables that uniquely identify margins: age class
The recastci(rarea) option plots the confidence region, rather than just confidence intervals at each estimated point (to more closely mimic R’s output). The ciopts() option allows you to pass options for the confidence intervals; these particular options just make the confidence region (and its outline) transparent.
Due to this being a linear model, as well as age entering linearly (as opposed to, say, quadratically), two points are sufficient to fit a line. If you fit a non-linear model (such as logistic regression) or allowed age to enter non-linearly, you may want to add more points to the plot. There is a trade-off between speed (more points take longer to estimate) and the smoothness of the resulting curve.
. regress y i.class##c.age##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(8, 2991) = 14.96
Model | 53538.5729 8 6692.32162 Prob > F = 0.0000
Residual | 1337894.43 2,991 447.306731 R-squared = 0.0385
-------------+---------------------------------- Adj R-squared = 0.0359
Total | 1391433.01 2,999 463.965657 Root MSE = 21.15
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | -22.72613 14.60206 -1.56 0.120 -51.35722 5.904968
3 | -5.253225 15.40844 -0.34 0.733 -35.46544 24.959
|
age | .384345 .536903 0.72 0.474 -.6683915 1.437081
|
class#c.age |
2 | .8892081 .726491 1.22 0.221 -.5352645 2.313681
3 | .1727145 .8296354 0.21 0.835 -1.453999 1.799428
|
c.age#c.age | -.0098622 .006118 -1.61 0.107 -.0218581 .0021337
|
class#c.age#|
c.age |
2 | -.009385 .00873 -1.08 0.282 -.0265025 .0077325
3 | -.0008709 .010928 -0.08 0.936 -.0222981 .0205564
|
_cons | 72.8557 11.38487 6.40 0.000 50.53273 95.17866
------------------------------------------------------------------------------
. margins class, at(age = (20(5)60))
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: age = 20
2._at: age = 25
3._at: age = 30
4._at: age = 35
5._at: age = 40
6._at: age = 45
7._at: age = 50
8._at: age = 55
9._at: age = 60
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at#class |
1 1 | 76.59771 3.293341 23.26 0.000 70.14026 83.05515
1 2 | 67.90175 2.045099 33.20 0.000 63.8918 71.91169
1 3 | 74.45042 1.660151 44.85 0.000 71.19527 77.70557
2 1 | 76.30043 2.102181 36.30 0.000 72.17856 80.4223
2 2 | 69.93889 1.169676 59.79 0.000 67.64544 72.23234
2 3 | 74.82077 .9637658 77.63 0.000 72.93106 76.71048
3 1 | 75.51005 1.312605 57.53 0.000 72.93634 78.08375
3 2 | 71.01367 .8150404 87.13 0.000 69.41557 72.61177
3 3 | 74.65447 1.047226 71.29 0.000 72.60111 76.70782
4 1 | 74.22655 .9758115 76.07 0.000 72.31322 76.13988
4 2 | 71.12609 .8578454 82.91 0.000 69.44406 72.80812
4 3 | 73.95151 1.233883 59.93 0.000 71.53216 76.37085
5 1 | 72.44994 .9291129 77.98 0.000 70.62818 74.27171
5 2 | 70.27615 .910904 77.15 0.000 68.49009 72.06221
5 3 | 72.7119 1.303433 55.78 0.000 70.15618 75.26761
6 1 | 70.18022 .8796416 79.78 0.000 68.45546 71.90499
6 2 | 68.46385 .8799818 77.80 0.000 66.73842 70.18928
6 3 | 70.93563 1.532553 46.29 0.000 67.93066 73.94059
7 1 | 67.41739 .7502806 89.86 0.000 65.94627 68.88851
7 2 | 65.68919 .989977 66.35 0.000 63.74809 67.6303
7 3 | 68.6227 2.329015 29.46 0.000 64.05607 73.18934
8 1 | 64.16145 .8193357 78.31 0.000 62.55493 65.76797
8 2 | 61.95217 1.561682 39.67 0.000 58.89009 65.01425
8 3 | 65.77313 3.756076 17.51 0.000 58.40837 73.13788
9 1 | 60.4124 1.44948 41.68 0.000 57.57032 63.25448
9 2 | 57.25279 2.593175 22.08 0.000 52.1682 62.33738
9 3 | 62.38689 5.728209 10.89 0.000 51.15527 73.61852
------------------------------------------------------------------------------
. marginsplot, recastci(rarea) ciopts(fcolor(%25) lcolor(%25)) recast(line)
Variables that uniquely identify margins: age class
The recast(line) option removes the points to make the plot a bit cleaner.
Call:
lm(formula = y ~ class * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-75.335 -14.841 -0.036 14.469 73.169
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 90.55752 3.00978 30.088 < 2e-16 ***
class2 -13.19784 3.88045 -3.401 0.00068 ***
class3 -11.14087 4.18954 -2.659 0.00787 **
age -0.47517 0.06308 -7.533 6.52e-14 ***
class2:age 0.24845 0.08874 2.800 0.00515 **
class3:age 0.29031 0.11074 2.622 0.00880 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.19 on 2994 degrees of freedom
Multiple R-squared: 0.03412, Adjusted R-squared: 0.03251
F-statistic: 21.15 on 5 and 2994 DF, p-value: < 2.2e-16
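The lm output above is the linear class-by-age model fit in R. Assuming it is stored in an object I'm calling modc, the earlier margins table at ages 20 and 60 can be reproduced with emmeans:
library(emmeans)
# adjusted predictions for each class at ages 20 and 60,
# mirroring Stata's margins class, at(age = (20 60))
emmeans(modc, ~ class | age, at = list(age = c(20, 60)))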
Generate a variable indicating whether age is above 40 before fitting the model.
. generate above40 = age > 40
. regress y i.above40##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 33.97
Model | 45772.6752 3 15257.5584 Prob > F = 0.0000
Residual | 1345660.33 2,996 449.152313 R-squared = 0.0329
-------------+---------------------------------- Adj R-squared = 0.0319
Total | 1391433.01 2,999 463.965657 Root MSE = 21.193
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
1.above40 | 22.28839 5.698579 3.91 0.000 11.11487 33.46191
age | -.0686643 .0887806 -0.77 0.439 -.2427415 .1054129
|
above40#|
c.age |
1 | -.5400972 .1329699 -4.06 0.000 -.8008187 -.2793757
|
_cons | 74.76941 2.739267 27.30 0.000 69.39837 80.14044
------------------------------------------------------------------------------
. margins, at(age = (20 40) above40 = 0) ///
> at(age = (40 60) above40 = 1)
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: above40 = 0
age = 20
2._at: above40 = 0
age = 40
3._at: above40 = 1
age = 40
4._at: above40 = 1
age = 60
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at |
1 | 73.39612 1.056808 69.45 0.000 71.32398 75.46826
2 | 72.02283 1.017694 70.77 0.000 70.02738 74.01828
3 | 72.70734 1.150884 63.18 0.000 70.45073 74.96394
4 | 60.53211 1.12271 53.92 0.000 58.33074 62.73347
------------------------------------------------------------------------------
. marginsplot, recastci(rarea) ciopts(fcolor(%25) lcolor(%25)) ///
> legend(off)
Variables that uniquely identify margins: age _atopt
Multiple at() options specified:
_atoption=1: age = (20 40) above40 = 0
_atoption=2: age = (40 60) above40 = 1
Note the use of two separate at() options to ensure that each trend is plotted only over the appropriate age range. The legend is suppressed since the grouping is obvious in this situation. For a full diff-in-diff plot, simply add the additional term as usual.
. regress y i.class##i.above40##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(11, 2988) = 10.96
Model | 53973.7541 11 4906.70491 Prob > F = 0.0000
Residual | 1337459.25 2,988 447.610191 R-squared = 0.0388
-------------+---------------------------------- Adj R-squared = 0.0353
Total | 1391433.01 2,999 463.965657 Root MSE = 21.157
------------------------------------------------------------------------------
y | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | -10.88013 8.740776 -1.24 0.213 -28.01868 6.258421
3 | -3.526033 8.562967 -0.41 0.681 -20.31594 13.26387
|
1.above40 | 17.168 10.00685 1.72 0.086 -2.453016 36.78902
|
class#|
above40 |
2 1 | 16.39086 14.46734 1.13 0.257 -11.9761 44.75782
3 1 | -7.68618 22.34638 -0.34 0.731 -51.50204 36.12968
|
age | -.1626578 .220437 -0.74 0.461 -.5948815 .2695659
|
class#c.age |
2 | .1987596 .2644373 0.75 0.452 -.319738 .7172571
3 | .0879317 .2661492 0.33 0.741 -.4339225 .609786
|
above40#|
c.age |
1 | -.4411289 .2555264 -1.73 0.084 -.9421543 .0598966
|
class#|
above40#|
c.age |
2 1 | -.3434769 .3518112 -0.98 0.329 -1.033294 .3463398
3 1 | .1697788 .5123409 0.33 0.740 -.8347979 1.174356
|
_cons | 80.04815 7.439131 10.76 0.000 65.46182 94.63449
------------------------------------------------------------------------------
. margins class, at(age = (20 40) above40 = 0) ///
> at(age = (40 60) above40 = 1)
Adjusted predictions Number of obs = 3,000
Model VCE: OLS
Expression: Linear prediction, predict()
1._at: above40 = 0
age = 20
2._at: above40 = 0
age = 40
3._at: above40 = 1
age = 40
4._at: above40 = 1
age = 60
------------------------------------------------------------------------------
| Delta-method
| Margin std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
_at#class |
1 1 | 76.795 3.167156 24.25 0.000 70.58497 83.00502
1 2 | 69.89006 1.791452 39.01 0.000 66.37746 73.40267
1 3 | 75.0276 1.461485 51.34 0.000 72.16198 77.89322
2 1 | 73.54184 1.896068 38.79 0.000 69.82411 77.25957
2 2 | 70.6121 1.557486 45.34 0.000 67.55824 73.66595
2 3 | 73.53308 2.020516 36.39 0.000 69.57133 77.49482
3 1 | 73.06469 1.647357 44.35 0.000 69.83462 76.29476
3 2 | 72.78672 1.908052 38.15 0.000 69.04549 76.52795
3 3 | 72.1609 3.371785 21.40 0.000 65.54964 78.77215
4 1 | 60.98895 1.309189 46.59 0.000 58.42195 63.55596
4 2 | 57.81664 2.445232 23.64 0.000 53.02213 62.61115
4 3 | 65.23937 6.031687 10.82 0.000 53.41269 77.06605
------------------------------------------------------------------------------
. marginsplot, recast(line) noci ///
> plot1opts(lcolor("red")) ///
> plot2opts(lcolor("blue")) ///
> plot3opts(lcolor("red") lpattern("dash")) ///
> plot4opts(lcolor("blue") lpattern("dash")) ///
> plot5opts(lcolor("red") lpattern("dash_dot")) ///
> plot6opts(lcolor("blue") lpattern("dash_dot")) ///
> legend(order(1 "Class 1" 3 "Class 2" 5 "Class 3"))
Variables that uniquely identify margins: age class _atopt
Multiple at() options specified:
_atoption=1: age = (20 40) above40 = 0
_atoption=2: age = (40 60) above40 = 1
The plot requires a lot more configuration to look nice; I've even suppressed the confidence bounds to make it more readable.
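The R output below implies the following setup, sketched here to mirror Stata's generate above40 = age > 40 (mod1 is the name the later code uses):
# construct the indicator, then fit the interaction model
m$above40 <- m$age > 40
mod1 <- lm(y ~ above40 * age, data = m)
summary(mod1)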
Call:
lm(formula = y ~ above40 * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.127 -14.764 0.256 14.611 73.865
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 74.76941 2.73927 27.295 < 2e-16 ***
above40TRUE 22.28839 5.69858 3.911 9.39e-05 ***
age -0.06866 0.08878 -0.773 0.439
above40TRUE:age -0.54010 0.13297 -4.062 4.99e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.19 on 2996 degrees of freedom
Multiple R-squared: 0.0329, Adjusted R-squared: 0.03193
F-statistic: 33.97 on 3 and 2996 DF, p-value: < 2.2e-16
I do not currently know of a way to do this entirely within emmeans without resorting to manual plotting. If you do know a way to do it, please submit an issue or otherwise reach out to me!
# estimate each segment only over its own age range
em_under <- emmeans(mod1, ~ age, at = list(age = c(20, 40), above40 = FALSE))
em_above <- emmeans(mod1, ~ age, at = list(age = c(40, 60), above40 = TRUE))
# combine the two grids; adjust = "none" keeps unadjusted CIs
# (rbind() on emmeans objects defaults to a Bonferroni adjustment)
em_combined <- data.frame(rbind(em_under, em_above, adjust = "none"))
em_combined$above40 <- c(FALSE, FALSE, TRUE, TRUE)
ggplot(em_combined, aes(x = age, y = emmean, color = above40, group = above40)) +
  geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL, fill = above40), alpha = .2) +
  geom_line(lwd = 2) +
  scale_fill_manual(values = c("FALSE" = "blue", "TRUE" = "red")) +
  scale_color_manual(values = c("FALSE" = "blue", "TRUE" = "red")) +
  xlab("Age") + ylab("Prediction") + theme(legend.position = "none")
plot_predictions(mod1,
by = "age",
newdata = data.frame(age = c(20, 40, 40, 60),
above40 = c(FALSE, FALSE, TRUE, TRUE)))
The newdata argument is doing the heavy lifting here; the by argument specifies the x-axis. Note that you can also simply call predictions(mod1, newdata = ...) and plot the results manually, for maximum flexibility.
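A sketch of that manual route (the ggplot styling is mine; the estimate/conf.low/conf.high column names are those used by current marginaleffects versions):
library(ggplot2)
# compute predictions on the restricted grid, then plot by hand
p <- predictions(mod1,
                 newdata = data.frame(age = c(20, 40, 40, 60),
                                      above40 = c(FALSE, FALSE, TRUE, TRUE)))
ggplot(p, aes(x = age, y = estimate, group = above40)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .2) +
  geom_line()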
. logit yc i.class, nolog or
Logistic regression Number of obs = 3,000
LR chi2(2) = 12.39
Prob > chi2 = 0.0020
Log likelihood = -1651.6397 Pseudo R2 = 0.0037
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
class |
2 | .8231057 .0789757 -2.03 0.042 .6819996 .9934067
3 | .6747704 .0779699 -3.40 0.001 .5380213 .846277
|
_cons | .3718535 .0241593 -15.23 0.000 .327393 .4223519
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.
Below, the references to each coefficient (e.g. _b[2.class]) were obtained by calling logit, coeflegend after the original logit call.
. lincom _b[_cons], or
( 1) [yc]_cons = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .3718535 .0241593 -15.23 0.000 .327393 .4223519
------------------------------------------------------------------------------
. lincom _b[_cons] + _b[2.class], or
( 1) [yc]2.class + [yc]_cons = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .3060748 .0216103 -16.77 0.000 .2665193 .3515008
------------------------------------------------------------------------------
. lincom _b[_cons] + _b[3.class], or
( 1) [yc]3.class + [yc]_cons = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .2509158 .0239763 -14.47 0.000 .2080613 .302597
------------------------------------------------------------------------------
. lincom _b[2.class], or
( 1) [yc]2.class = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .8231057 .0789757 -2.03 0.042 .6819996 .9934067
------------------------------------------------------------------------------
. lincom _b[3.class], or
( 1) [yc]3.class = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .6747704 .0779699 -3.40 0.001 .5380213 .846277
------------------------------------------------------------------------------
. lincom _b[3.class] - _b[2.class], or
( 1) - [yc]2.class + [yc]3.class = 0
------------------------------------------------------------------------------
yc | Odds ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
(1) | .8197858 .0973987 -1.67 0.094 .6494852 1.034741
------------------------------------------------------------------------------
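The same quantities can be recovered in R. A sketch, assuming the model is refit with glm (note that confint() on a glm gives profile-likelihood intervals, so they will differ slightly from Stata's Wald intervals):
fitl <- glm(yc ~ class, family = binomial, data = m)
# baseline odds (intercept row) and odds ratios relative to class 1
exp(cbind(OR = coef(fitl), confint(fitl)))
# pairwise odds ratios among all classes; back-transforming contrasts
# of log-odds with emmeans yields odds ratios
library(emmeans)
summary(pairs(emmeans(fitl, ~ class), reverse = TRUE), type = "response", infer = TRUE)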