Introduction

Marginal Effects

After fitting a regression model of some sort, the most common way of interpreting the relationships between predictors and the outcome is through the regression coefficients. In many situations these interpretations are straightforward; in other settings, however, they can be tricky, or the coefficients may not be directly interpretable at all.

These complications arise most commonly when a model involves an interaction (moderation). In these sorts of models, interpreting the coefficients requires special care because of the relationships among the interacted predictors, and the information obtainable from the coefficients alone is often limited without resorting to a fair amount of algebra.

Marginal effects offer an improvement over simple coefficient interpretation. Marginal means estimate the average response under a set of conditions; e.g. the average response for each racial group, or the average response when age and weight are at certain values.

Marginal slopes estimate the slope of a predictor on the outcome when other variables are held at specific values; e.g. the average slope on age within each gender, or the average slope on age when weight is at certain values.

In either case, in addition to estimating these marginal effects, we can easily test hypotheses of equality between them via pairwise comparisons. For example: is the average response different by ethnicity, or is the slope on age different between genders?

If you are familiar with interpreting regression coefficients, and specifically interactions, you may have some idea of how to start addressing the examples above. However, fully answering these research questions requires more than a table of regression coefficients. With marginal effects, a single additional set of results can address all of these questions.

Finally, marginal effect estimation is a step towards creating informative visualizations of relationships such as interaction plots.

Data and Libraries

Code

Stata

The webuse command can load the data directly.

. webuse margex
(Artificial data for margins)

The marginaleffects package in R doesn’t like a variable named “group”, so we rename it.

. rename group class

. summarize, sep(0)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
           y |      3,000    69.73357    21.53986          0      146.3
     outcome |      3,000    .1696667    .3754023          0          1
         sex |      3,000    .5006667    .5000829          0          1
       class |      3,000       1.828    .7732714          1          3
         age |      3,000      39.799    11.54174         20         60
    distance |      3,000    58.58566    181.3987        .25   748.8927
         ycn |      3,000    74.47263    22.01264          0      153.8
          yc |      3,000    .2413333    .4279633          0          1
   treatment |      3,000    .5006667    .5000829          0          1
    agegroup |      3,000    2.263667    .8316913          1          3
         arm |      3,000    1.970667    .7899457          1          3

R Data

The haven package is used to read the data from Stata into R.

library(haven)
m <- read_dta("http://www.stata-press.com/data/r18/margex.dta")
m <- data.frame(m)
m$sex <- as_factor(m$sex)
names(m)[names(m) == "group"] <- "class"
m$class <- as_factor(m$class)
head(m)
      y outcome    sex class age distance   ycn yc treatment agegroup arm
1 102.6       1 Female     1  55    11.75  97.3  1         1        3   1
2  77.2       0 Female     1  60    30.75  78.6  0         1        3   1
3  80.5       0 Female     1  27    14.25  54.5  0         1        1   1
4  82.5       1 Female     1  60    29.25  98.4  1         1        3   1
5  71.6       1 Female     1  52    19.00 105.3  1         1        3   1
6  83.2       1 Female     1  49     2.50  89.7  0         1        3   1

As in the Stata data, the variable “group” was renamed to “class” in the code above because the marginaleffects package doesn’t like a variable named “group”.

R Packages

There are several packages which can be used to estimate marginal effects. So far, this document covers emmeans and ggeffects, and I’ve begun writing up marginaleffects.

emmeans is based on an older package, lsmeans, and is more flexible and powerful. However, its syntax is slightly strange, and there are often several ways to obtain similar results, which may or may not agree. Additionally, while its immediate output is clear, accessing the raw data behind that output takes some doing.

ggeffects is a package designed to tie into the dplyr/ggplot2 universe. Its results are always a clean data frame which can be manipulated or passed straight into ggplot. However, ggeffects can only estimate marginal means, not marginal slopes or pairwise comparisons. (If I’m wrong about that last statement, please let me know.)

marginaleffects is a newer package with a much more robust set of estimates. It focuses on observation-level predictions which can then be averaged, rather than estimating marginal effects directly. There are often multiple ways to obtain the same result, which can be frustrating. It is still at an early version (0.18.0 at the time this document was last updated), so it may see breaking changes.

The marginaleffects website has a good write-up comparing these packages (as well as Stata). It is, naturally, written with a pro-marginaleffects bias, so keep that in mind while reading it.

In addition, the interactions package is used to plot interaction effects.

emmeans

library(emmeans)

ggeffects

The ggeffects package requires several other packages to work fully. However, it does not declare them as dependencies and install them for you; instead, you need to ensure the following packages are installed as well:

  • effects
  • emmeans
  • ggplot2
  • datawizard
library(ggeffects)
library(effects)
library(emmeans)
library(ggplot2)
library(datawizard)

If you do not have those packages loaded, ggeffects will load them on demand (rather than at the time ggeffects itself is loaded). If a required package is not installed, ggeffects will instead produce a warning, so keep an eye out for those and install packages as needed.
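
Rather than waiting for ggeffects to warn at run time, you can check for the suggested packages up front. A minimal sketch in base R:

```r
# Check which of the suggested packages are missing, and install them
pkgs <- c("effects", "emmeans", "ggplot2", "datawizard")
missing <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing) > 0) {
  install.packages(missing)
}
```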

marginaleffects

library(marginaleffects)

(Using marginaleffects remains a work-in-progress.)

interactions

library(interactions)

Software Versions

This document was last compiled using Stata/SE 18.0 and R 4.3.3.

R package versions at last compilation:

Package Version
emmeans 1.10.0
ggeffects 1.5.1
effects 4.2.2
datawizard 0.10.0
marginaleffects 0.18.0
ggplot2 3.5.0
interactions 1.1.5

Categorical Variables

Code

Stata

. regress y i.class

      Source |       SS           df       MS      Number of obs   =     3,000
-------------+----------------------------------   F(2, 2997)      =     15.48
       Model |  14229.0349         2  7114.51743   Prob > F        =    0.0000
    Residual |  1377203.97     2,997  459.527518   R-squared       =    0.0102
-------------+----------------------------------   Adj R-squared   =    0.0096
       Total |  1391433.01     2,999  463.965657   Root MSE        =    21.437

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       class |
          2  |   .4084991   .8912269     0.46   0.647    -1.338979    2.155977
          3  |   5.373138   1.027651     5.23   0.000     3.358165     7.38811
             |
       _cons |   68.35805   .6190791   110.42   0.000     67.14419    69.57191
------------------------------------------------------------------------------

. margins class

Adjusted predictions                                     Number of obs = 3,000
Model VCE: OLS

Expression: Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       class |
          1  |   68.35805   .6190791   110.42   0.000     67.14419    69.57191
          2  |   68.76655   .6411134   107.26   0.000     67.50948    70.02361
          3  |   73.73119   .8202484    89.89   0.000     72.12288    75.33949
------------------------------------------------------------------------------

R (emmeans)

summary(mod1 <- lm(y ~ class, data = m))

Call:
lm(formula = y ~ class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-73.531 -14.758   0.101  14.536  72.569 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  68.3580     0.6191 110.419  < 2e-16 ***
class2        0.4085     0.8912   0.458    0.647    
class3        5.3731     1.0277   5.229 1.83e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared:  0.01023,   Adjusted R-squared:  0.009566 
F-statistic: 15.48 on 2 and 2997 DF,  p-value: 2.045e-07
emmeans(mod1, ~ class)
 class emmean    SE   df lower.CL upper.CL
 1       68.4 0.619 2997     67.1     69.6
 2       68.8 0.641 2997     67.5     70.0
 3       73.7 0.820 2997     72.1     75.3

Confidence level used: 0.95 
Note that by default, emmeans returns confidence intervals. If you prefer p-values, you can wrap the emmeans call in test():
test(emmeans(mod1, ~ class))
 class emmean    SE   df t.ratio p.value
 1       68.4 0.619 2997 110.419  <.0001
 2       68.8 0.641 2997 107.261  <.0001
 3       73.7 0.820 2997  89.889  <.0001
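
While the printed output is clear, getting at the raw numbers behind it takes an extra step. One option (a sketch, using mod1 as fitted above) is to coerce the emmGrid object to a data frame:

```r
# Convert the emmeans result into a plain data frame for further manipulation
em <- emmeans(mod1, ~ class)
em_df <- as.data.frame(em)  # columns: class, emmean, SE, df, lower.CL, upper.CL
em_df$emmean                # the marginal means as a numeric vector
```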

R (ggeffects)

summary(mod1 <- lm(y ~ class, data = m))

Call:
lm(formula = y ~ class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-73.531 -14.758   0.101  14.536  72.569 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  68.3580     0.6191 110.419  < 2e-16 ***
class2        0.4085     0.8912   0.458    0.647    
class3        5.3731     1.0277   5.229 1.83e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared:  0.01023,   Adjusted R-squared:  0.009566 
F-statistic: 15.48 on 2 and 2997 DF,  p-value: 2.045e-07
ggeffect(mod1, "class")
# Predicted values of y

class | Predicted |       95% CI
--------------------------------
1     |     68.36 | 67.14, 69.57
2     |     68.77 | 67.51, 70.02
3     |     73.73 | 72.12, 75.34

Note that we could have run that as ggeffect(mod1, ~ class), with a formula. However, when we later want to specify particular values of the predictors, we will need the character version, so I stick with it here.
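
Since ggeffect() returns a clean data frame, its output can be passed straight into ggplot. A sketch, assuming mod1 and the packages loaded earlier (ggeffects results use the columns x, predicted, conf.low, and conf.high):

```r
# Plot the marginal means of class with their 95% confidence intervals
g <- ggeffect(mod1, "class")
ggplot(g, aes(x = x, y = predicted)) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
  labs(x = "class", y = "Predicted y")
```

ggeffects also provides a plot() method on its results that produces a similar figure automatically.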

R (marginaleffects)

summary(mod1 <- lm(y ~ class, data = m))

Call:
lm(formula = y ~ class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-73.531 -14.758   0.101  14.536  72.569 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  68.3580     0.6191 110.419  < 2e-16 ***
class2        0.4085     0.8912   0.458    0.647    
class3        5.3731     1.0277   5.229 1.83e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared:  0.01023,   Adjusted R-squared:  0.009566 
F-statistic: 15.48 on 2 and 2997 DF,  p-value: 2.045e-07
predictions(mod1, newdata = datagrid(class = unique))

 class Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
     1     68.4      0.619 110.4   <0.001 Inf  67.1   69.6
     2     68.8      0.641 107.3   <0.001 Inf  67.5   70.0
     3     73.7      0.820  89.9   <0.001 Inf  72.1   75.3

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class 
Type:  response 

The datagrid() function can also take specific values. For example,

predictions(mod1, newdata = datagrid(class = c(1, 2, 3)))

 class Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
     1     68.4      0.619 110.4   <0.001 Inf  67.1   69.6
     2     68.8      0.641 107.3   <0.001 Inf  67.5   70.0
     3     73.7      0.820  89.9   <0.001 Inf  72.1   75.3

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class 
Type:  response 

is an alternative to class = unique for specifying the levels. (NOTE: These are the “levels”, not the values; it just happens that for class they coincide. See below for what happens when they differ.) We can also specify a subset of levels.

predictions(mod1, newdata = datagrid(class = c(1, 2)))

 class Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
     1     68.4      0.619 110   <0.001 Inf  67.1   69.6
     2     68.8      0.641 107   <0.001 Inf  67.5   70.0

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, class 
Type:  response 

Factor labels

If your factor has labels which differ from its underlying values, you need to refer to the labels rather than the values.

mod2 <- lm(y ~ sex, data = m)
str(m$sex)
 Factor w/ 2 levels "Male","Female": 2 2 2 2 2 2 2 2 2 2 ...
 - attr(*, "label")= chr "Sex"
predictions(mod2, newdata = datagrid(sex = c(1,2)))
Error: The "sex" element of the `at` list corresponds to a factor variable. The values entered in the `at` list must be one of the factor levels: "Male", "Female".
predictions(mod2, newdata = datagrid(sex = c("Male", "Female")))

    sex Estimate Std. Error   z Pr(>|z|)   S 2.5 % 97.5 %
 Male       64.6       0.54 119   <0.001 Inf  63.5   65.6
 Female     74.9       0.54 139   <0.001 Inf  73.8   75.9

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex 
Type:  response 

Math

Assume a linear regression setup with a single categorical variable, \(G\), with three groups. We fit

\[ E(Y|G) = \beta_0 + \beta_1g_2 + \beta_2g_3, \]

where \(g_2\) and \(g_3\) are dummy variables representing membership in groups 2 and 3 respectively (\(g_{2i} = 1\) if observation \(i\) has \(G = 2\), and 0 otherwise). Since \(g_1\) does not appear in the model, group 1 is the reference category.

Therefore, \(\beta_0\) represents the average response in the reference category \(G = 1\), and \(\beta_1\) represents the difference in average response between groups \(G = 2\) and \(G = 1\). Thus \(\beta_0 + \beta_1\) is the average response in group \(G = 2\). A similar argument applies to \(\beta_2\) and group 3.
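
This arithmetic can be checked against the model fitted earlier: summing the reported intercept and class-2 coefficient reproduces the class-2 marginal mean (a sketch, assuming mod1 from the code sections above):

```r
# beta_0 + beta_1 recovers the class-2 marginal mean
b <- coef(mod1)                         # (Intercept), class2, class3
unname(b["(Intercept)"] + b["class2"])  # 68.35805 + 0.4084991 = 68.76655
```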

Interactions between Categorical Variables

Code

Stata

. regress y i.sex##i.class

      Source |       SS           df       MS      Number of obs   =     3,000
-------------+----------------------------------   F(5, 2994)      =     92.91
       Model |  186898.199         5  37379.6398   Prob > F        =    0.0000
    Residual |  1204534.81     2,994  402.316235   R-squared       =    0.1343
-------------+----------------------------------   Adj R-squared   =    0.1329
       Total |  1391433.01     2,999  463.965657   Root MSE        =    20.058

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         sex |
     Female  |   21.62507   1.509999    14.32   0.000     18.66433    24.58581
             |
       class |
          2  |   11.37444   1.573314     7.23   0.000     8.289552    14.45932
          3  |    21.6188   1.588487    13.61   0.000     18.50416    24.73344
             |
   sex#class |
   Female#2  |  -4.851582   1.942744    -2.50   0.013     -8.66083   -1.042333
   Female#3  |  -6.084875   3.004638    -2.03   0.043    -11.97624   -.1935115
             |
       _cons |    50.6107   1.367932    37.00   0.000     47.92852    53.29288
------------------------------------------------------------------------------

. margins sex#class

Adjusted predictions                                     Number of obs = 3,000
Model VCE: OLS

Expression: Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   sex#class |
     Male#1  |    50.6107   1.367932    37.00   0.000     47.92852    53.29288
     Male#2  |   61.98514   .7772248    79.75   0.000     60.46119    63.50908
     Male#3  |    72.2295   .8074975    89.45   0.000     70.64619     73.8128
   Female#1  |   72.23577     .63942   112.97   0.000     70.98203    73.48952
   Female#2  |   78.75863   .9434406    83.48   0.000     76.90877    80.60849
   Female#3  |    87.7697   2.468947    35.55   0.000     82.92869     92.6107
------------------------------------------------------------------------------

R (emmeans)

summary(mod1 <- lm(y ~ sex*class, data = m))

Call:
lm(formula = y ~ sex * class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-72.029 -13.285   0.464  13.741  67.964 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)        50.611      1.368  36.998  < 2e-16 ***
sexFemale          21.625      1.510  14.321  < 2e-16 ***
class2             11.374      1.573   7.230 6.12e-13 ***
class3             21.619      1.588  13.610  < 2e-16 ***
sexFemale:class2   -4.852      1.943  -2.497   0.0126 *  
sexFemale:class3   -6.085      3.005  -2.025   0.0429 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared:  0.1343,    Adjusted R-squared:  0.1329 
F-statistic: 92.91 on 5 and 2994 DF,  p-value: < 2.2e-16
emmeans(mod1, ~ class + sex)
 class sex    emmean    SE   df lower.CL upper.CL
 1     Male     50.6 1.368 2994     47.9     53.3
 2     Male     62.0 0.777 2994     60.5     63.5
 3     Male     72.2 0.807 2994     70.6     73.8
 1     Female   72.2 0.639 2994     71.0     73.5
 2     Female   78.8 0.943 2994     76.9     80.6
 3     Female   87.8 2.469 2994     82.9     92.6

Confidence level used: 0.95 
test(emmeans(mod1, ~ class + sex))
 class sex    emmean    SE   df t.ratio p.value
 1     Male     50.6 1.368 2994  36.998  <.0001
 2     Male     62.0 0.777 2994  79.752  <.0001
 3     Male     72.2 0.807 2994  89.449  <.0001
 1     Female   72.2 0.639 2994 112.971  <.0001
 2     Female   78.8 0.943 2994  83.480  <.0001
 3     Female   87.8 2.469 2994  35.549  <.0001
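
The pairwise comparisons mentioned in the introduction are available through pairs(). A sketch, using mod1 as fitted above:

```r
# Test equality of the class marginal means, separately within each sex
pairs(emmeans(mod1, ~ class | sex))
```

Each row of the result is a difference between two class means; by default the p-values carry a Tukey multiplicity adjustment.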

R (ggeffects)

summary(mod1 <- lm(y ~ sex*class, data = m))

Call:
lm(formula = y ~ sex * class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-72.029 -13.285   0.464  13.741  67.964 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)        50.611      1.368  36.998  < 2e-16 ***
sexFemale          21.625      1.510  14.321  < 2e-16 ***
class2             11.374      1.573   7.230 6.12e-13 ***
class3             21.619      1.588  13.610  < 2e-16 ***
sexFemale:class2   -4.852      1.943  -2.497   0.0126 *  
sexFemale:class3   -6.085      3.005  -2.025   0.0429 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared:  0.1343,    Adjusted R-squared:  0.1329 
F-statistic: 92.91 on 5 and 2994 DF,  p-value: < 2.2e-16
ggeffect(mod1, c("class", "sex"))
# Predicted values of y

sex: Male

class | Predicted |       95% CI
--------------------------------
1     |     50.61 | 47.93, 53.29
2     |     61.99 | 60.46, 63.51
3     |     72.23 | 70.65, 73.81

sex: Female

class | Predicted |       95% CI
--------------------------------
1     |     72.24 | 70.98, 73.49
2     |     78.76 | 76.91, 80.61
3     |     87.77 | 82.93, 92.61

R (marginaleffects)

summary(mod1 <- lm(y ~ sex*class, data = m))

Call:
lm(formula = y ~ sex * class, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-72.029 -13.285   0.464  13.741  67.964 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)        50.611      1.368  36.998  < 2e-16 ***
sexFemale          21.625      1.510  14.321  < 2e-16 ***
class2             11.374      1.573   7.230 6.12e-13 ***
class3             21.619      1.588  13.610  < 2e-16 ***
sexFemale:class2   -4.852      1.943  -2.497   0.0126 *  
sexFemale:class3   -6.085      3.005  -2.025   0.0429 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared:  0.1343,    Adjusted R-squared:  0.1329 
F-statistic: 92.91 on 5 and 2994 DF,  p-value: < 2.2e-16
predictions(mod1, newdata = datagrid(sex = unique, class = unique))

    sex class Estimate Std. Error     z Pr(>|z|)     S 2.5 % 97.5 %
 Male       1     50.6      1.368  37.0   <0.001 993.0  47.9   53.3
 Male       2     62.0      0.777  79.8   <0.001   Inf  60.5   63.5
 Male       3     72.2      0.807  89.4   <0.001   Inf  70.6   73.8
 Female     1     72.2      0.639 113.0   <0.001   Inf  71.0   73.5
 Female     2     78.8      0.943  83.5   <0.001   Inf  76.9   80.6
 Female     3     87.8      2.469  35.5   <0.001 917.1  82.9   92.6

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex, class 
Type:  response 

Again, we can specify specific levels in the datagrid:

predictions(mod1, newdata = datagrid(sex = "Male", class = c(1, 3)))

  sex class Estimate Std. Error    z Pr(>|z|)     S 2.5 % 97.5 %
 Male     1     50.6      1.368 37.0   <0.001 993.0  47.9   53.3
 Male     3     72.2      0.807 89.4   <0.001   Inf  70.6   73.8

Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, y, sex, class 
Type:  response 
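
As noted earlier, marginaleffects is built around observation-level predictions that are then averaged. A sketch of that averaging workflow (assuming mod1 as above; avg_predictions() is available in the package version used here):

```r
# Average the observation-level predictions within each sex-by-class cell
avg_predictions(mod1, by = c("sex", "class"))
```

For this particular model the averages match the datagrid() predictions, since the prediction is constant within each cell; with additional covariates in the model, the two approaches can differ.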

Math

Consider the simplest case, with two binary variables \(Z\) and \(K\). We fit the model

\[ E(Y|Z,K) = \beta_0 + \beta_1Z + \beta_2K + \beta_3ZK, \]

where \(ZK = 1\) only if both \(Z = 1\) and \(K = 1\).

This is functionally equivalent to defining a new variable \(L\),

\[ L = \begin{cases} 0, & K = 0\ \&\ Z = 0 \\ 1, & K = 0\ \&\ Z = 1 \\ 2, & K = 1\ \&\ Z = 0 \\ 3, & K = 1\ \&\ Z = 1, \end{cases} \]

and fitting the model

\[ E(Y|L) = \beta_0 + \beta_1l_1 + \beta_2l_2 + \beta_3l_3. \]
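
Writing out the cell means implied by this model makes the role of each coefficient explicit:

\[ \begin{aligned} E(Y|Z=0,K=0) &= \beta_0, \\ E(Y|Z=1,K=0) &= \beta_0 + \beta_1, \\ E(Y|Z=0,K=1) &= \beta_0 + \beta_2, \\ E(Y|Z=1,K=1) &= \beta_0 + \beta_1 + \beta_2 + \beta_3, \end{aligned} \]

so \(\beta_3\) is a “difference in differences”: how much the effect of \(Z\) changes when \(K\) moves from 0 to 1.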

Continuous Variables

Code

Stata

. regress y i.sex c.age c.distance

      Source |       SS           df       MS      Number of obs   =     3,000
-------------+----------------------------------   F(3, 2996)      =    143.27
       Model |  174576.269         3  58192.0896   Prob > F        =    0.0000
    Residual |  1216856.74     2,996   406.16046   R-squared       =    0.1255
-------------+----------------------------------   Adj R-squared   =    0.1246
       Total |  1391433.01     2,999  463.965657   Root MSE        =    20.153

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         sex |
     Female  |   13.98753   .7768541    18.01   0.000     12.46431    15.51075
         age |  -.5038264   .0336538   -14.97   0.000    -.5698133   -.4378394
    distance |  -.0060725   .0020291    -2.99   0.003    -.0100511   -.0020939
       _cons |   83.13802   1.327228    62.64   0.000     80.53565    85.74039
------------------------------------------------------------------------------

. margins, at(age = (30 40))

Predictive margins                                       Number of obs = 3,000
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   74.67056   .4941029   151.12   0.000     73.70175    75.63938
          2  |    69.6323   .3680117   189.21   0.000     68.91072    70.35388
------------------------------------------------------------------------------

. margins, at(age = (30 40) distance = (10 100))

Predictive margins                                       Number of obs = 3,000
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: age      =  30
       distance =  10
2._at: age      =  30
       distance = 100
3._at: age      =  40
       distance =  10
4._at: age      =  40
       distance = 100

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         _at |
          1  |    74.9656   .5034959   148.89   0.000     73.97837    75.95283
          2  |   74.41907   .5014946   148.39   0.000     73.43576    75.40238
          3  |   69.92733   .3809974   183.54   0.000     69.18029    70.67438
          4  |   69.38081   .3774763   183.80   0.000     68.64067    70.12095
------------------------------------------------------------------------------

. margins sex, at(age = (30 40))

Predictive margins                                       Number of obs = 3,000
Model VCE: OLS

Expression: Linear prediction, predict()
1._at: age = 30
2._at: age = 40

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     _at#sex |
     1#Male  |   67.66747   .5597763   120.88   0.000     66.56989    68.76506
   1#Female  |     81.655   .6902601   118.30   0.000     80.30157    83.00843
     2#Male  |   62.62921   .5370234   116.62   0.000     61.57624    63.68218
   2#Female  |   76.61674   .5331296   143.71   0.000      75.5714    77.66208
------------------------------------------------------------------------------

R (emmeans)

Note that any variable specified in the at option must also appear in the formula. Any variable placed in the formula but not in at will be examined at each of its unique levels (so don’t put a continuous variable there!).

summary(mod1 <- lm(y ~ sex + age + distance, data = m))

Call:
lm(formula = y ~ sex + age + distance, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-72.356 -13.792  -0.275  13.624  74.950 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 83.138023   1.327228  62.640  < 2e-16 ***
sexFemale   13.987531   0.776854  18.005  < 2e-16 ***
age         -0.503826   0.033654 -14.971  < 2e-16 ***
distance    -0.006073   0.002029  -2.993  0.00279 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.15 on 2996 degrees of freedom
Multiple R-squared:  0.1255,    Adjusted R-squared:  0.1246 
F-statistic: 143.3 on 3 and 2996 DF,  p-value: < 2.2e-16
emmeans(mod1, ~ age, at = list(age = c(30, 40)))
 age emmean    SE   df lower.CL upper.CL
  30   74.7 0.494 2996     73.7     75.6
  40   69.6 0.368 2996     68.9     70.3

Results are averaged over the levels of: sex 
Confidence level used: 0.95 
test(emmeans(mod1, ~ age, at = list(age = c(30, 40))))
 age emmean    SE   df t.ratio p.value
  30   74.7 0.494 2996 151.138  <.0001
  40   69.6 0.368 2996 189.185  <.0001

Results are averaged over the levels of: sex 
emmeans(mod1, ~ age + distance, at = list(age = c(30, 40), distance = c(10,100)))
 age distance emmean    SE   df lower.CL upper.CL
  30       10   75.0 0.503 2996     74.0     75.9
  40       10   69.9 0.381 2996     69.2     70.7
  30      100   74.4 0.501 2996     73.4     75.4
  40      100   69.4 0.377 2996     68.6     70.1

Results are averaged over the levels of: sex 
Confidence level used: 0.95 
emmeans(mod1, ~ age + sex, at = list(age = c(30, 40)))
 age sex    emmean    SE   df lower.CL upper.CL
  30 Male     67.7 0.560 2996     66.6     68.8
  40 Male     62.6 0.537 2996     61.6     63.7
  30 Female   81.7 0.690 2996     80.3     83.0
  40 Female   76.6 0.533 2996     75.6     77.7

Confidence level used: 0.95 

R (ggeffects)

Note the particular syntax for specifying values: a quoted string containing the name of the variable, followed by a comma-separated list of values in square brackets (a single value would be just [4]). Whitespace is ignored.

summary(mod1 <- lm(y ~ sex + age + distance, data = m))

Call:
lm(formula = y ~ sex + age + distance, data = m)

Residuals:
    Min      1Q  Median      3Q     Max 
-72.356 -13.792  -0.275  13.624  74.950 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 83.138023   1.327228  62.640  < 2e-16 ***
sexFemale   13.987531   0.776854  18.005  < 2e-16 ***
age         -0.503826   0.033654 -14.971  < 2e-16 ***
distance    -0.006073   0.002029  -2.993  0.00279 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20.15 on 2996 degrees of freedom
Multiple R-squared:  0.1255,    Adjusted R-squared:  0.1246 
F-statistic: 143.3 on 3 and 2996 DF,  p-value: < 2.2e-16
ggeffect(mod1, "age [30, 40]")
# Predicted values of y

age | Predicted |       95% CI
------------------------------
 30 |     74.67 | 73.70, 75.64
 40 |     69.63 | 68.91, 70.35
ggeffect(mod1, c("age [30, 40]", "distance [10, 100]"))
# Predicted values of y

dist