Statistics Notes
Marginal Effects: A long-form document on using marginal effects (marginal means and marginal slopes) to improve interpretation of regression coefficients, especially in the presence of an interaction. Presents as a Rosetta stone for Stata’s margins
command and R. Software: Stata and R.
Benjamini–Hochberg Procedure: An interactive calculator to apply the Benjamini–Hochberg multiple comparison correction for a series of p-values.
Interpreting log transformations in regression: Interpreting coefficients from a linear regression model (or linear mixed model) when the predictor and/or the outcome have been log transformed. Software: R, though concepts could transfer to other software.
Random intercepts in SPSS’s Mixed Model: There are two ways of specifying random intercepts in SPSS’s Mixed Model; this discusses their equivalence. Software: SPSS.
Moderation and Mediation via Regression: Although moderation and mediation typically arise in SEM/path analysis frameworks, moderation can be addressed in regression, and mediation can be conceptualized through regression.
IRR and ICC: Some notes on IRR versus ICC, as well as how to obtain the ICC. Software: R, though concepts could transfer to other software.
The issue of collinear predictors: A visualization to see the potential negative effect of including highly collinear variables in a model. Software: R, though concepts could transfer to other software.
Stata, xt
versus mixed model: Econometricians usually use the xt
framework to address repeated measures. This document shows how to fit the equivalent of fixed effects regression, between effects regression, and random effects regression using linear regression (regress
) and linear mixed models (mixed
). Software: Stata.
Nested versus crossed random effects: Multiple random effects in a mixed model are typically defined as either “nested” or “crossed”. This document shows that this is a false dichotomy (nested random effects aren’t real!), as well as showing some nice visualization. Software: R, though concepts could transfer to other software.
Linear splines versus interactions: Models such as interrupted time series or diff-in-diff are often considered special analyses. This document attempts to show that these are just regression models with particular interactions. Software: Stata, though concepts could transfer to other software.
Selecting a random subset of the data, potentially within subgroups: An easy way to generate a random sample of your data of arbitrary size, including a stratified approach. Software: Stata, though concepts could transfer to other software.
Mediation with Survey data: To fit a mediation model in Stata using complex survey data requires using gsem
. Unfortunately, the gsem
command does not support directly estimating direct, indirect and total effects. This documents how to compute them. Software: Stata.
The “Divide by 4” rule: An additional tool in interpreting logistic regression coefficients is the “Divide by 4” rule. Software: Stata, though concepts could transfer to other software.
Response Surface Model Plotting: An example of visualization after running a “response surface model”, aka just regression with an interaction. Software: R.
Modifying built-in-Stata commands: A fun story about hacking one of Stata’s shipped-with commands to make some code work. A story rather than a guide or instructions. Software: Stata.