The old advice of 5-10 imputations is sufficient for point estimates (e.g. an estimated coefficient), but not for estimates of standard errors (and consequently not for hypothesis tests or confidence intervals, which depend on them).

Usage

how_many_imputations(model, cv = 0.05, alpha = 0.05)

Arguments

model

Either a mira object (created by running a model on a data set which was imputed using mice::mice()), a mipo object (created by running pool() on a mira object), or any object which can be converted to mira via as.mira().

cv

Desired precision of the standard errors, expressed as a coefficient of variation. Defaults to 0.05. If the data were re-imputed, the estimated standard errors would typically differ by no more than about this proportion.

alpha

Significance level used to choose a "conservative" FMI. Defaults to 0.05.

Value

The number of imputations required to obtain the cv level of precision.
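
As a rough illustration of where such a number can come from, the quadratic rule described in von Hippel (2020) can be sketched in base R. The helper name quadratic_rule and its fmi and m_pilot arguments are hypothetical and not part of this package. The sketch also assumes that the "conservative" FMI is the upper confidence limit of the pilot FMI, computed on the logit scale; the function's actual internals may differ.

```r
# Hypothetical helper (not part of the package): a sketch of the quadratic
# rule from von Hippel (2020), under the assumptions stated above.
#   fmi     - FMI estimated from the pilot imputations (between 0 and 1)
#   m_pilot - number of imputations in the pilot batch
quadratic_rule <- function(fmi, m_pilot, cv = 0.05, alpha = 0.05) {
  z <- stats::qnorm(1 - alpha / 2)
  # Assumed: logit(FMI) is roughly normal with standard error sqrt(2 / m_pilot),
  # so the upper confidence limit serves as the "conservative" FMI.
  fmi_upper <- stats::plogis(stats::qlogis(fmi) + z * sqrt(2 / m_pilot))
  # Quadratic rule: the required number grows with (FMI / cv)^2.
  ceiling(1 + 0.5 * (fmi_upper / cv)^2)
}

quadratic_rule(fmi = 0.3, m_pilot = 20)
quadratic_rule(fmi = 0.3, m_pilot = 20, cv = 0.01)
```

Because cv enters through a squared ratio, tightening it from 0.05 to 0.01 multiplies the quadratic term by 25, which is why small values of cv call for very many imputations.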

Details

von Hippel (2020) provides a way to calculate the number of imputations needed to obtain consistent estimates of the standard error. Doing so requires an estimate of the fraction of missing information (FMI), which can only be obtained after running some number of imputations. Therefore, von Hippel (2020) recommends the following procedure:

  1. Carry out a limited number of imputations to enable estimation of the FMI. von Hippel (2020) recommends 20 imputations.

  2. Use this function, how_many_imputations(), to calculate how many total imputations you will need.

  3. If the number of total imputations you will need is larger than your initial batch of 20, run additional imputations.
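
The three steps above can be sketched with mice as follows. This is a minimal sketch, not a prescribed workflow: it assumes the function is loaded from a package named howManyImputations, uses the built-in airquality data (which already contains missing values), and tops up the pilot batch with mice::ibind(), which requires both batches to be imputed with identical settings apart from the seed and the number of imputations.

```r
library(mice)
library(howManyImputations)  # assumed name of the package providing how_many_imputations()

# Step 1: pilot batch of 20 imputations.
pilot <- mice(airquality, m = 20, method = "pmm", seed = 500, printFlag = FALSE)
pilot_fit <- with(pilot, lm(Temp ~ Ozone + Solar.R + Wind))

# Step 2: estimate the total number of imputations needed.
m_needed <- how_many_imputations(pilot_fit)

# Step 3: run additional imputations only if the pilot batch is too small.
if (m_needed > pilot$m) {
  # Use a different seed so the extra imputations are not copies of the pilot.
  extra <- mice(airquality, m = m_needed - pilot$m, method = "pmm",
                seed = 501, printFlag = FALSE)
  all_imps <- ibind(pilot, extra)
} else {
  all_imps <- pilot
}

# Refit on the full set of imputations and pool the results.
final_fit <- with(all_imps, lm(Temp ~ Ozone + Solar.R + Wind))
summary(pool(final_fit))
```

An alternative is to discard the pilot batch and rerun mice() once with m set to the required total; reusing the pilot imputations simply avoids repeating that work.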

References

von Hippel, Paul T. "How Many Imputations Do You Need? A Two-stage Calculation Using a Quadratic Rule." Sociological Methods & Research 49.3 (2020): 699-718.

Examples

data(airquality)
# Add some missingness to Wind (column 3) and Temp (column 4)
airquality[4:10, 3] <- NA
airquality[1:5, 4] <- NA
# Drop the Month and Day columns
airquality <- airquality[-c(5, 6)]
impdata1 <- mice::mice(airquality, m = 5, maxit = 10, method = 'pmm', seed = 500)
#> 
#>  iter imp variable
#>   1   1  Ozone  Solar.R  Wind  Temp
#>   1   2  Ozone  Solar.R  Wind  Temp
#>   1   3  Ozone  Solar.R  Wind  Temp
#>   1   4  Ozone  Solar.R  Wind  Temp
#>   1   5  Ozone  Solar.R  Wind  Temp
#>   2   1  Ozone  Solar.R  Wind  Temp
#>   2   2  Ozone  Solar.R  Wind  Temp
#>   2   3  Ozone  Solar.R  Wind  Temp
#>   2   4  Ozone  Solar.R  Wind  Temp
#>   2   5  Ozone  Solar.R  Wind  Temp
#>   3   1  Ozone  Solar.R  Wind  Temp
#>   3   2  Ozone  Solar.R  Wind  Temp
#>   3   3  Ozone  Solar.R  Wind  Temp
#>   3   4  Ozone  Solar.R  Wind  Temp
#>   3   5  Ozone  Solar.R  Wind  Temp
#>   4   1  Ozone  Solar.R  Wind  Temp
#>   4   2  Ozone  Solar.R  Wind  Temp
#>   4   3  Ozone  Solar.R  Wind  Temp
#>   4   4  Ozone  Solar.R  Wind  Temp
#>   4   5  Ozone  Solar.R  Wind  Temp
#>   5   1  Ozone  Solar.R  Wind  Temp
#>   5   2  Ozone  Solar.R  Wind  Temp
#>   5   3  Ozone  Solar.R  Wind  Temp
#>   5   4  Ozone  Solar.R  Wind  Temp
#>   5   5  Ozone  Solar.R  Wind  Temp
#>   6   1  Ozone  Solar.R  Wind  Temp
#>   6   2  Ozone  Solar.R  Wind  Temp
#>   6   3  Ozone  Solar.R  Wind  Temp
#>   6   4  Ozone  Solar.R  Wind  Temp
#>   6   5  Ozone  Solar.R  Wind  Temp
#>   7   1  Ozone  Solar.R  Wind  Temp
#>   7   2  Ozone  Solar.R  Wind  Temp
#>   7   3  Ozone  Solar.R  Wind  Temp
#>   7   4  Ozone  Solar.R  Wind  Temp
#>   7   5  Ozone  Solar.R  Wind  Temp
#>   8   1  Ozone  Solar.R  Wind  Temp
#>   8   2  Ozone  Solar.R  Wind  Temp
#>   8   3  Ozone  Solar.R  Wind  Temp
#>   8   4  Ozone  Solar.R  Wind  Temp
#>   8   5  Ozone  Solar.R  Wind  Temp
#>   9   1  Ozone  Solar.R  Wind  Temp
#>   9   2  Ozone  Solar.R  Wind  Temp
#>   9   3  Ozone  Solar.R  Wind  Temp
#>   9   4  Ozone  Solar.R  Wind  Temp
#>   9   5  Ozone  Solar.R  Wind  Temp
#>   10   1  Ozone  Solar.R  Wind  Temp
#>   10   2  Ozone  Solar.R  Wind  Temp
#>   10   3  Ozone  Solar.R  Wind  Temp
#>   10   4  Ozone  Solar.R  Wind  Temp
#>   10   5  Ozone  Solar.R  Wind  Temp
modelFit1 <- with(impdata1, lm(Temp ~ Ozone + Solar.R + Wind))
how_many_imputations(modelFit1)
#> [1] 57
how_many_imputations(modelFit1, cv = .01)
#> [1] 1394

# Using a non-`mice` library.
library(jomo)
library(mitools) # for the `imputationList` function
jomodata <- jomo::jomo1(airquality, nburn = 100, nbetween = 100, nimp = 5)
#> Found  4 continuous outcomes and no categorical. Using function jomo1con. 
#> ..........First imputation registered. 
#> ..........Imputation number  2 registered 
#> ..........Imputation number  3 registered 
#> ..........Imputation number  4 registered 
#> ..........Imputation number  5 registered 
#> The posterior mean of the fixed effects estimates is:
#>                 X1
#> Ozone    42.093364
#> Solar.R 185.364121
#> Wind      9.866024
#> Temp     78.221961
#> 
#> The posterior covariance matrix is:
#>              Ozone      Solar.R        Wind      Temp
#> Ozone   1099.42722  973.7760494 -65.5499207 211.45136
#> Solar.R  973.77605 8439.0112912   0.3907474 246.44545
#> Wind     -65.54992    0.3907474  12.2804579 -14.33819
#> Temp     211.45136  246.4454500 -14.3381912  88.44187
impdata2 <- mitools::imputationList(split(jomodata, jomodata$Imputation))
modelfit2 <- with(impdata2, lm(Temp ~ Ozone + Solar.R + Wind))
how_many_imputations(modelfit2)
#> [1] 43