Implements two-stage "how_many_imputations" from von Hippel (2020)
Source: R/how_many_imputations.R
The old advice of 5-10 imputations is sufficient for a point estimate (e.g. an estimated coefficient), but not for estimates of standard errors (and consequently, hypothesis tests or confidence intervals).
Usage
how_many_imputations(model, cv = 0.05, alpha = 0.05)
howManyImputations(model, cv = 0.05, alpha = 0.05)
Arguments
- model
Either a mira object (created by running a model on a data set which was imputed using mice::mice()), a mipo object (created by running pool() on a mira object), or any object which can be converted to mira via as.mira().
- cv
Desired precision of standard errors. Defaults to 0.05. If the data were re-imputed, the estimated standard errors would differ by no more than this amount.
- alpha
Significance level for choice of "conservative" FMI. Defaults to 0.05.
Details
von Hippel (2020) provides a way to calculate the number of imputations needed to have consistent estimates of the standard error. To do so requires an estimate of the Fraction of Missing Information (FMI) which can only be obtained after running some number of imputations. Therefore, von Hippel (2020) recommends the following procedure:
1. Carry out a limited number of imputations to enable estimation of the FMI. von Hippel (2020) recommends 20 imputations.
2. Use this function, how_many_imputations(), to calculate how many total imputations you will need.
3. If the number of total imputations you will need is larger than your initial batch of 20, run additional imputations.
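The calculation in the second step can be sketched in base R. This is a minimal illustration of the quadratic rule named in the paper's title, M = 1 + (1/2)(FMI/CV)^2, and not the package's own code. The helper name quadratic_rule() and its arguments fmi and m_pilot are hypothetical. The "conservative" FMI is assumed here to be the upper limit of a two-sided interval built on the logit scale with standard error sqrt(2 / m_pilot).

```r
# Hypothetical helper illustrating the two-stage quadratic rule.
# fmi:     point estimate of the fraction of missing information (0 < fmi < 1)
# m_pilot: number of imputations in the pilot batch used to estimate fmi
# cv:      desired coefficient of variation of the standard error estimate
# alpha:   significance level for the conservative (upper) FMI limit
quadratic_rule <- function(fmi, m_pilot, cv = 0.05, alpha = 0.05) {
  stopifnot(fmi > 0, fmi < 1, m_pilot >= 2, cv > 0, alpha > 0, alpha < 1)

  # Assumption: two-sided interval, so the critical value uses alpha / 2.
  z <- stats::qnorm(1 - alpha / 2)

  # Upper limit for the FMI, computed on the logit scale and transformed back.
  fmi_upper <- stats::plogis(stats::qlogis(fmi) + z * sqrt(2 / m_pilot))

  # Quadratic rule, rounded up to a whole number of imputations.
  ceiling(1 + 0.5 * (fmi_upper / cv)^2)
}

# Pilot batch of 20 imputations with an estimated FMI of 0.3.
quadratic_rule(fmi = 0.3, m_pilot = 20)

# Halving cv roughly quadruples the requirement, because the rule is quadratic.
quadratic_rule(fmi = 0.3, m_pilot = 20, cv = 0.025)
```

Because the conservative FMI depends on the size of the pilot batch, a larger pilot narrows the interval and can lower the recommended total.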
References
von Hippel, Paul T. "How Many Imputations Do You Need? A Two-stage Calculation Using a Quadratic Rule." Sociological Methods & Research 49.3 (2020): 699-718.
Examples
data(airquality)
# Add some missingness
airquality[4:10, 3] <- rep(NA, 7)
airquality[1:5, 4] <- NA
airquality <- airquality[-c(5, 6)]
impdata1 <- mice::mice(airquality, m = 5, maxit = 10,
method = 'pmm', seed = 500)
#>
#> iter imp variable
#> 1 1 Ozone Solar.R Wind Temp
#> 1 2 Ozone Solar.R Wind Temp
#> 1 3 Ozone Solar.R Wind Temp
#> 1 4 Ozone Solar.R Wind Temp
#> 1 5 Ozone Solar.R Wind Temp
#> 2 1 Ozone Solar.R Wind Temp
#> 2 2 Ozone Solar.R Wind Temp
#> 2 3 Ozone Solar.R Wind Temp
#> 2 4 Ozone Solar.R Wind Temp
#> 2 5 Ozone Solar.R Wind Temp
#> 3 1 Ozone Solar.R Wind Temp
#> 3 2 Ozone Solar.R Wind Temp
#> 3 3 Ozone Solar.R Wind Temp
#> 3 4 Ozone Solar.R Wind Temp
#> 3 5 Ozone Solar.R Wind Temp
#> 4 1 Ozone Solar.R Wind Temp
#> 4 2 Ozone Solar.R Wind Temp
#> 4 3 Ozone Solar.R Wind Temp
#> 4 4 Ozone Solar.R Wind Temp
#> 4 5 Ozone Solar.R Wind Temp
#> 5 1 Ozone Solar.R Wind Temp
#> 5 2 Ozone Solar.R Wind Temp
#> 5 3 Ozone Solar.R Wind Temp
#> 5 4 Ozone Solar.R Wind Temp
#> 5 5 Ozone Solar.R Wind Temp
#> 6 1 Ozone Solar.R Wind Temp
#> 6 2 Ozone Solar.R Wind Temp
#> 6 3 Ozone Solar.R Wind Temp
#> 6 4 Ozone Solar.R Wind Temp
#> 6 5 Ozone Solar.R Wind Temp
#> 7 1 Ozone Solar.R Wind Temp
#> 7 2 Ozone Solar.R Wind Temp
#> 7 3 Ozone Solar.R Wind Temp
#> 7 4 Ozone Solar.R Wind Temp
#> 7 5 Ozone Solar.R Wind Temp
#> 8 1 Ozone Solar.R Wind Temp
#> 8 2 Ozone Solar.R Wind Temp
#> 8 3 Ozone Solar.R Wind Temp
#> 8 4 Ozone Solar.R Wind Temp
#> 8 5 Ozone Solar.R Wind Temp
#> 9 1 Ozone Solar.R Wind Temp
#> 9 2 Ozone Solar.R Wind Temp
#> 9 3 Ozone Solar.R Wind Temp
#> 9 4 Ozone Solar.R Wind Temp
#> 9 5 Ozone Solar.R Wind Temp
#> 10 1 Ozone Solar.R Wind Temp
#> 10 2 Ozone Solar.R Wind Temp
#> 10 3 Ozone Solar.R Wind Temp
#> 10 4 Ozone Solar.R Wind Temp
#> 10 5 Ozone Solar.R Wind Temp
modelFit1 <- with(impdata1, lm(Temp ~ Ozone + Solar.R + Wind))
how_many_imputations(modelFit1)
#> [1] 57
how_many_imputations(modelFit1, cv = .01)
#> [1] 1394
# Using non-`mice` libraries.
library(jomo)
library(mitools) # for the `imputationList` function
jomodata <- jomo::jomo1(airquality, nburn = 100, nbetween = 100, nimp = 5)
#> Found 4 continuous outcomes and no categorical. Using function jomo1con.
#> ..........First imputation registered.
#> ..........Imputation number 2 registered
#> ..........Imputation number 3 registered
#> ..........Imputation number 4 registered
#> ..........Imputation number 5 registered
#> The posterior mean of the fixed effects estimates is:
#> X1
#> Ozone 41.993606
#> Solar.R 185.024961
#> Wind 9.911321
#> Temp 78.190098
#>
#> The posterior covariance matrix is:
#> Ozone Solar.R Wind Temp
#> Ozone 1094.22277 1005.712450 -65.909864 211.69824
#> Solar.R 1005.71245 8446.240353 -1.810144 256.93195
#> Wind -65.90986 -1.810144 12.341253 -14.50923
#> Temp 211.69824 256.931946 -14.509226 89.29333
impdata2 <- mitools::imputationList(split(jomodata, jomodata$Imputation))
modelfit2 <- with(impdata2, lm(Temp ~ Ozone + Solar.R + Wind))
how_many_imputations(modelfit2)
#> [1] 45
library(Amelia)
#> Loading required package: Rcpp
#> ##
#> ## Amelia II: Multiple Imputation
#> ## (Version 1.8.1, built: 2022-11-18)
#> ## Copyright (C) 2005-2024 James Honaker, Gary King and Matthew Blackwell
#> ## Refer to http://gking.harvard.edu/amelia/ for more information
#> ##
data(freetrade)
a.out <- amelia(freetrade, m = 20, ts = "year", cs = "country")
#> -- Imputation 1 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#>
#> -- Imputation 2 --
#>
#> 1 2 3 4 5 6 7 8 9 10
#>
#> -- Imputation 3 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12
#>
#> -- Imputation 4 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#>
#> -- Imputation 5 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13
#>
#> -- Imputation 6 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
#>
#> -- Imputation 7 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
#>
#> -- Imputation 8 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
#>
#> -- Imputation 9 --
#>
#> 1 2 3 4 5 6 7 8 9 10
#>
#> -- Imputation 10 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#>
#> -- Imputation 11 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13
#>
#> -- Imputation 12 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#>
#> -- Imputation 13 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
#>
#> -- Imputation 14 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11
#>
#> -- Imputation 15 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#>
#> -- Imputation 16 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
#>
#> -- Imputation 17 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#>
#> -- Imputation 18 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> -- Imputation 19 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
#>
#> -- Imputation 20 --
#>
#> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
#>
modelFit3 <- with(imputationList(a.out$imputations),
lm(tariff ~ polity + pop + gdp.pc + year + country))
how_many_imputations(modelFit3)
#> [1] 112
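The examples above stop at the recommended number of imputations. The sketch below follows the procedure in Details through to the final stage, using a pilot batch of 20 imputations as von Hippel (2020) recommends. It assumes that how_many_imputations() is provided by a package named howManyImputations and that mice is installed. Re-running mice::mice() with the larger m is one simple way to obtain the additional imputations and is a choice made for this illustration. The object names pilot, pilot_fit, m_needed, final and final_fit are illustrative.

```r
library(mice)
library(howManyImputations)  # assumed package name for how_many_imputations()

data(airquality)
# Add some missingness, as in the examples above
airquality[4:10, 3] <- NA
airquality[1:5, 4] <- NA
airquality <- airquality[-c(5, 6)]

# Stage 1: pilot batch of 20 imputations to estimate the FMI
pilot <- mice(airquality, m = 20, maxit = 10, method = "pmm",
              seed = 500, printFlag = FALSE)
pilot_fit <- with(pilot, lm(Temp ~ Ozone + Solar.R + Wind))
m_needed <- how_many_imputations(pilot_fit)

# Stage 2: re-impute only if the pilot batch is too small
if (m_needed > pilot$m) {
  final <- mice(airquality, m = m_needed, maxit = 10, method = "pmm",
                seed = 500, printFlag = FALSE)
} else {
  final <- pilot
}

# Fit and pool the analysis model on the final set of imputations
final_fit <- with(final, lm(Temp ~ Ozone + Solar.R + Wind))
summary(pool(final_fit))
```

Setting printFlag = FALSE suppresses the iteration log that mice::mice() prints by default.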