extrapolate_coverage {simhelpers} | R Documentation |
Extrapolate coverage and width using sub-sampled bootstrap confidence intervals.
Description
Given a set of bootstrap confidence intervals calculated across sub-samples with different numbers of replications, extrapolates the coverage and width of the intervals to a specified (larger) number of bootstrap replications. The function also calculates the associated Monte Carlo standard errors. The nominal confidence level is determined by how the lower and upper bounds were calculated.
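The extrapolation idea can be sketched in base R. This is only an illustration, assuming the criterion is approximately linear in 1/B (the approach of the Boos and Zhang reference cited under References); it is not necessarily the exact computation performed by the package. The coverage values and the helper extrapolate_to() are hypothetical.

```r
# Hypothetical coverage rates at four bootstrap sizes. The values are made up
# and constructed to lie exactly on a line in 1/B.
B_vals <- c(49, 99, 149, 199)
coverage <- 0.95 - 0.5 / B_vals

# Regress coverage on 1/B
fit <- lm(coverage ~ I(1 / B_vals))

# Hypothetical helper: evaluate the fitted line at 1/B_target
extrapolate_to <- function(fit, B_target) {
  unname(coef(fit)[1] + coef(fit)[2] / B_target)
}

cov_inf <- extrapolate_to(fit, Inf)   # 1/Inf is 0, so this is the intercept
cov_999 <- extrapolate_to(fit, 999)
```

With B_target = Inf the extrapolated value is simply the intercept of the fitted line, which is why an infinite number of bootstraps is a convenient default target.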
Usage
extrapolate_coverage(
data,
CI_subsamples,
true_param,
B_target = Inf,
criteria = c("coverage", "width"),
winz = Inf,
nested = FALSE,
format = "wide",
width_trim = 0,
cover_na_val = NA,
width_na_val = NA
)
Arguments
data |
data frame or tibble containing the simulation results. |
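CI_subsamples |
list or name of column from data containing the bootstrap confidence intervals calculated across sub-samples with different numbers of replications. |
true_param |
vector or name of column from data containing the true parameter value(s). |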
B_target |
number of bootstrap replications to which the criteria should
be extrapolated, with a default of Inf. |
criteria |
character or character vector indicating the performance
criteria to be calculated, with possible options "coverage" and "width". |
winz |
numeric value for winsorization constant. If set to a finite
value, estimates will be winsorized at the constant multiple of the
inter-quartile range below the 25th percentile or above the 75th percentile
of the distribution. For instance, setting |
nested |
logical value controlling the format of the output. If
|
format |
character string controlling the format of the output when
|
width_trim |
numeric value specifying the trimming percentage to use when summarizing CI widths across replications from a single set of bootstraps, with a default of 0.0 (i.e., use the regular arithmetic mean). |
cover_na_val |
numeric value to use for calculating coverage if bootstrap CI end-points are missing. Default is NA. |
width_na_val |
numeric value to use for calculating width if bootstrap CI end-points are missing. Default is NA. |
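The winz and width_trim arguments describe two ways of limiting the influence of extreme values. A minimal base R sketch of both ideas follows. The helper winsorize_iqr() is hypothetical and not part of simhelpers, the data are made up, and the sketch assumes that width_trim behaves like the trim argument of mean(), i.e. as a fraction trimmed from each tail.

```r
# Hypothetical helper (not part of simhelpers): clamp values lying more than
# winz * IQR below the 25th percentile or above the 75th percentile.
winsorize_iqr <- function(x, winz) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  pmin(pmax(x, q[1] - winz * iqr), q[2] + winz * iqr)
}

x <- c(1:9, 100)                       # illustrative values with one outlier
x_winz <- winsorize_iqr(x, winz = 3)   # only the outlier is pulled in

# Trimmed mean: drop the most extreme 20% of values in each tail before
# averaging. A trim of 0 gives the ordinary arithmetic mean.
widths <- c(1, 2, 3, 4, 100)           # illustrative CI widths
trimmed_width <- mean(widths, trim = 0.2)
```

Winsorizing keeps every observation but moves outliers to the fence, whereas trimming discards the extremes before averaging.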
Value
A tibble containing the number of simulation iterations, the performance criteria estimate(s), and the associated Monte Carlo standard errors (MCSE).
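For orientation, coverage, width, and their Monte Carlo standard errors can be computed by hand for a single set of intervals, as in the sketch below. The end-points are made-up values, and the MCSE formulas are the usual ones for a proportion and a mean; the standard errors the function reports for extrapolated criteria may be computed differently.

```r
# Illustrative CI end-points from four hypothetical simulation iterations
lower <- c(1.0, 1.5, 2.1, 0.5)
upper <- c(3.0, 2.5, 3.0, 1.9)
true_param <- 2

# An interval covers when the true parameter lies between its end-points
covered <- lower <= true_param & true_param <= upper
K <- length(covered)

coverage <- mean(covered)
coverage_mcse <- sqrt(coverage * (1 - coverage) / K)

widths <- upper - lower
width <- mean(widths)
width_mcse <- sd(widths) / sqrt(K)
```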
References
Boos DD, Zhang J (2000). “Monte Carlo evaluation of resampling-based hypothesis tests.” Journal of the American Statistical Association, 95(450), 486–492. doi:10.1080/01621459.2000.10474226.
Examples
dgp <- function(N, mu, nu) {
  mu + rt(N, df = nu)
}

estimator <- function(
  dat,
  B_vals = c(49, 59, 89, 99),
  m = 4,
  trim = 0.1
) {
  # compute estimate and standard error
  N <- length(dat)
  est <- mean(dat, trim = trim)
  se <- sd(dat) / sqrt(N)

  # compute booties
  booties <- replicate(max(B_vals), {
    x <- sample(dat, size = N, replace = TRUE)
    data.frame(
      M = mean(x, trim = trim),
      SE = sd(x) / sqrt(N)
    )
  }, simplify = FALSE) |>
    dplyr::bind_rows()

  # confidence intervals for each B_vals
  CIs <- bootstrap_CIs(
    boot_est = booties$M,
    boot_se = booties$SE,
    est = est,
    se = se,
    CI_type = c("normal", "basic", "student", "percentile"),
    B_vals = B_vals,
    reps = m,
    format = "wide-list"
  )

  res <- data.frame(
    est = est,
    se = se
  )
  res$CIs <- CIs
  res
}

# build a simulation driver function
simulate_bootCIs <- bundle_sim(
  f_generate = dgp,
  f_analyze = estimator
)

boot_results <- simulate_bootCIs(
  reps = 50, N = 20, mu = 2, nu = 3,
  B_vals = seq(49, 199, 50)
)

extrapolate_coverage(
  data = boot_results,
  CI_subsamples = CIs,
  true_param = 2
)

extrapolate_coverage(
  data = boot_results,
  CI_subsamples = CIs,
  true_param = 2,
  B_target = 999,
  format = "long"
)