simdat {svycdiff} | R Documentation |
Function to simulate data based on specified relationships between the generated (continuous) outcome, variable of interest, confounder, and selection mechanism.
simdat(
N,
X_dist = "continuous",
S_known = FALSE,
tau_0 = 0,
tau_X = 1,
beta_0 = 0,
beta_A = 1,
beta_X = 1,
hetero = TRUE,
alpha_0 = 0,
alpha_X = 1,
alpha_A = 1,
alpha_AX = 0.1
)
N |
int - Number of observations to be generated |
X_dist |
string - Distribution of the confounding variable, X. Either "continuous" (the default) for an N(1, 1) variable or "binary" for a Bernoulli(0.5) variable |
S_known |
boolean - Logical for whether the selection mechanism should be treated as known (deterministic) or needs to be estimated (simulated with Gaussian error; defaults to FALSE) |
tau_0 |
double - Intercept for propensity model (defaults to 0) |
tau_X |
double - Coefficient for X in propensity model (defaults to 1) |
beta_0 |
double - Intercept for selection model (defaults to 0) |
beta_A |
double - Coefficient for A in selection model (defaults to 1) |
beta_X |
double - Coefficient for X in selection model (defaults to 1) |
hetero |
boolean - Logical for heterogeneous treatment effect in the outcome model (defaults to TRUE) |
alpha_0 |
double - Intercept for outcome model (defaults to 0) |
alpha_X |
double - Coefficient for X in outcome model (defaults to 1) |
alpha_A |
double - Coefficient for A in outcome model (defaults to 1) |
alpha_AX |
double - Coefficient for interaction between A and X in outcome model (only used if hetero = TRUE; defaults to 0.1) |
The data are generated as follows. For a user-given number, N, of observations in our so-called super population, we first generate a confounding variable, X, which relates to our outcome, Y, our variable of interest, A, and our selection indicator, S. We generate population-level data with X ~ N(1,1) or X ~ Bern(0.5), depending on whether the distribution of X is chosen to be X_dist = "continuous" or X_dist = "binary", respectively.
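As a rough illustration of this first step, the draw for X could be sketched in base R as follows. The helper name draw_X is hypothetical and is not part of svycdiff; only the two distributions are taken from the description above.

```r
# Hypothetical helper (not part of svycdiff): draw the confounder X
# from N(1, 1) or Bernoulli(0.5), matching the two X_dist options.
draw_X <- function(N, X_dist = c("continuous", "binary")) {
  X_dist <- match.arg(X_dist)
  if (X_dist == "continuous") {
    rnorm(N, mean = 1, sd = 1)        # X ~ N(1, 1)
  } else {
    rbinom(N, size = 1, prob = 0.5)   # X ~ Bern(0.5)
  }
}

set.seed(1)
X_cont <- draw_X(1000, "continuous")
X_bin  <- draw_X(1000, "binary")
```

Using match.arg() means a misspelled option such as "continous" raises an error instead of silently falling through to the other branch.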
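We then generate the remaining data from three models: a propensity model for A (parameters tau_0 and tau_X), a selection model for S (parameters beta_0, beta_A, and beta_X), and an outcome model for Y (parameters alpha_0, alpha_X, alpha_A, and alpha_AX).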
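The model equations themselves are not reproduced on this page, so the following is only a sketch of what a three-model data-generating process with these parameters might look like. It assumes logistic links for the propensity and selection models and a linear outcome model with standard normal noise. Those functional forms, the function name sketch_simdat, and the output column names are assumptions made for illustration, not the package's implementation. The S_known option is not modelled here.

```r
# Illustrative sketch only; NOT the svycdiff implementation.
# Assumed forms: logistic propensity and selection models, linear outcome.
sketch_simdat <- function(N,
                          tau_0 = 0, tau_X = 1,
                          beta_0 = 0, beta_A = 1, beta_X = 1,
                          hetero = TRUE,
                          alpha_0 = 0, alpha_X = 1, alpha_A = 1,
                          alpha_AX = 0.1) {
  # Confounder (continuous case only, for brevity)
  X <- rnorm(N, mean = 1, sd = 1)

  # Propensity model (assumed logistic): P(A = 1 | X)
  p_A <- plogis(tau_0 + tau_X * X)
  A <- rbinom(N, size = 1, prob = p_A)

  # Selection model (assumed logistic): P(S = 1 | A, X)
  p_S <- plogis(beta_0 + beta_A * A + beta_X * X)
  S <- rbinom(N, size = 1, prob = p_S)

  # Outcome model (assumed linear with N(0, 1) noise);
  # the A-by-X interaction enters only when hetero = TRUE
  gamma_AX <- if (hetero) alpha_AX else 0
  Y <- alpha_0 + alpha_X * X + alpha_A * A + gamma_AX * A * X + rnorm(N)

  data.frame(Y = Y, A = A, X = X, S = S, p_A = p_A, p_S = p_S)
}

set.seed(42)
sim <- sketch_simdat(1000)
head(sim)
```

Under these assumptions, setting hetero = FALSE removes the interaction term, so the effect of A on Y no longer varies with X.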
A data.frame with N observations of 8 variables:
Observed outcome (continuous)
Comparison group variable of interest (binary)
Confounding variable (continuous or binary)
True probability of A = 1 conditional on X (continuous)
True probability of selection (S = 1) conditional on A and X (continuous)
True probability of selection (S = 1) conditional on A = 1 and X (continuous)
True probability of selection (S = 1) conditional on A = 0 and X (continuous)
True controlled difference in outcomes by comparison group (double)
N <- 100000
dat <- simdat(N)
head(dat)