DIDparams {cdid} | R Documentation |
DIDparams
Description
Creates a DIDparams object to hold parameters for difference-in-differences analysis, including data structure details and user-specified options. This object is designed to streamline parameter passing across functions in the cdid package.
Usage
DIDparams(
yname,
tname,
idname = NULL,
gname,
xformla = NULL,
data,
control_group,
anticipation = 0,
weightsname = NULL,
alp = 0.05,
bstrap = TRUE,
biters = 1000,
clustervars = NULL,
cband = TRUE,
print_details = TRUE,
pl = FALSE,
cores = 1,
est_method = "chained",
base_period = "varying",
panel = TRUE,
true_repeated_cross_sections,
n = NULL,
nG = NULL,
nT = NULL,
tlist = NULL,
glist = NULL,
call = NULL
)
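As a minimal sketch of how the signature above might be called: the toy data frame and its column names (id, period, g, y) are hypothetical, and only the two arguments without defaults in the usage (control_group and true_repeated_cross_sections) are supplied beyond the variable names. The call itself is guarded, since it assumes the cdid package is installed and exports DIDparams.

```r
# Hypothetical toy panel: 4 units observed in 3 periods.
df <- data.frame(
  id     = rep(1:4, each = 3),
  period = rep(2001:2003, times = 4),
  g      = rep(c(0, 0, 2002, 2003), each = 3),  # assumption: 0 marks never-treated units
  y      = c(1.0, 1.1, 1.2, 0.9, 1.0, 1.1, 1.0, 1.6, 1.8, 1.1, 1.2, 1.9)
)

# Arguments named exactly as in the usage section; all others keep their defaults.
args <- list(
  yname = "y",
  tname = "period",
  idname = "id",
  gname = "g",
  data = df,
  control_group = "nevertreated",
  true_repeated_cross_sections = FALSE
)

# Only attempt the call when cdid is available and exports DIDparams (assumption).
if (requireNamespace("cdid", quietly = TRUE) &&
    "DIDparams" %in% getNamespaceExports("cdid")) {
  params <- do.call(cdid::DIDparams, args)
  print(class(params))
}
```

Collecting the arguments in a list first makes it easy to inspect or reuse them before the object is built.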
Arguments
yname |
The name of the outcome variable |
tname |
The name of the column containing the time periods |
idname |
The individual (cross-sectional unit) id name |
gname |
The name of the variable in the data that identifies the group (e.g., treatment group) |
xformla |
A formula for the covariates to include in the
model. It should be a one-sided formula of the form ~ X1 + X2 |
data |
The name of the data.frame that contains the data |
control_group |
Which units to use as the control group.
The default is "nevertreated", which sets the control group
to be the group of units that never participate in the
treatment. This group does not change across groups or
time periods. The other option is to use units that are
not yet treated as the control group. |
anticipation |
(Not used) The number of time periods before treatment during which units can anticipate participating in the treatment, so that anticipation can affect their untreated potential outcomes |
weightsname |
The name of the column containing the sampling weights. If not set, all observations have the same weight. |
alp |
The significance level. The default is 0.05 |
bstrap |
Boolean for whether or not to compute standard errors using
the multiplier bootstrap. If standard errors are clustered, then one
must set bstrap = TRUE. |
biters |
The number of bootstrap iterations to use. The default is 1000,
and this is only applicable if bstrap = TRUE. |
clustervars |
A vector of variable names to cluster on. At most, there
can be two variables (otherwise an error is thrown), and one of these
must be the same as idname, which allows for clustering at the individual
level. By default, we cluster at the individual level (when the bootstrap is enabled). |
cband |
Boolean for whether or not to compute a uniform confidence
band that covers all of the group-time average treatment effects
with fixed probability 1 - alp. |
print_details |
Whether or not to show details/progress of computations.
Default is TRUE. |
pl |
Whether or not to use parallel processing |
cores |
The number of cores to use for parallel processing |
est_method |
The method used to compute group-time average treatment effects. At the moment, one can only use the IPW estimator, with either a "2-step" or an "Identity" weighting matrix to aggregate Delta ATT into ATT. Options include "ipw" for inverse probability weighting and "reg" for first-step regression estimators. |
base_period |
(Not used) The cdid package currently uses only the g-1 base period. Whether to use a "varying" base period or a "universal" base period. Either choice results in the same post-treatment estimates of ATT(g,t)'s. In pre-treatment periods, using a varying base period amounts to computing a pseudo-ATT in each treatment period by comparing the change in outcomes for a particular group relative to its comparison group in the pre-treatment periods (i.e., in pre-treatment periods this setting computes changes from period t-1 to period t, but repeatedly changes the value of t). A universal base period fixes the base period to always be (g-anticipation-1). This does not compute pseudo-ATT(g,t)'s in pre-treatment periods, but rather reports average changes in outcomes from period t to (g-anticipation-1) for a particular group relative to its comparison group. This is analogous to what is often reported in event study regressions. Using a varying base period results in an estimate of ATT(g,t) being reported in the period immediately before treatment. Using a universal base period normalizes the estimate in the period right before treatment (or earlier when the user allows for anticipation) to be equal to 0, but reports one extra estimate in an earlier period. |
panel |
(Not used) Balanced and unbalanced panel data are treated similarly, so this argument has no effect. |
true_repeated_cross_sections |
Whether or not the data really is repeated cross sections. (We include this because unbalanced panel code runs through the repeated cross sections code) |
n |
The number of observations. This is equal to the number of units (which may be different from the number of rows in a panel dataset). |
nG |
The number of groups |
nT |
The number of time periods |
tlist |
A vector containing each time period |
glist |
A vector containing each group |
call |
(Not used) A call control variable |
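The difference between the "varying" and "universal" base periods described under base_period can be illustrated with a small base-R sketch. The group means below are made-up numbers, the helper did() is hypothetical and not part of cdid, and anticipation is assumed to be 0. It only computes simple differences in mean changes, not the package's estimator.

```r
# Hypothetical mean outcomes by period for one treated group (first treated in
# g = 2004) and its comparison group.
periods <- 2001:2005
g <- 2004
treated <- c(1.0, 1.2, 1.3, 2.0, 2.4)
control <- c(1.0, 1.1, 1.2, 1.3, 1.4)
names(treated) <- names(control) <- periods

# Change from base period b to period t for the treated group,
# minus the same change for the comparison group.
did <- function(t, b) {
  t <- as.character(t)
  b <- as.character(b)
  unname((treated[t] - treated[b]) - (control[t] - control[b]))
}

# Universal base period: always compare with g - 1 (anticipation assumed 0).
est_universal <- sapply(periods, function(t) did(t, g - 1))

# Varying base period: t - 1 before treatment, g - 1 from treatment onwards.
# The first period has no earlier period to compare with, so it is dropped.
est_varying <- sapply(periods[-1], function(t) {
  b <- if (t < g) t - 1 else g - 1
  did(t, b)
})

print(round(est_universal, 3))
print(round(est_varying, 3))
```

In this toy example the post-treatment values (2004 and 2005) coincide under both choices, the universal version is exactly 0 in 2003 and has one extra value for 2001, and the varying version reports a value for 2003 relative to 2002.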
Value
A DIDparams object, which is a list containing the following elements:
- yname: The name of the outcome variable.
- tname: The name of the time variable.
- idname: The name of the unit identifier variable (if applicable).
- gname: The name of the group variable (e.g., treatment group).
- xformla: A formula specifying covariates for the model.
- data: The dataset used for analysis.
- control_group: The type of control group (e.g., "never treated" or "not yet treated").
- anticipation: The number of periods of anticipation before treatment.
- weightsname: The name of the variable containing sampling weights (if applicable).
- alp: The significance level (default is 0.05).
- bstrap: Logical. Indicates whether bootstrap is used for standard errors.
- biters: The number of bootstrap iterations (if bootstrap is enabled).
- clustervars: Variables used for clustering standard errors.
- cband: Logical. Indicates whether simultaneous confidence bands are computed.
- print_details: Logical. Indicates whether detailed results should be printed.
- pl: Logical. Parallelization flag for computations.
- cores: The number of cores to use for parallelization (if enabled).
- est_method: The estimation method (e.g., "chained").
- base_period: The base period used for comparison (e.g., "varying").
- panel: Logical. Indicates whether the data is a panel dataset.
- true_repeated_cross_sections: Logical. Indicates whether the data is truly repeated cross-sections.
- n: The number of observations (units).
- nG: The number of groups.
- nT: The number of time periods.
- tlist: A vector containing all time periods.
- glist: A vector containing all groups.
- call: The call that generated the DIDparams object.
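The elements n, nG, nT, tlist and glist listed above summarize the structure of the data. As a rough base-R sketch of how such quantities could be derived from a long-format panel: the toy data and column names are hypothetical, and treating 0 as the code for never-treated units (so that it is excluded from glist) is an assumption, not something this page states.

```r
# Hypothetical toy panel: 4 units observed in 3 periods (12 rows).
df <- data.frame(
  id     = rep(1:4, each = 3),
  period = rep(2001:2003, times = 4),
  g      = rep(c(0, 0, 2002, 2003), each = 3)  # assumption: 0 marks never-treated units
)

n     <- length(unique(df$id))            # number of units, not number of rows
tlist <- sort(unique(df$period))          # each time period
glist <- sort(unique(df$g[df$g > 0]))     # each treated group (assumption: drop 0)
nT    <- length(tlist)                    # number of time periods
nG    <- length(glist)                    # number of groups

print(c(n = n, nT = nT, nG = nG))
```

Note that n counts units (4 here) while the data frame has 12 rows, which matches the distinction drawn in the description of n.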