seqaddNA {seqimpute} | R Documentation |
Generation of missing data in longitudinal categorical data
Description
Generates missing data in the form of gaps, which is the typical pattern of missing data in longitudinal data. The simulated missing data are either MCAR (missing completely at random) or MAR (missing at random).
Usage
seqaddNA(
data,
var = NULL,
states.high = NULL,
propdata = 1,
pstart.high = 0.1,
pstart.low = 0.005,
maxgap = 3,
maxprop = 0.75,
only.traj = FALSE
)
Arguments
data |
A data frame containing sequences of a categorical (multinomial)
variable, where missing data are coded as NA. |
var |
A vector specifying the columns of the dataset
that contain the trajectories. Default is NULL. |
states.high |
A list of states with a higher probability of initiating a subsequent missing data gap. |
propdata |
Proportion of observations for which missing data is simulated, as a decimal between 0 and 1. |
pstart.high |
Probability of starting a missing data gap for the
states specified in the states.high argument. |
pstart.low |
Probability of starting a missing data gap for all other states. |
maxgap |
Maximum length of a missing data gap. |
maxprop |
Maximum proportion of missing data allowed in a sequence, as a decimal between 0 and 1. If the proportion exceeds this value, the simulation is rerun for the sequence. |
only.traj |
Logical, if |
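How pstart.high, pstart.low, maxgap and maxprop interact can be illustrated with a minimal base-R sketch for a single sequence. This is not the seqimpute implementation. The helper name add_gaps_sketch, its max.tries argument, the uniform draw of gap lengths, and the rule that the state at position t - 1 decides whether a gap starts at position t are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of seqimpute: illustrative sketch only.
add_gaps_sketch <- function(x, states.high = NULL,
                            pstart.high = 0.1, pstart.low = 0.005,
                            maxgap = 3, maxprop = 0.75, max.tries = 100) {
  n <- length(x)
  for (attempt in seq_len(max.tries)) {
    miss <- rep(FALSE, n)
    t <- 2  # the first position is never set to missing in this sketch
    while (t <= n) {
      # Assumption: the state just before position t sets the start probability
      p <- if (x[t - 1] %in% states.high) pstart.high else pstart.low
      if (runif(1) < p) {
        # Assumption: gap length drawn uniformly from 1..maxgap
        len <- sample.int(maxgap, 1)
        end <- min(t + len - 1, n)
        miss[t:end] <- TRUE
        # Skip one position so that two gaps cannot merge into a longer one
        t <- end + 2
      } else {
        t <- t + 1
      }
    }
    # Rerun the simulation when too much of the sequence is missing;
    # the last attempt is kept if max.tries is exhausted
    if (mean(miss) <= maxprop) break
  }
  x[miss] <- NA
  x
}

set.seed(1)
s <- rep(c("employment", "joblessness"), times = c(10, 10))
add_gaps_sketch(s, states.high = "joblessness", pstart.high = 0.5)
```

With states.high left at NULL, every position uses pstart.low, so missingness does not depend on the observed states (MCAR in this sketch). Supplying states.high makes gap starts depend on the observed state (MAR in this sketch).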
Value
A data frame with simulated missing data.
Author(s)
Kevin Emery
Examples
# Generate MCAR missing data on the mvad dataset
# from the TraMineR package
## Not run:
data(mvad, package = "TraMineR")
mvad.miss <- seqaddNA(mvad, var = 17:86)
# Generate missing data on mvad where joblessness is more likely to trigger
# a missing data gap
mvad.miss2 <- seqaddNA(mvad, var = 17:86, states.high = "joblessness")
## End(Not run)
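# After simulating missing data, it can be useful to compare the longest gap
# and the proportion of missing values in each sequence with maxgap and
# maxprop. The base-R sketch below does this on a small hand-made data frame.
# The helper name gap_summary and the toy data are hypothetical and not part
# of seqimpute.

```r
# Hypothetical helper for inspecting missing data gaps; not part of seqimpute.
gap_summary <- function(traj) {
  # Logical matrix: TRUE where a value is missing
  m <- is.na(as.matrix(traj))
  # Longest run of consecutive missing values in each row (0 if none)
  longest <- apply(m, 1, function(row) {
    r <- rle(row)
    if (any(r$values)) max(r$lengths[r$values]) else 0L
  })
  data.frame(prop.missing = rowMeans(m), longest.gap = longest)
}

# Toy trajectories: one row per sequence, one column per time point
toy <- data.frame(
  t1 = c("a", "a", "b"),
  t2 = c(NA,  "a", "b"),
  t3 = c(NA,  "b", NA),
  t4 = c("b", "b", "a"),
  stringsAsFactors = TRUE
)
gap_summary(toy)
```

# Assuming the trajectory columns keep their positions in the returned data
# frame, the same helper could be applied to a seqaddNA() result, for
# instance gap_summary(mvad.miss[, 17:86]).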