seqaddNA {seqimpute} | R Documentation |
Generates missing data in the form of gaps, the typical pattern of missingness in longitudinal data. The simulated missing data can be either MCAR (missing completely at random) or MAR (missing at random).
seqaddNA(
data,
var = NULL,
states.high = NULL,
propdata = 1,
pstart.high = 0.1,
pstart.low = 0.005,
maxgap = 3,
maxprop = 0.75,
only.traj = FALSE
)
data |
A data frame containing sequences of a categorical (multinomial)
variable, where missing data are coded as NA. |
var |
A vector specifying the columns of the dataset
that contain the trajectories. Default is NULL. |
states.high |
A list of states with a higher probability of initiating a subsequent missing data gap. |
propdata |
Proportion of observations for which missing data is simulated, as a decimal between 0 and 1. |
pstart.high |
Probability of starting a missing data gap for the
states specified in the states.high argument. |
pstart.low |
Probability of starting a missing data gap for all other states. |
maxgap |
Maximum length of a missing data gap. |
maxprop |
Maximum proportion of missing data allowed in a sequence, as a decimal between 0 and 1. If the proportion exceeds this value, the simulation is rerun for the sequence. |
only.traj |
Logical, if |
A data frame with simulated missing data.
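The arguments above describe a gap-generating process. As a rough illustration only, the base R sketch below shows how pstart.high, pstart.low, maxgap and maxprop could interact for a single sequence. The function sim_gaps is hypothetical and is not the implementation used by seqimpute. It assumes a complete input sequence, a gap length drawn uniformly between 1 and maxgap, and at least one observed state between two gaps.

```r
# Hypothetical helper, NOT part of seqimpute: a simplified sketch of gap
# generation for one sequence, under the assumptions stated above.
sim_gaps <- function(x, states.high = NULL, pstart.high = 0.1,
                     pstart.low = 0.005, maxgap = 3, maxprop = 0.75) {
  n <- length(x)
  repeat {
    out <- x
    t <- 2  # the first position is kept observed in this sketch
    while (t <= n) {
      # The state observed just before position t sets the gap-start probability
      p <- if (x[t - 1] %in% states.high) pstart.high else pstart.low
      if (runif(1) < p) {
        len <- sample.int(maxgap, 1)   # assumed: uniform gap length in 1..maxgap
        end <- min(n, t + len - 1)
        out[t:end] <- NA
        t <- end + 2                   # leave one observed state after the gap
      } else {
        t <- t + 1
      }
    }
    # Rerun the simulation if the sequence has too much missing data
    if (mean(is.na(out)) <= maxprop) return(out)
  }
}

set.seed(1)
x <- rep(c("employment", "joblessness"), times = c(30, 40))
y <- sim_gaps(x, states.high = "joblessness")
table(is.na(y))
```

With states.high = NULL every state uses pstart.low, so the chance of starting a gap does not depend on the data (MCAR). Naming some states in states.high makes that chance depend on the preceding observed state (MAR).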
Kevin Emery
# Generate MCAR missing data on the mvad dataset
# from the TraMineR package
## Not run:
data(mvad, package = "TraMineR")
mvad.miss <- seqaddNA(mvad, var = 17:86)
# Generate missing data on mvad where joblessness is more likely to trigger
# a missing data gap
mvad.miss2 <- seqaddNA(mvad, var = 17:86, states.high = "joblessness")
## End(Not run)
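To check a simulated dataset against maxgap and maxprop, one could summarise the gaps in each sequence. The sketch below uses only base R. The helper gap_summary is hypothetical and not part of seqimpute, and it is shown on a small toy data frame so that it runs without any package.

```r
# Hypothetical helper, NOT part of seqimpute: per-sequence gap summary.
gap_summary <- function(traj) {
  t(apply(as.matrix(traj), 1, function(row) {
    r <- rle(is.na(row))          # runs of missing / observed positions
    gaps <- r$lengths[r$values]   # lengths of the missing runs only
    c(n.gaps = length(gaps),
      longest.gap = if (length(gaps) > 0) max(gaps) else 0,
      prop.missing = mean(is.na(row)))
  }))
}

# Toy trajectories: one row per sequence, one column per time point
toy <- data.frame(t1 = c("a", "b"), t2 = c(NA, "b"),
                  t3 = c(NA, "b"), t4 = c("a", NA))
gap_summary(toy)
```

The same helper could be applied to the trajectory columns of a simulated dataset, for example gap_summary(mvad.miss[, 17:86]), assuming the returned data frame keeps the trajectories in the same columns as the input.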