dataexample.missingdata.stratified {CaseCohortCoxSurvival}    R Documentation

Example of case-cohort with stratified sampling of the subcohort and missing covariate information in phase-two data

Description

A list containing cohort.

cohort is a simulated cohort with 20 000 subjects. It contains:

id is the subject identifier.

X1 is a continuous baseline covariate. Its measurements are only available for subjects in the case-cohort, i.e., with phase3 = 1.

X2 is a categorical baseline covariate, with categories 0, 1, and 2. It is measured on all cohort subjects.

X3 is a continuous baseline covariate. Its measurements are only available for subjects in the case-cohort.

W is a baseline categorical variable, with categories 0, 1, 2, and 3. It depends on predictors of X1 and X2. It is measured on all cohort subjects.

status indicates case status, i.e., whether the subject experienced the event of interest or was censored.

event.time gives the event or censoring time.

The stratified sampling of the subcohort was based on the 4 strata defined by W. From these 4 strata, 97, 294, 300, and 380 subjects were sampled, respectively, independently of case status. subcohort indicates the subjects included in the subcohort.

The phase-two sample consisted of the subcohort and all cases not in the subcohort. phase2 indicates the subjects included in the phase-two sample.

W3 is a baseline binary variable, based on case status. It is measured on all cohort subjects.

The third phase of sampling was stratified based on the 2 strata defined by W3. Subjects were sampled from the 2 strata with sampling probabilities 0.9 and 0.8. phase3 indicates the subjects included in the case-cohort (phase-three sample).

strata.n gives the number of cohort subjects in each of the 4 phase-two strata.

strata.m gives the number of subjects sampled from each of the 4 phase-two strata to be included in the subcohort (i.e., 97, 294, 300, or 380).

strata.m and strata.n would be used to compute the phase-two design weights of non-cases. Because all the cases were included in the phase-two sample, they would be assigned a phase-two design weight of 1.
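As an illustration of the paragraph above, here is a minimal sketch of how phase-two design weights could be derived for subjects in the phase-two sample. It assumes the usual inverse sampling fraction, strata.n / strata.m, for non-cases. The toy data frame and its strata.n values are hypothetical (only the strata.m values 97 and 294 come from this page), and weight.p2 is an illustrative column name, not one provided by the package.

```r
# Toy rows reusing the column names of cohort.
# ASSUMPTION: the strata.n values below are made up for illustration.
toy <- data.frame(
  status   = c(0, 0, 1, 0),
  strata.n = c(2000, 2000, 6000, 6000),
  strata.m = c(97, 97, 294, 294)
)

# Non-cases get the inverse of their stratum sampling fraction;
# cases were all included in the phase-two sample, so they get weight 1.
toy$weight.p2 <- ifelse(toy$status == 1, 1, toy$strata.n / toy$strata.m)
toy
```

On the real data, the same expression could be compared with weight.p2.true for subjects with phase2 = 1.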

strata.n.cases gives the number of cases in each of the 4 phase-two strata in the cohort.

n.cases gives the number of cases in the entire cohort.

strata.proba.missing gives the sampling probability for each of the 2 phase-three strata defined by W3, as used for the third phase of sampling.

weight.true gives the true design weight (i.e., the product of the true phase-two and true phase-three design weights).

weight.p2.true gives the true phase-two design weight. It is stratum-specific, based on W.

weight.p3.true gives the true phase-three design weight. It is stratum-specific, based on W3. weight.p3.true can be used with argument weights.phase3 of function caseCohortCoxSurvival, along with argument weights.phase3.type = "design".
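The relation between the true weights can be sketched as follows. This example assumes that the true phase-three design weight is the inverse of strata.proba.missing, which this page does not state explicitly, and it uses hypothetical phase-two weights. The product rule for weight.true is the one given in the description above.

```r
# Toy rows reusing the column names of cohort.
# ASSUMPTION: the phase-two weights below are made up for illustration.
toy <- data.frame(
  weight.p2.true       = c(1, 2000 / 97, 6000 / 294),
  strata.proba.missing = c(0.9, 0.8, 0.8)
)

# ASSUMPTION: phase-three design weight = 1 / phase-three sampling probability.
toy$weight.p3.true <- 1 / toy$strata.proba.missing

# Overall design weight = phase-two weight times phase-three weight.
toy$weight.true <- toy$weight.p2.true * toy$weight.p3.true
toy
```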

weight.p3.est gives the estimated phase-three design weight. It was obtained from W3, phase2, and phase3. weight.p3.est can be used with argument weights.phase3 of function caseCohortCoxSurvival, along with argument weights.phase3.type = "estimated". If weights.phase3 = NULL but weights.phase3.type = "estimated" in function caseCohortCoxSurvival, the phase-three design weights will be estimated from W3, phase2, and phase3, and should be identical to weight.p3.est.
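One plausible way to estimate phase-three weights from W3, phase2, and phase3 is sketched below. It assumes the estimate is, within each W3 stratum, the number of phase-two subjects divided by the number of phase-three subjects. That estimator is an assumption made for illustration and may differ from what the package computes internally. The toy data are hypothetical.

```r
# Toy data: 9 phase-two subjects (plus one subject outside phase two).
toy <- data.frame(
  W3     = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 0),
  phase2 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0),
  phase3 = c(1, 1, 1, 0, 1, 1, 1, 1, 0, 0)
)

# Restrict to the phase-two sample, from which phase three was drawn.
p2 <- toy[toy$phase2 == 1, ]

# Per-stratum counts of phase-two and phase-three subjects.
n.p2 <- tapply(p2$phase2, p2$W3, sum)
n.p3 <- tapply(p2$phase3, p2$W3, sum)

# ASSUMPTION: estimated weight = inverse of the observed sampling fraction.
w.est <- n.p2 / n.p3

# Attach each subject's stratum-specific estimated weight.
p2$weight.p3.est <- as.numeric(w.est[as.character(p2$W3)])
p2
```

With true sampling probabilities of 0.9 and 0.8, such estimates would be expected to be close to, but not exactly equal to, 1/0.9 and 1/0.8.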

weight.est gives the estimated design weight (i.e., the product of the phase-two and estimated phase-three design weights).

References

Etievant, L., Gail, M. H. (2024). Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data. Lifetime Data Analysis, 30, 572-599.

Etievant, L., Gail, M. H. (2024). Software Application Profile: CaseCohortCoxSurvival: an R package for case-cohort inference for relative hazard and pure risk under the Cox model. Submitted.

Examples


 data(dataexample.missingdata.stratified, package="CaseCohortCoxSurvival")

 # Display some of the data
 dataexample.missingdata.stratified$cohort[1:5, ]

[Package CaseCohortCoxSurvival version 0.0.36 Index]