standardize_coxph {stdReg2} | R Documentation |
standardize_coxph
performs regression standardization in Cox proportional
hazards models at specified values of the exposure over the sample
covariate distribution. Let T
, X
, and Z
be the survival
outcome, the exposure, and a vector of covariates, respectively.
standardize_coxph
fits a Cox proportional hazards model and the Breslow estimator
of the baseline hazard in order to estimate the
standardized survival function \theta(t,x)=E\{S(t|X=x,Z)\}
when measure = "survival"
or the standardized restricted mean survival up to time t \theta(t, x) = E\{\int_0^t S(u|X = x, Z) du\}
when measure = "rmean"
, where
t
is a specific value of T
, x
is a specific value of
X
, and the expectation is over the marginal distribution of Z
.
standardize_coxph(
formula,
data,
values,
times,
measure = c("survival", "rmean"),
clusterid,
ci_level = 0.95,
ci_type = "plain",
contrasts = NULL,
family = "gaussian",
reference = NULL,
transforms = NULL
)
formula |
The formula which is used to fit the model for the outcome. |
data |
The data. |
values |
A named list or data.frame specifying the variables and values at which marginal means of the outcome will be estimated. |
times |
A vector containing the specific values of |
measure |
Either "survival" to estimate the survival function at times or "rmean" for the restricted mean survival up to the largest of times. |
clusterid |
An optional string containing the name of a cluster identification variable when data are clustered. |
ci_level |
Coverage probability of confidence intervals. |
ci_type |
A string, indicating the type of confidence intervals. Either "plain", which gives untransformed intervals, or "log", which gives log-transformed intervals. |
contrasts |
A vector of contrasts in the following format:
If set to |
family |
The family argument which is used to fit the glm model for the outcome. |
reference |
A vector of reference levels in the following format:
If |
transforms |
A vector of transforms in the following format:
If set to |
standardize_coxph
fits the Cox proportional hazards model
\lambda(t|X,Z)=\lambda_0(t)\exp\{h(X,Z;\beta)\}.
Breslow's estimator of the cumulative baseline hazard
\Lambda_0(t)=\int_0^t\lambda_0(u)du
is used together with the partial
likelihood estimate of \beta
to obtain estimates of the survival
function S(t|X=x,Z)
if measure = "survival"
:
\hat{S}(t|X=x,Z)=\exp[-\hat{\Lambda}_0(t)\exp\{h(X=x,Z;\hat{\beta})\}].
For each t
in the t
argument and for each x
in the
x
argument, these estimates are averaged across all subjects (i.e.
all observed values of Z
) to produce estimates
\hat{\theta}(t,x)=\sum_{i=1}^n \hat{S}(t|X=x,Z_i)/n,
where Z_i
is the value of Z
for subject i
, i=1,...,n
. The variance
for \hat{\theta}(t,x)
is obtained by the sandwich formula.
If measure = "rmean"
, then \Lambda_0(t)=\int_0^t\lambda_0(u)du
is used together with the partial
likelihood estimate of \beta
to obtain estimates of the restricted mean survival
up to time t: \int_0^t S(u|X=x,Z) du
for each element of times
. The estimation
and inference is done using the method described in Chen and Tsiatis 2001.
Currently, we can only estimate the difference in RMST for a single binary
exposure. Two separate Cox models are fit for each level of the exposure,
which is expected to be coded as 0/1.
An object of class std_surv
.
This is basically a list with components estimates and covariance estimates in res
Results for transformations, contrasts, references are stored in res_contrasts
.
The output contains estimates for contrasts and confidence intervals for all
combinations of transforms and reference levels.
Obtain numeric results in a data frame with the tidy function.
Standardized survival functions are sometimes referred to as (direct) adjusted survival functions in the literature.
standardize_coxph/standardize_parfrailty
does not currently handle time-varying exposures or
covariates.
standardize_coxph/standardize_parfrailty
internally loops over all values in the t
argument.
Therefore, the function will usually be considerably faster if
length(t)
is small.
The variance calculation performed by standardize_coxph
does not condition on
the observed covariates \bar{Z}=(Z_1,...,Z_n)
. To see how this
matters, note that
var\{\hat{\theta}(t,x)\}=E[var\{\hat{\theta}(t,x)|\bar{Z}\}]+var[E\{\hat{\theta}(t,x)|\bar{Z}\}].
The usual parameter \beta
in a Cox proportional hazards model does not
depend on \bar{Z}
. Thus, E(\hat{\beta}|\bar{Z})
is independent
of \bar{Z}
as well (since E(\hat{\beta}|\bar{Z})=\beta
), so that
the term var[E\{\hat{\beta}|\bar{Z}\}]
in the corresponding variance
decomposition for var(\hat{\beta})
becomes equal to 0. However,
\theta(t,x)
depends on \bar{Z}
through the average over the
sample distribution for Z
, and thus the term
var[E\{\hat{\theta}(t,x)|\bar{Z}\}]
is not 0, unless one conditions on
\bar{Z}
. The variance calculation by Gail and Byar (1986) ignores this
term, and thus effectively conditions on \bar{Z}
.
Arvid Sjölander, Adam Brand, Michael Sachs
Chang I.M., Gelman G., Pagano M. (1982). Corrected group prognostic curves and summary statistics. Journal of Chronic Diseases 35, 669-674.
Gail M.H. and Byar D.P. (1986). Variance calculations for direct adjusted survival curves, with applications to testing for no treatment effect. Biometrical Journal 28(5), 587-599.
Makuch R.W. (1982). Adjusted survival curve estimation using covariates. Journal of Chronic Diseases 35, 437-443.
Sjölander A. (2016). Regression standardization with the R-package stdReg. European Journal of Epidemiology 31(6), 563-574.
Sjölander A. (2018). Estimation of causal effect measures with the R-package stdReg. European Journal of Epidemiology 33(9), 847-858.
Chen, P. Y., Tsiatis, A. A. (2001). Causal inference on the difference of the restricted mean lifetime between two groups. Biometrics, 57(4), 1030-1038.
require(survival)
set.seed(7)
n <- 300
Z <- rnorm(n)
Zbin <- rbinom(n, 1, .3)
X <- rnorm(n, mean = Z)
T <- rexp(n, rate = exp(X + Z + X * Z)) # survival time
C <- rexp(n, rate = exp(X + Z + X * Z)) # censoring time
fact <- factor(sample(letters[1:3], n, replace = TRUE))
U <- pmin(T, C) # time at risk
D <- as.numeric(T < C) # event indicator
dd <- data.frame(Z, Zbin, X, U, D, fact)
fit.std.surv <- standardize_coxph(
formula = Surv(U, D) ~ X + Z + X * Z,
data = dd,
values = list(X = seq(-1, 1, 0.5)),
times = 1:5
)
print(fit.std.surv)
plot(fit.std.surv)
tidy(fit.std.surv)
fit.std <- standardize_coxph(
formula = Surv(U, D) ~ X + Zbin + X * Zbin + fact,
data = dd,
values = list(Zbin = 0:1),
times = 1.5,
measure = "rmean",
contrast = "difference",
reference = 0
)
print(fit.std)
tidy(fit.std)