concordance {survival} | R Documentation |
The concordance statistic computes the agreement between an observed response and a predictor. It is closely related to Kendall's tau-a and tau-b, Goodman's gamma, and Somers' d, all of which can also be calculated from the results of this function.
concordance(object, ...)
## S3 method for class 'formula'
concordance(object, data, weights, subset, na.action,
cluster, ymin, ymax, timewt= c("n", "S", "S/G", "n/G", "n/G2", "I"),
influence=0, ranks = FALSE, reverse=FALSE, timefix=TRUE, keepstrata=10, ...)
## S3 method for class 'lm'
concordance(object, ..., newdata, cluster, ymin, ymax,
influence=0, ranks=FALSE, timefix=TRUE, keepstrata=10)
## S3 method for class 'coxph'
concordance(object, ..., newdata, cluster, ymin, ymax,
timewt= c("n", "S", "S/G", "n/G", "n/G2", "I"), influence=0,
ranks=FALSE, timefix=TRUE, keepstrata=10)
## S3 method for class 'survreg'
concordance(object, ..., newdata, cluster, ymin, ymax,
timewt= c("n", "S", "S/G", "n/G", "n/G2", "I"), influence=0,
ranks=FALSE, timefix=TRUE, keepstrata=10)
object |
a fitted model or a formula. The formula should be of
the form |
data |
a data.frame in which to interpret the variables named in
the |
weights |
optional vector of case weights.
Only applicable if |
subset |
expression indicating which subset of the rows of data should be used in
the fit. Only applicable if |
na.action |
a missing-data filter function. This is applied to the model.frame
after any subset argument has been used. Default is
|
... |
multiple fitted models are allowed. Only applicable if
|
newdata |
optional, a new data frame in which to evaluate (but not refit) the models |
cluster |
optional grouping vector for calculating the robust variance |
ymin, ymax |
compute the concordance over the restricted range ymin <= y <= ymax. (For survival data this is a time range.) |
timewt |
the weighting to be applied. The overall statistic is a weighted mean over event times. |
influence |
1= return the dfbeta vector, 2= return the full influence matrix, 3 = return both |
ranks |
if TRUE, return a data frame containing the individual ranks that make up the overall score. |
reverse |
if TRUE then assume that larger |
timefix |
correct for possible rounding error. See the vignette on tied times for more explanation. Essentially, exact ties are an important part of the concordance computation, but "exact" can be a subtle issue with floating point numbers. |
keepstrata |
either TRUE, FALSE, or an integer value.
Computations are always done within stratum, then added. If the
total number of strata is greater than |
At each event time, compute the rank of the subject who had the
event as compared to all others with a longer survival, where the
rank is a value between 0 and 1. The concordance is a weighted mean
of these values, determined by the timewt option.
For uncensored data each unique response value is compared to all
those which are larger.
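The pairwise comparison described above can be sketched by brute force for a small uncensored data set. This is only an illustration in base R: pair_concordance() is a hypothetical helper, not part of the survival package, and it assumes the usual convention that pairs tied on the predictor count one half while pairs tied on the response are not comparable.

```r
# Brute-force pair counting for an uncensored response (base R only).
# pair_concordance() is a hypothetical helper written for illustration.
pair_concordance <- function(y, x) {
  idx <- combn(length(y), 2)              # all unordered pairs (i, j)
  dy  <- y[idx[1, ]] - y[idx[2, ]]
  dx  <- x[idx[1, ]] - x[idx[2, ]]
  comparable <- dy != 0                   # pairs tied on y are skipped
  s <- sign(dy[comparable]) * sign(dx[comparable])
  concordant <- sum(s > 0)                # same ordering in y and x
  discordant <- sum(s < 0)                # opposite ordering
  tied_x     <- sum(s == 0)               # tied on x but not on y
  c(concordant = concordant, discordant = discordant, tied.x = tied_x,
    concordance = (concordant + tied_x / 2) / length(s))
}

y <- c(1.2, 2.5, 3.1, 4.8, 5.0)           # made-up response values
x <- c(0.3, 0.9, 0.7, 1.5, 1.4)           # made-up predictor values
res <- pair_concordance(y, x)
res
```

With these made-up values, 8 of the 10 pairs are concordant and 2 are discordant, giving a concordance of 0.8.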
Using the default value for timewt gives the area under the receiver
operating characteristic curve (AUC) for a binary response,
and (d+1)/2 when y is continuous, where d is Somers' d.
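The equivalence with the AUC for a binary response can be checked by hand. The sketch below uses base R only and made-up data: it counts pairs directly (a tie on the predictor counts one half) and compares the result with the rank-sum (Mann-Whitney) form of the AUC. The variable names are illustrative assumptions.

```r
# Binary response: pair-counting concordance versus rank-based AUC (base R only).
y <- c(0, 0, 1, 0, 1, 1, 0, 1)                     # made-up 0/1 outcome
x <- c(0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.4)    # made-up predictor

pos <- x[y == 1]
neg <- x[y == 0]

# Every (positive, negative) pair; ties on x contribute 1/2
cmp  <- outer(pos, neg, "-")
conc <- (sum(cmp > 0) + sum(cmp == 0) / 2) / length(cmp)

# Mann-Whitney form of the AUC, using midranks for ties
r   <- rank(x)
n1  <- length(pos)
n0  <- length(neg)
auc <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

somers_d <- 2 * conc - 1                           # inverts C = (d + 1) / 2
c(concordance = conc, auc = auc, somers_d = somers_d)
```

Both routes give 0.71875 for this toy data set, so d = 0.4375.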
For a survival time, timewt of n gives Harrell's c-statistic,
which is closely related to the Gehan-Wilcoxon test,
S corresponds to the Peto-Wilcoxon, n/G2 is the weighting advocated
by Uno, and S/G the weighting proposed by Schemper.
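A quick way to see the effect of the time weights is to evaluate the same fitted model under several of them. The sketch below assumes the survival package is installed and uses its lung data set; the model itself is an arbitrary choice made for illustration.

```r
library(survival)

# Arbitrary illustrative model on the lung data shipped with survival
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Same model, different time weights
c_n   <- concordance(fit, timewt = "n")$concordance      # Harrell
c_S   <- concordance(fit, timewt = "S")$concordance      # Peto-Wilcoxon
c_SG  <- concordance(fit, timewt = "S/G")$concordance    # Schemper
c_nG2 <- concordance(fit, timewt = "n/G2")$concordance   # Uno

cstats <- c(n = c_n, S = c_S, "S/G" = c_SG, "n/G2" = c_nG2)
round(cstats, 3)
```

The four values usually differ in the later decimal places; how much they differ depends on the censoring pattern.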
When the number of strata is very large, such as in a conditional
logistic regression for instance (clogit function), a much
faster computation is available when the individual strata results
are not retained; use keepstrata=FALSE or keepstrata=0
to do so. In the general case the keepstrata = 10
default simply keeps the printout manageable: it retains and prints
per-strata information if the number of strata is <= 10.
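As a small illustration of the keepstrata option, the sketch below computes a stratified concordance twice with the formula method, once keeping the per-stratum results and once discarding them. It assumes the survival package and its lung data; stratifying on sex is an arbitrary choice with only two strata, so no speed difference will be visible here.

```r
library(survival)

# Two strata (sex), so the default keepstrata = 10 retains per-stratum counts
c_keep <- concordance(Surv(time, status) ~ age + strata(sex), data = lung)

# Same computation, discarding the per-stratum results
c_drop <- concordance(Surv(time, status) ~ age + strata(sex), data = lung,
                      keepstrata = FALSE)

c_keep$count    # per-stratum information is retained here
c_drop$count    # only the totals are retained here
c(keep = c_keep$concordance, drop = c_drop$concordance)
```

The overall concordance should be the same in both calls; only the amount of retained detail changes.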
An object of class concordance containing the following components:
concordance |
the estimated concordance value or values |
count |
a vector containing the number of concordant pairs, discordant, tied on x but not y, tied on y but not x, and tied on both x and y |
n |
the number of observations |
var |
a vector containing the estimated variance of the concordance based on the infinitesimal jackknife (IJ) method. If there are multiple models it contains the estimated variance/covariance matrix. |
cvar |
a vector containing the estimated variance(s) of the
concordance values, based on the variance formula for the associated
score test from a proportional hazards model. (This was the primary
variance used in the |
dfbeta |
optional, the vector of leverage estimates for the concordance |
influence |
optional, the matrix of leverage values for each of the counts, one row per observation |
ranks |
optional, a data frame containing the Somers' d rank at each event time, along with the time weight, case weight of the observation with an event, and variance (contribution to the proportional hazards model information matrix). A weighted mean of the ranks equals Somers' d. |
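The components listed above can be pulled out of the returned object directly. The following sketch assumes the survival package and its lung data; the normal-approximation interval built from the IJ variance is an illustrative assumption, not something this help page prescribes.

```r
library(survival)

fit  <- coxph(Surv(time, status) ~ age + sex, data = lung)
cfit <- concordance(fit)

cfit$concordance                    # estimated concordance
cfit$count                          # concordant, discordant and tied pair counts
cfit$n                              # number of observations

se <- sqrt(cfit$var)                # standard error from the IJ variance
# Naive normal-approximation interval (an assumption made for illustration)
ci <- cfit$concordance + c(-1.96, 1.96) * se

somers_d <- 2 * cfit$concordance - 1   # from C = (d + 1) / 2
```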
A coxph model that has a numeric failure may have undefined predicted values, in which case the concordance will be NULL.
Computation for an existing coxph model along with newdata has
some subtleties with respect to extra arguments in the original call.
These include:
tt() terms in the model. This is not supported with newdata.
subset. Any subset clause in the original call is ignored, i.e., not applied to the new data.
strata() terms in the model. The new data is expected to have the strata variable(s) found in the original data set, with concordance computed within strata. The levels of the strata variable need not be the same as in the original data.
id or cluster directives. This has not yet been sorted out.
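A minimal sketch of the newdata usage follows, for a model with none of the complications listed above (no tt() terms, subset, strata, id or cluster). It assumes the survival package and its lung data; the alternating-row split into train and test is arbitrary and only for illustration.

```r
library(survival)

# Arbitrary split of lung into two halves (illustration only)
train <- lung[seq(1, nrow(lung), by = 2), ]
test  <- lung[seq(2, nrow(lung), by = 2), ]

fit <- coxph(Surv(time, status) ~ age + sex, data = train)

c_train <- concordance(fit)                   # evaluated on the fitting data
c_test  <- concordance(fit, newdata = test)   # model is not refit

c(train = c_train$concordance, test = c_test$concordance)
```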
Terry Therneau
library(survival)
# Cox model: timewt = "n" gives Harrell's c-statistic
fit1 <- coxph(Surv(ptime, pstat) ~ age + sex + mspike, data = mgus2)
concordance(fit1, timewt="n")
# logistic regression
fit2 <- glm(pstat ~ age + sex + mspike, family = binomial, data = mgus2)
concordance(fit2) # equal to the AUC
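Since several fitted models may be passed at once, one further sketch compares two models fit to the same data. It assumes the survival package and its lung data; the two models and the contrast-based z statistic are illustrative choices that rely on the joint variance/covariance matrix returned in var.

```r
library(survival)

# Two illustrative models on the same observations
fitA <- coxph(Surv(time, status) ~ age, data = lung)
fitB <- coxph(Surv(time, status) ~ age + sex, data = lung)

both <- concordance(fitA, fitB)
both$concordance                  # one value per model
both$var                          # 2 x 2 variance/covariance matrix

# Difference in concordance (B minus A) with its standard error
contr <- c(-1, 1)
cdiff <- sum(contr * both$concordance)
se    <- sqrt(drop(contr %*% both$var %*% contr))
z     <- cdiff / se
```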