kNN.Mahalanobis {sicure}    R Documentation

K Nearest Neighbors with Mahalanobis Distance

Description

This function computes the k nearest neighbors of each point in a data set, where each observation is a pair (X, T), with X a covariate and T the observed time. The distance between two points is the Mahalanobis distance:

d_M((X_i, T_i), (X_j, T_j)) = \sqrt{ \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right)^t \Sigma^{-1} \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right) },

where \Sigma is the variance-covariance matrix of the joint distribution of (X, T).
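
As an illustration of the formula above, the following minimal sketch computes the distance between two observations in base R. It assumes that \Sigma is estimated by the sample covariance of the columns (x, time); the help page does not say how the package estimates it. The variable names (Z, Sigma, d_manual, d_builtin) are chosen for illustration only.

```r
# Sketch: Mahalanobis distance between (X_i, T_i) and (X_j, T_j).
# Assumption: Sigma is estimated by the sample covariance of (x, time).
set.seed(1)
x    <- runif(20, -2, 2)   # illustrative covariate values
time <- rexp(20)           # illustrative observed times

Z     <- cbind(x, time)    # n x 2 matrix, one row per pair (X, T)
Sigma <- cov(Z)            # 2 x 2 sample variance-covariance matrix

i <- 1
j <- 2
diff_ij <- Z[i, ] - Z[j, ]

# Direct translation of the formula: sqrt(diff^t Sigma^{-1} diff)
d_manual <- sqrt(drop(t(diff_ij) %*% solve(Sigma) %*% diff_ij))

# stats::mahalanobis() returns the squared distance, hence the sqrt()
d_builtin <- sqrt(mahalanobis(Z[i, ], center = Z[j, ], cov = Sigma))
```

Both values should agree up to floating-point error, which gives a quick check of the formula.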

Usage

kNN.Mahalanobis(x, time, k)

Arguments

x

A numeric vector of length n giving the covariate values.

time

A numeric vector of length n giving the observed times.

k

The number of nearest neighbors to find for each observation.

Value

A matrix with n rows and k columns. Row i corresponds to the pair (X_i, T_i) and contains the indices of its k nearest neighbors under the Mahalanobis distance.
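
To make the shape of the returned matrix concrete, here is a minimal base R sketch that builds an n-by-k index matrix of the kind described above. It is not the package's implementation: knn_mahalanobis_sketch is a hypothetical helper, and it assumes that \Sigma is the sample covariance of (x, time) and that a point is not counted as its own neighbor.

```r
# Hypothetical helper for illustration; not the implementation used by sicure.
knn_mahalanobis_sketch <- function(x, time, k) {
  Z     <- cbind(x, time)  # n x 2 matrix of pairs (X_i, T_i)
  Sigma <- cov(Z)          # assumption: sample covariance estimates Sigma
  n     <- nrow(Z)
  stopifnot(k >= 1, k < n)

  idx <- matrix(NA_integer_, nrow = n, ncol = k)
  for (i in seq_len(n)) {
    # Squared Mahalanobis distances from every point to point i;
    # squaring does not change the ordering of the neighbors.
    d2 <- mahalanobis(Z, center = Z[i, ], cov = Sigma)
    d2[i] <- Inf           # assumption: exclude the point itself
    idx[i, ] <- order(d2)[seq_len(k)]
  }
  idx
}
```

Calling knn_mahalanobis_sketch(x, time, k = 5) on vectors of length 50 would return a 50-by-5 integer matrix whose row i lists the indices of the five points closest to (X_i, T_i), nearest first.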

References

Mahalanobis, P. C. (1936). On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49-55.

Examples

# Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) # Covariate values
y <- rweibull(n, shape = 0.5 * (x + 4)) # True lifetimes
c <- rexp(n) # Censoring values
p <- exp(2*x)/(1 + exp(2*x)) # Probability of being susceptible
u <- runif(n)
t  <- ifelse(u < p, pmin(y, c), c) # Observed times
d  <- ifelse(u < p, ifelse(y < c, 1, 0), 0) # Uncensoring indicator
kNN.Mahalanobis(x=x, time=t, k=5)

[Package sicure version 0.1.0 Index]