Dirac {kerntools} | R Documentation |
Kernels for categorical variables
Description
Given a matrix or data.frame of dimension NxD (N>1, D>0), 'Dirac()' computes the simplest kernel for categorical data. Samples should be in the rows and features in the columns. When there is a single feature, 'Dirac()' returns 1 if the category (or class, or level) is the same in two given samples, and 0 otherwise. When D>1, the results for the D features are combined using a sum, a mean, or a weighted mean.
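The computation described above can be sketched in base R. This is an illustrative sketch only, not the kerntools implementation: the helper name `dirac_sketch` is hypothetical, and normalizing the weights so they sum to 1 in the "weighted" case is an assumption based on the usual definition of a weighted mean.

```r
# Hypothetical helper (not part of kerntools): a base-R sketch of the Dirac kernel.
dirac_sketch <- function(X, comp = c("mean", "sum", "weighted"), coeff = NULL) {
  comp <- match.arg(comp)
  X <- as.data.frame(X, stringsAsFactors = FALSE)
  D <- ncol(X)

  # One N x N indicator matrix per feature:
  # 1 where two samples share the same category, 0 otherwise.
  per_feature <- lapply(X, function(col) {
    col <- as.character(col)
    outer(col, col, "==") * 1
  })

  # Weight applied to each feature before adding the D matrices.
  w <- switch(comp,
    sum = rep(1, D),
    mean = rep(1 / D, D),
    weighted = {
      stopifnot(!is.null(coeff), length(coeff) == D)
      coeff / sum(coeff)  # assumption: weights are rescaled to sum to 1
    }
  )

  Reduce(`+`, Map(`*`, per_feature, w))
}

# Toy data: 3 samples, 2 categorical features.
toy <- data.frame(colour = c("red", "red", "blue"),
                  size   = c("S", "L", "L"))
K_toy_mean <- dirac_sketch(toy, comp = "mean")
K_toy_sum  <- dirac_sketch(toy, comp = "sum")
K_toy_w    <- dirac_sketch(toy, comp = "weighted", coeff = c(3, 1))
```

In this toy data, samples 1 and 2 share the colour but not the size, so their entry is 1 under "sum" and 0.5 under "mean". Samples 1 and 3 share nothing, so their entry is 0 under every option.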
Usage
Dirac(X, comp = "mean", coeff = NULL, feat_space = FALSE)
Arguments
X |
Matrix (class "character") or data.frame (columns of class "character" or "factor"). The elements in X are assumed to be categorical in nature. |
comp |
When D>1, this argument indicates how the variables of the dataset are combined. Options are: "mean", "sum" and "weighted". (Defaults: "mean"). |
coeff |
(optional) A vector of weights with length D. |
feat_space |
If FALSE, only the kernel matrix is returned. Otherwise, the feature space is also returned. (Defaults: FALSE). |
Value
Kernel matrix (dimension: NxN), or a list with the kernel matrix and the feature space.
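To give some intuition for what a feature space can look like for this kernel, the sketch below uses one-hot (indicator) encoding. With that encoding, the inner product between two samples counts the features on which they agree. This is a minimal base-R illustration. The helper `one_hot` is hypothetical, and whether 'Dirac()' returns exactly this representation when feat_space = TRUE is an assumption not confirmed by this page.

```r
# Hypothetical helper (not part of kerntools): one-hot encode every column.
one_hot <- function(X) {
  X <- as.data.frame(X, stringsAsFactors = FALSE)
  blocks <- lapply(X, function(col) {
    col <- as.character(col)
    lev <- sort(unique(col))
    # One indicator column per category; no level is dropped.
    m <- outer(col, lev, "==") * 1
    colnames(m) <- lev
    m
  })
  do.call(cbind, unname(blocks))
}

# Toy data: 3 samples, 2 categorical features.
toy <- data.frame(colour = c("red", "red", "blue"),
                  size   = c("S", "L", "L"))

Phi    <- one_hot(toy)        # 3 samples x 4 indicator columns
K_sum  <- tcrossprod(Phi)     # Phi %*% t(Phi): number of matching features
K_mean <- K_sum / ncol(toy)   # dividing by D gives the "mean" combination
```

Here K_sum and K_mean are 3x3 symmetric matrices, in line with the NxN dimension stated above.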
References
Belanche, L. A., and Villegas, M. A. (2013). Kernel functions for categorical variables with application to problems in the life sciences. Artificial Intelligence Research and Development (pp. 171-180). IOS Press.
Examples
# Categorical data
summary(CO2)
Kdirac <- Dirac(CO2[,1:3])
## Display a subset of the kernel matrix:
Kdirac[c(1,15,50,65),c(1,15,50,65)]