eskin {nomclust} | R Documentation |
Calculates a proximity (dissimilarity) matrix based on the Eskin (ES) similarity measure.
eskin(data)
data |
A data.frame or a matrix with cases in rows and variables in columns. |
The Eskin similarity measure was proposed by Eskin et al. (2002) and examined by Boriah et al. (2008). It is constructed to assign higher weights to mismatches on variables with more categories.
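To make the weighting concrete, the following is a minimal sketch of the per-variable rule as described by Boriah et al. (2008): a match scores 1, and a mismatch on a variable with n categories scores n^2 / (n^2 + 2), so mismatches count for more on variables with more categories. The helper name eskin_similarity and the toy data are hypothetical and are not part of nomclust; averaging over variables and the 1/S - 1 conversion to a dissimilarity are assumptions made for illustration, not a statement of how eskin() is implemented.

```r
# Hypothetical helper (not part of nomclust): Eskin similarity of two cases.
# x, y: vectors of category values, one element per variable
# n_categories: number of categories observed for each variable
eskin_similarity <- function(x, y, n_categories) {
  # 1 for a match; n^2 / (n^2 + 2) for a mismatch
  per_variable <- ifelse(x == y, 1, n_categories^2 / (n_categories^2 + 2))
  # Assumption: overall similarity is the mean over variables
  mean(per_variable)
}

# Toy data: three cases in rows, two nominal variables in columns
toy <- matrix(c("red",  "S",
                "blue", "M",
                "red",  "L"),
              ncol = 2, byrow = TRUE)

# Number of categories per variable (2 colours, 3 sizes)
n_cat <- apply(toy, 2, function(col) length(unique(col)))

# Cases 1 and 2 differ on both variables; cases 1 and 3 differ only on size
sim_12 <- eskin_similarity(toy[1, ], toy[2, ], n_cat)
sim_13 <- eskin_similarity(toy[1, ], toy[3, ], n_cat)

# Assumed conversion from similarity to dissimilarity
dis_12 <- 1 / sim_12 - 1
dis_13 <- 1 / sim_13 - 1
```

In this sketch, cases 1 and 3 come out more similar than cases 1 and 2 because they agree on one variable, and a mismatch on the three-category variable scores higher than one on the two-category variable.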
The function returns an object of class "dist".
Zdenek Sulc.
Contact: zdenek.sulc@vse.cz
Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation.
In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, p. 243-254.
Eskin E., Arnold A., Prerau M., Portnoy L. and Stolfo S. (2002). A geometric framework for unsupervised anomaly detection.
In D. Barbara and S. Jajodia (Eds): Applications of Data Mining in Computer Security, p. 78-100. Norwell: Kluwer Academic Publishers.
good1, good2, good3, good4, iof, lin, lin1, of, sm, ve, vm.
# load the package that provides eskin() and data20
library(nomclust)

# sample data
data(data20)

# dissimilarity matrix calculation
prox.eskin <- eskin(data20)