good2 {nomclust} | R Documentation |
Computes a proximity (dissimilarity) matrix based on the Goodall 2 (G2) similarity measure.
good2(data)
data |
A data.frame or a matrix with cases in rows and variables in columns. |
The Goodall 2 similarity measure was introduced by Boriah et al. (2008) as a simple modification of the original Goodall measure (Goodall, 1966). It assigns a higher weight to a match on an infrequent category, provided that the variable also contains other categories that are even less frequent than the matched one.
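The weighting can be illustrated with a minimal base-R sketch. It assumes the definition given in Boriah et al. (2008): a match on category x of variable k scores 1 minus the sum of p_k^2(q) over all categories q that are at least as frequent as x, with p_k^2(q) = f(q)(f(q) - 1) / (N(N - 1)); a mismatch scores 0. Averaging over variables and taking 1 minus the similarity as the dissimilarity are further assumptions. The name good2_sketch is hypothetical and not part of nomclust, and the package implementation may differ in details, so values need not agree exactly with good2().

```r
# Hypothetical helper (not part of nomclust): Goodall 2 dissimilarity sketch.
good2_sketch <- function(data) {
  data <- as.data.frame(data)
  n <- nrow(data)                      # number of cases
  m <- ncol(data)                      # number of variables
  sim <- matrix(0, n, n)               # accumulated per-variable similarities
  for (k in seq_len(m)) {
    x <- as.character(data[[k]])
    tab <- table(x)
    f <- as.numeric(tab)               # absolute frequency of each category
    p2 <- f * (f - 1) / (n * (n - 1))  # estimate p_k^2(q), Boriah et al. (2008)
    # Similarity of a match on category i: 1 minus the sum of p2 over all
    # categories that are at least as frequent as category i.
    s_match <- vapply(seq_along(f),
                      function(i) 1 - sum(p2[f >= f[i]]),
                      numeric(1))
    names(s_match) <- names(tab)
    # Mismatches contribute 0; matches contribute the category's similarity.
    sim <- sim + outer(x, x, "==") * unname(s_match[x])
  }
  # Assumption: average over variables, then dissimilarity = 1 - similarity.
  as.dist(1 - sim / m)
}

# Toy data: one variable with categories a (twice), b and c (once each).
toy <- data.frame(v = c("a", "a", "b", "c"))
d_toy <- good2_sketch(toy)
```

In the toy data, the two "a" rows match on a category with p^2 = 2 * 1 / (4 * 3) = 1/6, so their similarity is 5/6 and their dissimilarity is 1/6; every mismatching pair gets dissimilarity 1.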
The function returns an object of class "dist".
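Because the result is a "dist" object, it can be passed directly to base R functions that accept dissimilarities. The sketch below uses a small hand-made matrix as a stand-in for the output of good2(); the values are illustrative only and were not produced by the package.

```r
# Toy symmetric dissimilarity matrix standing in for the output of good2().
m <- matrix(c(0.0, 0.2, 0.9,
              0.2, 0.0, 0.8,
              0.9, 0.8, 0.0), nrow = 3, byrow = TRUE)
d <- as.dist(m)                      # object of class "dist"

# Hierarchical clustering on the dissimilarities (average linkage).
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)          # cluster membership for two clusters

# Convert back to a full square matrix when individual entries are needed.
full <- as.matrix(d)
```

Here cases 1 and 2 are merged first because they have the smallest dissimilarity, so cutting the tree into two clusters separates case 3 from the other two.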
Zdenek Sulc.
Contact: zdenek.sulc@vse.cz
Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation.
In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, p. 243-254.
Goodall D.W. (1966). A new similarity index based on probability. Biometrics, 22(4), p. 882.
eskin, good1, good3, good4, iof, lin, lin1, of, sm, ve, vm.
# sample data
data(data20)

# dissimilarity matrix calculation
prox.good2 <- good2(data20)