cluster_means {adproclus} | R Documentation |
Cluster Means based on Original Variables
Description
Obtain a cluster-by-variable dataframe where the values are the cluster means
for the given variables. Takes as input a (low dimensional) ADPROCLUS model
of class adpc and a dataset. This dataset must have the same number
of rows as the cluster membership matrix $A$ of the model. The variables can
be different from the ones the model was trained on. The function uses the
cluster membership matrix of the model to compute, per cluster, the mean of
each variable in the dataset; in effect, it computes the column means of the
dataset separately for each cluster. In the output, the last row Cl0
corresponds to the baseline cluster consisting of all the observations that
were not assigned to any cluster, provided this cluster is not empty.
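The computation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function toy_cluster_means and the toy inputs A and X are hypothetical, and the handling of the baseline row Cl0 is an assumption based on the description.

```r
# Hypothetical toy inputs (not taken from the adproclus package).
# A: binary object-by-cluster membership matrix (clusters may overlap).
A <- matrix(c(1, 0,
              1, 1,
              0, 1,
              0, 0),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("Cl1", "Cl2")))
# X: object-by-variable data with the same number of rows as A.
X <- data.frame(v1 = c(1, 3, 5, 7), v2 = c(10, 20, 30, 40))

# Sketch of per-cluster column means (assumed behaviour, for illustration).
toy_cluster_means <- function(data, A, digits = 3) {
  data <- as.matrix(data)
  stopifnot(nrow(data) == nrow(A))
  # Baseline cluster: observations that belong to no cluster at all.
  baseline <- as.numeric(rowSums(A) == 0)
  if (any(baseline == 1)) {
    A <- cbind(A, Cl0 = baseline)  # appended as the last row of the output
  }
  # t(A) %*% data gives per-cluster column sums; colSums(A) gives cluster sizes.
  # Dividing a matrix by a vector of length nrow divides each row by its size.
  means <- (t(A) %*% data) / colSums(A)
  as.data.frame(round(means, digits))
}

res <- toy_cluster_means(X, A)
print(res)
```

Because clusters may overlap, an observation that belongs to two clusters (the second row of A here) contributes to both cluster means. In this sketch an empty non-baseline cluster would yield NaN; how the package handles that case is not stated in this page.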
Usage
cluster_means(data, model, digits = 3)
Arguments
data |
Object-by-variable matrix. May contain variables other than those the ADPROCLUS model was fitted on. IMPORTANT: The number of rows must be equal to the number of observations in the ADPROCLUS model. |
model |
ADPROCLUS solution (class: adpc). |
digits |
Integer. The number of decimal places that all decimal numbers will be rounded to. |
Details
It is worth noting that the output of this function is different
from the last output matrix in the summary() method applied to an
ADPROCLUS model. The former computes the means over the original variable
values, while the latter computes them over the approximated model variable
values.
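The distinction can be illustrated with a small base R sketch. It assumes the standard ADPROCLUS approximation of the data by the product of the membership matrix A and a cluster-by-variable profile matrix P; the toy matrices and the helper means_of are hypothetical, and whether summary() computes its last matrix in exactly this way is an assumption drawn from the text above.

```r
# Hypothetical toy model: membership matrix A and cluster profiles P.
A <- matrix(c(1, 0,
              1, 1,
              0, 1),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("Cl1", "Cl2")))
P <- matrix(c(2, 10,
              4, 20),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("Cl1", "Cl2"), c("v1", "v2")))

# Hypothetical original data, close to but not equal to the model values.
X <- matrix(c(2.5,  9,
              6.0, 31,
              3.5, 20),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("v1", "v2")))

# Approximated model values: each row is the sum of its clusters' profiles.
M <- A %*% P

# Per-cluster column means of an object-by-variable matrix.
means_of <- function(values, A) (t(A) %*% values) / colSums(A)

means_original <- means_of(X, A)  # analogous to what cluster_means() reports
means_model    <- means_of(M, A)  # means over the approximated values

print(means_original)
print(means_model)
```

The two matrices differ wherever the model does not reproduce the data exactly, which is why the two outputs should not be expected to match.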
Value
Cluster-by-variable dataframe where the values are the cluster means for the given variables.
Examples
# Obtain data, compute model, report cluster means
x <- CGdata
model <- adproclus(x, 3)
cluster_means(data = x, model = model)