hcsvd {bdsvd} | R Documentation |
Hierarchical Variable Clustering Using Singular Vectors (HC-SVD).
Description
Performs HC-SVD to reveal the hierarchical variable structure as described in Bauer (202X). In this divisive approach, each cluster is split into two clusters iteratively. Potential splits are identified by the first sparse loadings, which are sparse approximations (i.e., vectors with many zero values) of the first right eigenvectors of the correlation matrix and which mirror the masked shape of the correlation matrix. The procedure continues until each variable lies in its own cluster.
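The idea of reading a split off the zero pattern of a loading vector can be illustrated on a toy correlation matrix. This is only a sketch in base R: the matrix, the cutoff of 0.5, and the use of hard thresholding are assumptions made for illustration and are not the sparsification used by hcsvd.

```r
# Toy 4 x 4 correlation matrix with two blocks of variables: {1, 2} and {3, 4}.
R <- matrix(0.05, 4, 4)
R[1:2, 1:2] <- 0.8
R[3:4, 3:4] <- 0.6
diag(R) <- 1

# First eigenvector of R (for a symmetric positive semi-definite matrix this
# coincides, up to sign, with the first right singular vector).
v <- eigen(R, symmetric = TRUE)$vectors[, 1]

# Crude sparsification by hard thresholding (illustrative stand-in only).
v.sparse <- ifelse(abs(v) > 0.5, v, 0)

# Variables with non-zero entries form one candidate cluster, the rest the other.
clusters <- list(nonzero = which(v.sparse != 0), zero = which(v.sparse == 0))
clusters
```

In this toy case the non-zero entries pick out variables 1 and 2, so the candidate split separates the two blocks.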
Usage
hcsvd(
R,
q = "Kaiser",
linkage = "average",
is.corr = TRUE,
max.iter,
trace = TRUE
)
Arguments
R |
A correlation matrix of dimension |
q |
Number of sparse loadings to be used. This should be either a numeric value between zero and one to indicate percentages, or |
linkage |
The linkage function to be used. This should be one of |
is.corr |
Is the supplied object a correlation matrix. Default is |
max.iter |
How many iterations should be performed for computing the sparse loadings.
Default is |
trace |
Print out progress as |
Details
The sparse loadings are computed using the method of Shen and Huang (2008). The implementation is based on the code of Baglama, Reichel, and Lewis in ssvd {irlba}, with slight modifications to suit our method.
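For orientation, a rank-one soft-thresholding iteration in the spirit of Shen and Huang (2008) can be sketched in base R as follows. This is not the code used by hcsvd or by ssvd {irlba}. The function names soft and sparse_loading, the fixed penalty lambda, and the stopping rule are hypothetical choices made for this sketch.

```r
# Soft-thresholding operator: shrinks entries towards zero by lambda.
soft <- function(x, lambda) sign(x) * pmax(abs(x) - lambda, 0)

# Hypothetical helper: one sparse loading vector of X for a fixed penalty lambda.
sparse_loading <- function(X, lambda, max.iter = 100, tol = 1e-8) {
  s <- svd(X, nu = 1, nv = 1)          # start from the best rank-one approximation
  u <- s$u[, 1]
  v <- s$d[1] * s$v[, 1]
  for (i in seq_len(max.iter)) {
    v.new <- soft(drop(crossprod(X, u)), lambda)   # threshold t(X) %*% u
    if (all(v.new == 0)) {             # penalty too large: everything is zeroed
      v <- v.new
      break
    }
    Xv <- drop(X %*% v.new)
    u <- Xv / sqrt(sum(Xv^2))          # update and normalize the left vector
    converged <- sqrt(sum((v.new - v)^2)) < tol
    v <- v.new
    if (converged) break
  }
  nrm <- sqrt(sum(v^2))
  if (nrm > 0) v / nrm else v          # return a unit-length loading if non-zero
}

# Toy correlation matrix with two blocks of variables: {1, 2} and {3, 4}.
R.toy <- matrix(0.05, 4, 4)
R.toy[1:2, 1:2] <- 0.8
R.toy[3:4, 3:4] <- 0.6
diag(R.toy) <- 1

v.hat <- sparse_loading(R.toy, lambda = 0.6)
v.hat
```

With this penalty the entries for variables 3 and 4 are set exactly to zero, while the entries for variables 1 and 2 remain non-zero. In practice the penalty has to be chosen, which this sketch does not address.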
Value
A list with four components:
hclust |
The clustering structure identified by HC-SVD as an object of type |
dist.matrix |
The ultrametric distance matrix (cophenetic matrix) of the HC-SVD structure as an object of class |
u.cor |
The ultrametric correlation matrix of |
q.p |
A vector of length |
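The notions of a hierarchy object and an ultrametric (cophenetic) distance matrix can be illustrated with base R alone. The sketch below does not call hcsvd. The toy data, the dissimilarity 1 - |correlation|, and the use of stats::hclust are assumptions made for illustration only.

```r
# Toy data: 50 observations of 4 variables.
set.seed(1)
X.toy <- matrix(rnorm(200), 50, 4)

# Illustrative dissimilarity between variables and a standard agglomerative tree.
d <- as.dist(1 - abs(cor(X.toy)))
hc.toy <- hclust(d, method = "average")

# Cophenetic distances of the tree, returned as an object of class "dist".
coph <- cophenetic(hc.toy)

# Check the ultrametric inequality d(i, k) <= max(d(i, j), d(j, k)) for all triples.
D <- as.matrix(coph)
p <- nrow(D)
is.ultrametric <- TRUE
for (i in 1:p) for (j in 1:p) for (k in 1:p) {
  if (D[i, k] > max(D[i, j], D[j, k])) is.ultrametric <- FALSE
}
is.ultrametric

# Cluster memberships at a chosen level of the hierarchy.
groups <- cutree(hc.toy, k = 2)
groups
```

The same base R tools (plot, cutree, cophenetic) apply to any object of class hclust.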
References
Bauer, J.O. (202X). Divisive hierarchical clustering identified by singular vectors.
Shen, H. and Huang, J.Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation, J. Multivar. Anal. 99, 1015–1034.
Examples
# We replicate simulation study (a) in Bauer (202X)
## Not run:
p <- 40
n <- 500
b <- 5
design <- "a"
set.seed(1)
Rho <- hcsvd.cor.sim(p = p, b = b, design = design)
X <- mvtnorm::rmvnorm(n, mean = rep(0, p), sigma = Rho, checkSymmetry = FALSE)
R <- cor(X)
hcsvd.obj <- hcsvd(R)
# The hclust object and its dendrogram can be obtained
# directly from hcsvd.obj$hclust:
hc <- hcsvd.obj$hclust
plot(hc)
# The dendrogram can also be obtained from the ultrametric distance matrix:
plot(hclust(hcsvd.obj$dist.matrix))
## End(Not run)