glhtbf_zz2022 {HDNRA} | R Documentation |
Zhang and Zhu (2022)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data under heteroscedasticity.
glhtbf_zz2022(Y, G, n, p)
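Y | A list of $k$ data matrices, one per sample; in the example at the end of this page each matrix is $p\times n_i$, with one observation per column. |
G | A known full-rank coefficient matrix ($q\times k$) with $\operatorname{rank}(\boldsymbol{G})<k$. |
n | A vector of $k$ sample sizes, $(n_1,\ldots,n_k)$. |
p | The dimension of data. |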
Suppose we have the following $k$ independent high-dimensional samples:
$$\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i} \text{ are i.i.d. with } \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,\quad i=1,\ldots,k.$$
It is of interest to test the following GLHT problem:
$$H_0: \boldsymbol{G}\boldsymbol{M}=\boldsymbol{0} \quad \text{vs.} \quad H_1: \boldsymbol{G}\boldsymbol{M} \neq \boldsymbol{0},$$
where $\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top$ is a $k\times p$ matrix collecting the $k$ mean vectors and $\boldsymbol{G}$ is a known full-rank $q\times k$ coefficient matrix with $\operatorname{rank}(\boldsymbol{G})<k$.
Let $\bar{\boldsymbol{y}}_{i}$, $i=1,\ldots,k$, be the sample mean vectors and $\hat{\boldsymbol{\Sigma}}_i$, $i=1,\ldots,k$, be the sample covariance matrices.
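As an illustration of how a hypothesis is encoded in the coefficient matrix, the sketch below builds the same contrast matrix that the example at the end of this page uses, which corresponds to the one-way MANOVA hypothesis that all $k$ mean vectors are equal. The small dimension `p` and the matrices `M_null` and `M_alt` are assumptions made only for this sketch.

```r
k <- 3
p <- 4   # small illustrative dimension (an assumption for this sketch)

# q x k contrast matrix with q = k - 1: row i encodes mu_i - mu_k
G <- cbind(diag(k - 1), rep(-1, k - 1))
qr(G)$rank                       # rank of G; equals q = k - 1 < k

# Mean matrix with identical rows: H0 holds, so G %*% M is the zero matrix
M_null <- matrix(rep(1:p, each = k), nrow = k, ncol = p)
max(abs(G %*% M_null))

# Shift the first mean vector: H0 fails and G %*% M has a nonzero row
M_alt <- M_null
M_alt[1, ] <- M_alt[1, ] + 0.5
G %*% M_alt
```

Other choices of `G` with full row rank express other linear hypotheses about the rows of `M`, for example a single row `c(1, -1, 0)` to compare only the first two means.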
Zhang and Zhu (2022) proposed the following U-statistic-based test statistic:
$$T_{ZZ}=\|\boldsymbol{C}\hat{\boldsymbol{\mu}}\|^2-\sum_{i=1}^k h_{ii}\operatorname{tr}(\hat{\boldsymbol{\Sigma}}_i)/n_i,$$
where $\hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top$ stacks the sample mean vectors, $\boldsymbol{C}=[(\boldsymbol{G}\boldsymbol{D}\boldsymbol{G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p$, $\boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)$, and $h_{ij}$ is the $(i,j)$th entry of the $k\times k$ matrix $\boldsymbol{H}=\boldsymbol{G}^\top(\boldsymbol{G}\boldsymbol{D}\boldsymbol{G}^\top)^{-1}\boldsymbol{G}$.
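The statistic can be evaluated without forming the Kronecker product, because $\|\boldsymbol{C}\hat{\boldsymbol{\mu}}\|^2=\sum_{i,j}h_{ij}\bar{\boldsymbol{y}}_i^\top\bar{\boldsymbol{y}}_j$ when the sample means are stacked into one vector of length $kp$. The following is a minimal sketch of that computation, not the package's implementation: `tzz_statistic` is a hypothetical helper, each element of `Y` is assumed to be a $p\times n_i$ matrix with observations in columns, and the simulated data are illustrative only. The second half cross-checks the result against the literal Kronecker form.

```r
# Hypothetical helper (not part of HDNRA): evaluates T_ZZ from its definition.
# Assumes each Y[[i]] is a p x n_i matrix with one observation per column.
tzz_statistic <- function(Y, G, n) {
  D <- diag(1 / n, nrow = length(n))               # D = diag(1/n_1, ..., 1/n_k)
  H <- t(G) %*% solve(G %*% D %*% t(G)) %*% G      # k x k matrix H
  Mhat <- do.call(rbind, lapply(Y, rowMeans))      # k x p matrix of sample means
  # tr(Sigma_hat_i) is the sum of the p per-variable sample variances
  tr_S <- sapply(Y, function(y) sum(apply(y, 1, var)))
  # ||C mu_hat||^2 = sum_{i,j} h_ij * ybar_i' ybar_j
  sum(H * (Mhat %*% t(Mhat))) - sum(diag(H) * tr_S / n)
}

# Small simulated data set (illustrative sizes only)
set.seed(1)
k <- 3; p <- 4; n <- c(5, 6, 7)
Y <- lapply(n, function(ni) matrix(rnorm(p * ni), nrow = p, ncol = ni))
G <- cbind(diag(k - 1), rep(-1, k - 1))
T_zz <- tzz_statistic(Y, G, n)

# Cross-check against the literal Kronecker form C = [(G D G')^{-1/2} G] %x% I_p
e <- eigen(G %*% diag(1 / n) %*% t(G), symmetric = TRUE)
inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
C <- kronecker(inv_sqrt %*% G, diag(p))
mu_hat <- unlist(lapply(Y, rowMeans))              # stacked sample means, length k * p
tr_S <- sapply(Y, function(y) sum(apply(y, 1, var)))
H <- t(G) %*% solve(G %*% diag(1 / n) %*% t(G)) %*% G
T_kron <- sum((C %*% mu_hat)^2) - sum(diag(H) * tr_S / n)
all.equal(T_zz, T_kron)
```

Working with the $k\times k$ matrix $\boldsymbol{H}$ keeps the cost linear in $p$, whereas the explicit matrix $\boldsymbol{C}$ has $qp\times kp$ entries and becomes impractical for large $p$.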
A (list) object of S3 class htest containing the following elements:
the $p$-value of the test proposed by Zhang and Zhu (2022).
the test statistic proposed by Zhang and Zhu (2022).
the parameter used in Zhang and Zhu (2022)'s test.
the parameter used in Zhang and Zhu (2022)'s test.
estimated approximate degrees of freedom of Zhang and Zhu (2022)'s test.
Zhang J, Zhu T (2022). “A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA.” Computational Statistics & Data Analysis, 168, 107385. doi:10.1016/j.csda.2021.107385.
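library(HDNRA)
set.seed(1234)
k <- 3                               # number of samples
p <- 50                              # dimension of each observation
n <- c(25, 30, 40)                   # sample sizes
rho <- 0.1                           # common correlation between variables
M <- matrix(0, nrow = k, ncol = p)   # all mean vectors are zero, so H0 holds
avec <- seq(1, k)                    # variance scale of each sample
Y <- list()
for (g in 1:k) {
  a <- avec[g]
  # Gamma is the symmetric square root of a * ((1 - rho) * I + rho * J),
  # with off-diagonal entries y and diagonal entries x
  y <- (-2 * sqrt(a * (1 - rho)) + sqrt(4 * a * (1 - rho) + 4 * p * a * rho)) / (2 * p)
  x <- y + sqrt(a * (1 - rho))
  Gamma <- matrix(rep(y, p * p), nrow = p)
  diag(Gamma) <- rep(x, p)
  # p x n[g] data matrix: correlated noise plus the mean vector in every column
  Z <- matrix(rnorm(n[g] * p, mean = 0, sd = 1), p, n[g])
  Y[[g]] <- Gamma %*% Z + t(t(M[g, ])) %*% (rep(1, n[g]))
}
# Contrast matrix for H0: mu_1 = mu_2 = mu_3
G <- cbind(diag(k - 1), rep(-1, k - 1))
glhtbf_zz2022(Y, G, n, p)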