tsbf_zz2022 {HDNRA} | R Documentation |
Zhang and Zhu (2022)'s test for the equality of two high-dimensional mean vectors, without assuming that the two covariance matrices are equal.
tsbf_zz2022(y1, y2)
y1 |
The data matrix (p by n1) from the first population. Each column represents a |
y2 |
The data matrix (p by n2) from the second population. Each column represents a |
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang and Zhu (2022) proposed the following test statistic:
T_{ZZ} = \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2-\operatorname{tr}(\hat{\boldsymbol{\Omega}}_n),
where \bar{\boldsymbol{y}}_{i},i=1,2 are the sample mean vectors and \hat{\boldsymbol{\Omega}}_n is the estimator of \operatorname{Cov}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2).
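As a rough illustration of the statistic, the following R sketch computes T_{ZZ} by hand. It assumes that \hat{\boldsymbol{\Omega}}_n is the plug-in estimator \hat{\boldsymbol{\Sigma}}_1/n_1 + \hat{\boldsymbol{\Sigma}}_2/n_2 built from the usual sample covariance matrices; this page does not spell out the estimator, so treat that choice as an assumption. The helper name tzz_statistic is hypothetical and is not part of HDNRA.

```r
# Hypothetical helper (not part of HDNRA): T_ZZ from two p x n data matrices,
# with columns as observations.
tzz_statistic <- function(y1, y2) {
  n1 <- ncol(y1)
  n2 <- ncol(y2)
  # Sample mean vectors: one entry per variable (row).
  ybar1 <- rowMeans(y1)
  ybar2 <- rowMeans(y2)
  # tr(S_i) equals the sum of the p row-wise sample variances, so the
  # p x p sample covariance matrices never need to be formed.
  tr_S1 <- sum(apply(y1, 1, var))
  tr_S2 <- sum(apply(y2, 1, var))
  # Assumed plug-in estimator: tr(Omega_hat) = tr(S1)/n1 + tr(S2)/n2.
  tr_omega_hat <- tr_S1 / n1 + tr_S2 / n2
  sum((ybar1 - ybar2)^2) - tr_omega_hat
}

set.seed(1)
y1 <- matrix(rnorm(50 * 20), nrow = 50)          # p = 50, n1 = 20
y2 <- matrix(rnorm(50 * 30, sd = 2), nrow = 50)  # p = 50, n2 = 30
tzz <- tzz_statistic(y1, y2)
```

Working with traces only keeps the cost linear in p, which matters when p is much larger than the sample sizes.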
They showed that, under the null hypothesis, T_{ZZ} and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
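To make the chi-squared-type mixture more concrete, here is a sketch of one common way to approximate such a mixture, namely matching its first three cumulants to those of a shifted and scaled chi-squared variable. The symbols \lambda_r, A_r, \beta_0, \beta_1 and d are notation introduced here for illustration, and the derivation is an assumption about the general approach rather than a restatement of the paper.

```latex
% Omega_n = Sigma_1/n_1 + Sigma_2/n_2 with eigenvalues lambda_1, ..., lambda_p.
\begin{align*}
T^{*} &= \sum_{r=1}^{p} \lambda_r (A_r - 1), \qquad
  A_1,\ldots,A_p \;\text{i.i.d.}\; \chi^2_1, \\
\operatorname{E}(T^{*}) &= 0, \qquad
\operatorname{Var}(T^{*}) = 2\operatorname{tr}(\boldsymbol{\Omega}_n^2), \qquad
\kappa_3(T^{*}) = 8\operatorname{tr}(\boldsymbol{\Omega}_n^3). \\
\intertext{Matching these to the cumulants of $\beta_0 + \beta_1 \chi^2_d$, i.e.
$\beta_0 + \beta_1 d = 0$, $2\beta_1^2 d = 2\operatorname{tr}(\boldsymbol{\Omega}_n^2)$,
$8\beta_1^3 d = 8\operatorname{tr}(\boldsymbol{\Omega}_n^3)$, gives}
\beta_1 &= \frac{\operatorname{tr}(\boldsymbol{\Omega}_n^3)}{\operatorname{tr}(\boldsymbol{\Omega}_n^2)}, \qquad
d = \frac{\operatorname{tr}^3(\boldsymbol{\Omega}_n^2)}{\operatorname{tr}^2(\boldsymbol{\Omega}_n^3)}, \qquad
\beta_0 = -\frac{\operatorname{tr}^2(\boldsymbol{\Omega}_n^2)}{\operatorname{tr}(\boldsymbol{\Omega}_n^3)}.
\end{align*}
```

In practice the traces would have to be replaced by estimates. A large d corresponds to a nearly normal limit, while a small d corresponds to a skewed, non-normal one.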
A (list) object of S3 class htest containing the following elements:
the p-value of the test proposed by Zhang and Zhu (2022).
the test statistic proposed by Zhang and Zhu (2022).
parameter used in Zhang and Zhu (2022)'s test.
parameter used in Zhang and Zhu (2022)'s test.
estimated approximate degrees of freedom of Zhang and Zhu (2022)'s test.
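As a generic illustration of working with an htest object, the sketch below uses base R's t.test as a stand-in, since the element names of the tsbf_zz2022 result are not shown on this page. It assumes the result follows the usual htest conventions; names() can be used on the actual result to confirm which elements it carries.

```r
set.seed(1)
# Base-R stand-in that also returns an object of class "htest".
res <- t.test(rnorm(20), rnorm(30))

class(res)        # class of the returned object
names(res)        # lists the elements actually present
res$statistic     # test statistic
res$parameter     # parameter(s) of the reference distribution
res$p.value       # p-value
```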
Zhang J, Zhu T (2022). “A further study on Chen-Qin’s test for two-sample Behrens–Fisher problems for high-dimensional data.” Journal of Statistical Theory and Practice, 16(1), 1. doi:10.1007/s42519-021-00232-w.
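library(HDNRA)
set.seed(1234)
# Sample sizes and dimension
n1 <- 20
n2 <- 30
p <- 50
# Equal mean vectors (p x 1 column matrices), so the null hypothesis holds
mu1 <- t(t(rep(0, p)))
mu2 <- mu1
# Correlation (rho) and variance (a) parameters for each population
rho1 <- 0.1
rho2 <- 0.2
a1 <- 1
a2 <- 2
# Gamma1 %*% Gamma1 equals a1 * ((1 - rho1) * I + rho1 * J), with J the all-ones matrix
w1 <- (-2 * sqrt(a1 * (1 - rho1)) + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)
x1 <- w1 + sqrt(a1 * (1 - rho1))
Gamma1 <- matrix(rep(w1, p * p), nrow = p)
diag(Gamma1) <- rep(x1, p)
# Same construction for the second population
w2 <- (-2 * sqrt(a2 * (1 - rho2)) + sqrt(4 * a2 * (1 - rho2) + 4 * p * a2 * rho2)) / (2 * p)
x2 <- w2 + sqrt(a2 * (1 - rho2))
Gamma2 <- matrix(rep(w2, p * p), nrow = p)
diag(Gamma2) <- rep(x2, p)
# Standard normal noise, one column per observation
Z1 <- matrix(rnorm(n1 * p, mean = 0, sd = 1), p, n1)
Z2 <- matrix(rnorm(n2 * p, mean = 0, sd = 1), p, n2)
# Data matrices (p x n1 and p x n2) with unequal covariance matrices
y1 <- Gamma1 %*% Z1 + mu1 %*% (rep(1, n1))
y2 <- Gamma2 %*% Z2 + mu2 %*% (rep(1, n2))
tsbf_zz2022(y1, y2)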
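The Gamma matrices in the example are built so that each is a symmetric square root of a compound-symmetric covariance matrix, a * ((1 - rho) * I + rho * J), where J is the all-ones matrix. The following base-R sketch checks this numerically for the first population; the names s1, Sigma1, target and max_abs_diff are introduced here for illustration.

```r
p <- 50
a1 <- 1
rho1 <- 0.1

# Same construction as in the example above.
s1 <- sqrt(a1 * (1 - rho1))
w1 <- (-2 * s1 + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)
Gamma1 <- matrix(w1, p, p)
diag(Gamma1) <- w1 + s1

# Covariance implied by y = Gamma1 %*% z with z standard normal.
Sigma1 <- Gamma1 %*% t(Gamma1)

# Compound-symmetric target: variance a1 on the diagonal, a1 * rho1 elsewhere.
target <- a1 * ((1 - rho1) * diag(p) + rho1 * matrix(1, p, p))

max_abs_diff <- max(abs(Sigma1 - target))
```

The check works because w1 is the positive root of p * w^2 + 2 * s1 * w - a1 * rho1 = 0, which is the condition for the off-diagonal entries of Gamma1 %*% Gamma1 to equal a1 * rho1.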