tsbf_zzz2023 {HDNRA} | R Documentation
Zhang et al. (2023)'s test for the equality of two high-dimensional mean vectors, without assuming that the two covariance matrices are equal.
tsbf_zzz2023(y1, y2, cutoff)
y1 | The data matrix (p by n1) from the first population. Each column represents a p-dimensional observation.
y2 | The data matrix (p by n2) from the second population. Each column represents a p-dimensional observation.
cutoff | An empirical criterion for applying the adjustment coefficient.
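Data are often stored with one observation per row, whereas y1 and y2 are expected to be p by n1 and p by n2. The following is a minimal sketch of the reshaping step, assuming hypothetical n-by-p matrices x1 and x2 (these names and the simulated values are illustrative only and do not come from the package):

```r
# Hypothetical raw data: rows are observations, columns are variables.
set.seed(1)
p <- 5
x1 <- matrix(rnorm(20 * p), nrow = 20, ncol = p)  # 20 observations
x2 <- matrix(rnorm(30 * p), nrow = 30, ncol = p)  # 30 observations

# Transpose so that each column holds one p-dimensional observation.
y1 <- t(x1)
y2 <- t(x2)

# Both samples must share the same dimension p (number of rows).
stopifnot(nrow(y1) == nrow(y2))
dim(y1)
dim(y2)
```

The transposed matrices can then be passed as the y1 and y2 arguments.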
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang et al. (2023) proposed the following test statistic:
T_{ZZZ}=\frac{n_1 n_2}{np}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2)^{\top} \hat{\boldsymbol{D}}_n^{-1}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2),
where \bar{\boldsymbol{y}}_{i}, i=1,2, are the sample mean vectors, and \hat{\boldsymbol{D}}_n=\operatorname{diag}(\hat{\boldsymbol{\Sigma}}_1/n+\hat{\boldsymbol{\Sigma}}_2/n) with n=n_1+n_2.
They showed that, under the null hypothesis, T_{ZZZ} and a chi-squared-type mixture have the same limiting distribution.
A (list) object of S3 class htest containing the following elements:
the p-value of Zhang et al. (2023)'s test.
the test statistic of Zhang et al. (2023)'s test.
the estimated approximate degrees of freedom of Zhang et al. (2023)'s test.
the adjustment coefficient used in Zhang et al. (2023)'s test.
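As a rough illustration of how an htest object can be inspected, the sketch below uses a stand-in object produced by base R's t.test rather than the output of tsbf_zzz2023. The component names p.value, statistic and parameter are the usual htest conventions; whether this function uses exactly the same names is an assumption and is best checked with names() on the returned object.

```r
# Stand-in htest object from base R; NOT the output of tsbf_zzz2023().
set.seed(1)
res <- t.test(rnorm(10), rnorm(12))

inherits(res, "htest")   # confirm the S3 class
names(res)               # list the components actually present

# Conventional htest components (assumed, not confirmed for tsbf_zzz2023):
res$p.value              # p-value
unname(res$statistic)    # test statistic
unname(res$parameter)    # degrees of freedom
```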
Zhang L, Zhu T, Zhang J (2023). “Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference scale-invariant test.” Journal of Applied Statistics, 50(3), 456–476. doi:10.1080/02664763.2020.1834516.
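library("HDNRA")
set.seed(1234)
# Sample sizes and dimension
n1 <- 20
n2 <- 30
p <- 50
# Equal mean vectors (p by 1), so the null hypothesis holds
mu1 <- t(t(rep(0, p)))
mu2 <- mu1
# Correlation (rho) and variance (a) parameters for each population
rho1 <- 0.1
rho2 <- 0.2
a1 <- 1
a2 <- 2
# Gamma_i is symmetric with Gamma_i %*% Gamma_i = a_i * ((1 - rho_i) * I + rho_i * J),
# where J is the p by p all-ones matrix
w1 <- (-2 * sqrt(a1 * (1 - rho1)) + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)
x1 <- w1 + sqrt(a1 * (1 - rho1))
Gamma1 <- matrix(rep(w1, p * p), nrow = p)
diag(Gamma1) <- rep(x1, p)
w2 <- (-2 * sqrt(a2 * (1 - rho2)) + sqrt(4 * a2 * (1 - rho2) + 4 * p * a2 * rho2)) / (2 * p)
x2 <- w2 + sqrt(a2 * (1 - rho2))
Gamma2 <- matrix(rep(w2, p * p), nrow = p)
diag(Gamma2) <- rep(x2, p)
# Standard normal noise, one column per observation
Z1 <- matrix(rnorm(n1 * p, mean = 0, sd = 1), p, n1)
Z2 <- matrix(rnorm(n2 * p, mean = 0, sd = 1), p, n2)
# Simulated samples (p by n1 and p by n2)
y1 <- Gamma1 %*% Z1 + mu1 %*% (rep(1, n1))
y2 <- Gamma2 %*% Z2 + mu2 %*% (rep(1, n2))
tsbf_zzz2023(y1, y2, cutoff = 1.2)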
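The formulas for w1 and x1 in the example are chosen so that each simulated sample has a compound-symmetric covariance matrix, a_i * ((1 - rho_i) * I + rho_i * J), with J the all-ones matrix. This follows from expanding the square of the symmetric matrix Gamma_i. The sketch below checks the identity numerically for the first population using base R only; the names c1, Sigma1 and target are introduced here for illustration and are not part of the original example.

```r
# Rebuild Gamma1 as in the example and check the covariance it induces.
p <- 50
rho1 <- 0.1
a1 <- 1
c1 <- sqrt(a1 * (1 - rho1))  # illustrative helper, not in the original example
w1 <- (-2 * c1 + sqrt(4 * a1 * (1 - rho1) + 4 * p * a1 * rho1)) / (2 * p)
Gamma1 <- matrix(w1, nrow = p, ncol = p)
diag(Gamma1) <- w1 + c1

# Covariance of Gamma1 %*% z when z has identity covariance.
Sigma1 <- Gamma1 %*% t(Gamma1)

# Compound-symmetric target: variance a1 on the diagonal, a1 * rho1 elsewhere.
target <- a1 * ((1 - rho1) * diag(p) + rho1 * matrix(1, p, p))

# Compare up to floating-point tolerance.
all.equal(Sigma1, target)
```

The same check applies to the second population with rho2 and a2, so the two samples have different covariance matrices while sharing the same mean vector.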