MMGFM {MMGFM} | R Documentation |
Fit the high-dimensional multi-study multi-modality covariate-augmented generalized factor model via variational inference.
MMGFM(
XList,
ZList,
numvarmat,
tauList = NULL,
q = 15,
qsvec = rep(2, length(XList)),
init = c("MSFRVI", "random", "LFM"),
epsELBO = 1e-12,
maxIter = 30,
verbose = TRUE,
seed = 1
)
XList |
an S-length list with one component per source/study. Each component is an m-length list of combined modality matrices, one per modality type, where each matrix combines the observed modalities of the same type and m is the number of modality types. |
ZList |
an S-length list in which each component is the covariate matrix of one study. |
numvarmat |
an m-by-T matrix, with the modality types as row names, that specifies the number of variables in each modality of each modality type, where m is the number of modality types and T is the maximum number of modalities within any one modality type. |
tauList |
an optional S-length list in which each component is an m-length list corresponding to the offset term for each combined modality of each study; defaults to all-zero matrices. |
q |
an optional integer specifying the number of study-shared factors; the default is 15. |
qsvec |
an integer vector of length S specifying the number of study-specified factors for each study; the default is 2 for every study. |
init |
an optional string specifying the initialization method; one of "MSFRVI", "random" and "LFM". The default is "MSFRVI". |
epsELBO |
an optional positive value, the tolerance on the relative change of the evidence lower bound (ELBO) value; the default is 1e-12. |
maxIter |
the maximum number of iterations of the VEM algorithm. The default is 30. |
verbose |
a logical value indicating whether to print progress information at each iteration; the default is TRUE. |
seed |
an optional integer specifying the random seed used for reproducible initialization; the default is 1. |
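The layout of XList, ZList and numvarmat described above can be sketched in base R without calling the package. This is only an illustration under stated assumptions: the sizes, the random data, the zero padding of unused cells in numvarmat, and the helper name make_study are all hypothetical and are not taken from the package.

```r
set.seed(1)

# Hypothetical sizes: S = 3 studies, m = 3 modality types, d = 3 covariates.
nvec <- c(100, 120, 100)
pveclist <- list(gaussian = 150, poisson = c(50, 50), binomial = c(60, 60))
d <- 3

# numvarmat: one row per modality type, one column per modality.
# Assumption: cells for modalities that do not exist are padded with 0.
Tmax <- max(lengths(pveclist))
numvarmat <- matrix(0, nrow = length(pveclist), ncol = Tmax,
                    dimnames = list(names(pveclist), NULL))
for (k in seq_along(pveclist)) {
  numvarmat[k, seq_along(pveclist[[k]])] <- pveclist[[k]]
}

# Total number of columns of the combined matrix for each modality type.
ptotal <- rowSums(numvarmat)

# make_study() is a hypothetical helper: it returns an m-length list with one
# combined matrix per modality type, in the row order of numvarmat.
make_study <- function(n) {
  list(
    matrix(rnorm(n * ptotal[["gaussian"]]), nrow = n),
    matrix(rpois(n * ptotal[["poisson"]], lambda = 2), nrow = n),
    matrix(rbinom(n * ptotal[["binomial"]], size = 1, prob = 0.5), nrow = n)
  )
}

XList <- lapply(nvec, make_study)                                   # S-length list
ZList <- lapply(nvec, function(n) matrix(rnorm(n * d), nrow = n))   # S-length list

# Consistency checks implied by the argument descriptions.
stopifnot(length(XList) == length(ZList))
for (s in seq_along(XList)) {
  stopifnot(length(XList[[s]]) == nrow(numvarmat))
  stopifnot(all(sapply(XList[[s]], ncol) == ptotal))
  stopifnot(all(sapply(XList[[s]], nrow) == nrow(ZList[[s]])))
}
```

Objects shaped like these would then be passed as the first three arguments of MMGFM. The random matrices here carry no factor structure, so they only demonstrate the expected shapes.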
If init="MSFRVI", the results of the multi-study linear factor model in the MultiCOAP package are used as initial values; if init="LFM", the results of a linear factor model fitted to the data combined from all studies are used as initial values.
Returns a list including the following components:
hbeta
- an M-length list composed of the estimated regression coefficient matrix for each modality;
hA
- an M-length list composed of the loading matrix corresponding to the study-shared factors for each modality;
hB
- an S-length list composed of an M-length list of loading matrices corresponding to the study-specified factors for each study;
hF
- an S-length list composed of the posterior estimate of the study-shared factor matrix for each study;
hH
- an S-length list composed of the posterior estimate of the study-specified factor matrix for each study;
hSigma
- an S-length list composed of the estimated posterior variance of the study-shared factor;
hPhi
- an S-length list composed of the estimated posterior variance of the study-specified factor;
hv
- an S-length list composed of an M-length list of vectors corresponding to the posterior estimate of the study-specified and modality variable-shared factor for each study and modality;
hzeta
- the estimated posterior variance of the study-specified and modality variable-shared factor;
hsigma2
- the estimated variance of the study-specified and modality variable-shared factor;
hinvLambda
- an S-length list composed of an M-length list of vectors corresponding to the inverse of the estimated error variances;
S
- the approximated posterior covariance for each row of F;
ELBO
- the ELBO value when algorithm stops;
ELBO_seq
- the sequence of ELBO values;
time_use
- the running time of model fitting.
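The relation between ELBO_seq and the epsELBO tolerance can be illustrated with a small base R sketch. It assumes the relative change is computed as |ELBO_t - ELBO_(t-1)| / |ELBO_(t-1)|; the exact rule used inside MMGFM is not stated here, so both this formula and the helper name first_converged are hypothetical.

```r
# Hypothetical helper, not part of the MMGFM package: returns the index of the
# first element of an ELBO sequence whose relative change from the previous
# element falls below eps, or NA if no element does.
first_converged <- function(elbo_seq, eps = 1e-12) {
  # Assumed definition of the relative change between consecutive values.
  rel_change <- abs(diff(elbo_seq)) / abs(head(elbo_seq, -1))
  hit <- which(rel_change < eps)
  if (length(hit) == 0) NA_integer_ else hit[1] + 1L
}

# Made-up ELBO values for illustration only.
elbo_seq <- c(-1000, -900, -890, -889.9, -889.9)
first_converged(elbo_seq, eps = 1e-3)
first_converged(elbo_seq, eps = 1e-5)
```

With a fitted object, the same helper could be applied to its ELBO_seq component. A result of NA would suggest that, under the assumed rule, the run ended because maxIter was reached rather than because the tolerance was met.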
q <- 3
qsvec <- rep(2, 3)
nvec <- c(100, 120, 100)
pveclist <- list('gaussian' = rep(150, 1), 'poisson' = rep(50, 2), 'binomial' = rep(60, 2))
datlist <- gendata_mmgfm(seed = 1, nvec = nvec, pveclist = pveclist,
                         q = q, d = 3, qs = qsvec, rho = rep(3, length(pveclist)), rho_z = 0.5,
                         sigmavec = rep(0.5, length(pveclist)), sigma_eps = 1)
XList <- datlist$XList
ZList <- datlist$ZList
numvarmat <- datlist$numvarmat
### For illustration, we set maxIter = 3. Set maxIter = 50 for a formal run.
reslist1 <- MMGFM(XList, ZList = ZList, numvarmat = numvarmat, q = q, qsvec = qsvec,
                  init = 'MSFRVI', maxIter = 3)
str(reslist1)