gendata_cmgfm {CMGFM} | R Documentation |
Generate simulated data from the covariate-augmented generalized factor model
gendata_cmgfm(
seed = 1,
n = 300,
pveclist = list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60)),
q = 6,
d = 3,
rho = rep(1, length(pveclist)),
rho_z = 1,
sigmavec = rep(0.5, length(pveclist)),
n_bin = 1,
sigma_eps = 1,
seed.para = 1
)
seed |
a positive integer, the random seed for reproducibility of the data generation process. |
n |
a positive integer, specifying the sample size. |
pveclist |
a named list, specifying the number of modalities for each variable type and the number of variables in each modality. |
q |
a positive integer, specifying the number of modality-shared factors. |
d |
a positive integer, specifying the dimension of the covariate matrix. |
rho |
a numeric vector with length |
rho_z |
a positive real, specifying the signal strength of the covariates. |
sigmavec |
a positive vector with length |
n_bin |
a positive integer, specifying the number of trials in the Binomial distribution. |
sigma_eps |
a positive real, the variance of the overdispersion error. |
seed.para |
a positive integer, the random seed that fixes the regression coefficient vector and loading matrices, so that the data generation process is reproducible. |
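To make the shape of `pveclist` concrete, here is a minimal base-R sketch that does not call `gendata_cmgfm` at all. It only unpacks the default list from the usage section and shows that the defaults for `rho` and `sigmavec` have one element per variable type. The names `n_types`, `n_modalities` and `p_per_type` are illustrative and are not part of the package.

```r
# Default pveclist from the usage section: one entry per variable type,
# one number per modality giving that modality's variable count.
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))

# Number of variable types; this is what length(pveclist) evaluates to.
n_types <- length(pveclist)

# Number of modalities within each type, and total variables per type.
n_modalities <- lengths(pveclist)
p_per_type <- sapply(pveclist, sum)

# The usage defaults for rho and sigmavec are built with length(pveclist),
# so each has one element per variable type.
rho <- rep(1, n_types)
sigmavec <- rep(0.5, n_types)

print(n_modalities)
print(p_per_type)
```

With the default list there are three variable types and five modalities in total, so a user-supplied `rho` or `sigmavec` would presumably need three elements rather than five.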
Returns a list with the following components:
XList
- a list of matrices, each holding values of a single type: continuous, count, or binomial/binary;
Z
- a matrix, the fixed-dimensional covariate matrix with control variables;
Alist
- the offset vector for each modality;
B0list
- the true loading matrix for each modality;
mu0
- the true intercept vector for each modality;
U0
- the modality-specified factor vector;
F0
- the modality-shared factor matrix;
Uplist
- the true intercept-loading matrix for each modality;
beta
- the true regression coefficient vector for each modality;
sigma_eps
- the standard deviation of error term;
numvarmat
- a length(types)-by-d matrix, the number of variables in modalities that belong to the same type.
library(CMGFM)
n <- 300
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))
d <- 20
q <- 6
datlist <- gendata_cmgfm(n=n, pveclist=pveclist, q=q, d=d)
str(datlist)
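Because `XList` is documented as a list of matrices, a small helper can tabulate their dimensions more compactly than `str()`. The sketch below is an assumption-laden illustration: `summarize_matrix_list` is a hypothetical helper, not a CMGFM function, and it is demonstrated on a stand-in list with made-up dimensions so that it runs without the package installed.

```r
# Hypothetical helper (not part of CMGFM): tabulate the rows and columns
# of every matrix in a list such as datlist$XList.
summarize_matrix_list <- function(mats) {
  data.frame(
    name = if (is.null(names(mats))) seq_along(mats) else names(mats),
    nrow = vapply(mats, nrow, integer(1)),
    ncol = vapply(mats, ncol, integer(1))
  )
}

# Stand-in list with made-up dimensions, used only to exercise the helper.
demo <- list(a = matrix(0, 4, 3), b = matrix(0L, 4, 2))
print(summarize_matrix_list(demo))

# With the package installed and datlist created as in the example above,
# the same helper could be applied to the simulated data:
# summarize_matrix_list(datlist$XList)
```

Comparing the resulting column counts with `pveclist` is one way to check how the simulated variables are laid out across types and modalities.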