GEInfo {GEInfo} | R Documentation |
Fits the GEInfo model at fixed values of the tuning parameters. Linear, logistic, and Poisson regression are supported.
GEInfo(
E,
G,
Y,
family,
S_G,
S_GE,
kappa1,
kappa2,
lam1,
lam2,
tau,
xi = 6,
epsilon = 0,
max.it = 500,
thresh = 0.001,
Type_Y = NULL
)
E |
Observed matrix of E variables, of dimensions n x q. |
G |
Observed matrix of G variables, of dimensions n x p. |
Y |
Response variable, of length n. Quantitative for family="gaussian"; non-negative counts for family="poisson". For family="binomial", it should be a factor with two levels. |
family |
Model type: one of ("gaussian", "binomial", "poisson"). |
S_G |
A user-supplied vector giving the indices of the G variables that have prior information. |
S_GE |
A user-supplied two-column matrix giving the indices of the G-E interactions that have prior information. The first column holds the index of the G variable and the second column holds the index of the E variable. For example, S_GE = matrix(c(1, 2), ncol = 2) indicates that the 1st G variable and the 2nd E variable have an interaction effect on Y. |
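kappa1 |
User-supplied tuning parameter kappa1, used together with kappa2 for model estimation and variable selection. |
kappa2 |
User-supplied tuning parameter kappa2, used together with kappa1 for model estimation and variable selection. |
lam1 |
User-supplied tuning parameter lambda1, used together with lambda2 to calculate the prior-predicted response from S_G and S_GE. |
lam2 |
User-supplied tuning parameter lambda2, used together with lambda1 to calculate the prior-predicted response from S_G and S_GE. |
tau |
User-supplied tuning parameter tau, which balances the observed response Y against the prior-predicted response. |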
xi |
Tuning parameter of the MCP penalty. Default is 6. |
epsilon |
Tuning parameter of the ridge penalty, which shrinks the coefficients that have prior information. Default is 0. |
max.it |
Maximum number of iterations (total across entire path). Default is 500. |
thresh |
Convergence threshold for the group coordinate descent algorithm. The algorithm iterates until the change in each coefficient is less than thresh. Default is 1e-3. |
Type_Y |
A vector of Type_Y prior information, of the same length as Y. Default is NULL. For family="gaussian", Type_Y is continuous. For family="binomial", Type_Y is binary. For family="poisson", Type_Y is a count vector. If Type_Y is supplied, the function uses it to fit the GEInfo model. If Type_Y=NULL, the function instead uses the Type_S prior information, S_G and S_GE, to fit the GEInfo model. |
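The index conventions for S_G and S_GE can be sketched with base R alone. The particular indices below are made up for illustration and do not come from any real prior study.

```r
# Hypothetical prior information for p = 4 G variables and q = 2 E variables.
p <- 4; q <- 2

# Main effects of G1 and G3 are supported by prior information.
S_G <- c(1, 3)

# Two prior interactions: (G1, E2) and (G3, E1).
# Each row is one interaction: column 1 = G index, column 2 = E index.
S_GE <- rbind(c(1, 2),
              c(3, 1))

# Basic sanity checks before passing the priors to a fitting function.
stopifnot(all(S_G %in% seq_len(p)))
stopifnot(ncol(S_GE) == 2)
stopifnot(all(S_GE[, 1] %in% seq_len(p)), all(S_GE[, 2] %in% seq_len(q)))
```

Binding one row per interaction keeps the two-column shape even when only a single interaction is supplied.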
The function has five tuning parameters: kappa1, kappa2, lambda1 (argument lam1), lambda2 (argument lam2), and tau. kappa1 and kappa2 are used to estimate the model and select variables. lambda1 and lambda2 are used to calculate the prior-predicted response based on S_G and S_GE. tau balances the observed response Y against the prior-predicted response.
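To make the role of tau concrete, here is a minimal sketch for the Gaussian case. It assumes a prior-lasso-style weighting, in which the working response is a weighted average of Y and the prior-predicted response. The exact weighting used inside GEInfo is not stated on this page, so the formula and the names Y_prior, blend, and Y_work are illustrative assumptions only.

```r
set.seed(1)
n <- 5
Y       <- rnorm(n)  # observed response
Y_prior <- rnorm(n)  # stand-in for the prior-predicted response (made-up values)

# ASSUMPTION: the prior prediction enters with relative weight tau,
# giving the weighted average (Y + tau * Y_prior) / (1 + tau).
blend <- function(Y, Y_prior, tau) (Y + tau * Y_prior) / (1 + tau)

Y_work0 <- blend(Y, Y_prior, tau = 0)  # tau = 0 ignores the prior entirely
Y_work1 <- blend(Y, Y_prior, tau = 1)  # tau = 1 weights both sources equally
```

Under this assumed form, larger values of tau pull the working response toward the prior prediction.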
An object of class "GEInfo" is returned, which is a list containing the components of the fit.
a |
Coefficient vector of length q for E variables. |
b |
Coefficient vector of length (q+1)p for W (G variables and G-E interactions). |
beta |
Coefficient vector of length p for G variables. |
gamma |
Coefficient matrix of dimensions p x q for G-E interactions. |
alpha |
Intercept. |
coef |
A coefficient vector of length (q+1)*(p+1), including the estimates for |
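The lengths of b, beta, and gamma listed above are consistent with b interleaving each G main effect with its q interactions, which is also how the example on this page constructs b. A minimal base R sketch of that layout follows, with made-up coefficient values; the helpers pos_main and pos_inter are hypothetical names introduced here.

```r
p <- 3; q <- 2
beta  <- c(0.2, 0.5, 0)                               # main effects of G1..Gp
gamma <- matrix(c(0.8, 0.9,
                  0,   0,
                  0,   0.3), nrow = p, byrow = TRUE)  # p x q interactions

# For each G variable j, b holds: beta_j, gamma_j1, ..., gamma_jq.
b <- as.vector(t(cbind(beta, gamma)))
stopifnot(length(b) == (q + 1) * p)

# Position in b of the main effect of G_j and of the (G_j, E_k) interaction.
pos_main  <- function(j) (j - 1) * (q + 1) + 1
pos_inter <- function(j, k) (j - 1) * (q + 1) + 1 + k

# Recover beta and gamma from b.
B <- matrix(b, nrow = p, byrow = TRUE)  # row j = (beta_j, gamma_j1, ..., gamma_jq)
beta_back  <- B[, 1]
gamma_back <- B[, -1, drop = FALSE]
```

The same position arithmetic maps a row of S_GE, read as (j, k), to an entry of b under this layout.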
Wang X, Xu Y, and Ma S. (2019). Identifying gene-environment interactions incorporating prior information. Statistics in Medicine, 38(9): 1620-1633. doi: 10.1002/sim.8064
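library(GEInfo)
set.seed(1)  # for reproducibility
# Simulate n observations with q E variables and p G variables
n <- 30; p <- 4; q <- 2
E <- MASS::mvrnorm(n, rep(0, q), diag(q))
G <- MASS::mvrnorm(n, rep(0, p), diag(p))
W <- matW(E, G)  # G variables and G-E interactions
# True coefficients: intercept, E effects, G main effects, G-E interactions
alpha <- 0; a <- seq(0.4, 0.6, length = q)
beta <- c(seq(0.2, 0.5, length = 2), rep(0, p - 2))
vector.gamma <- c(0.8, 0.9, 0, 0)
gamma <- matrix(c(vector.gamma, rep(0, p * q - length(vector.gamma))), nrow = p, byrow = TRUE)
mat.b.gamma <- cbind(beta, gamma)
b <- as.vector(t(mat.b.gamma))  # coefficients of G and GE
Y <- alpha + E %*% a + W %*% b + rnorm(n, 0, 0.5)
# Prior information: main effect of G1 and the interaction between G1 and E1
S_G <- c(1)
S_GE <- cbind(c(1), c(1))
# Fit the model at fixed tuning parameters
fit3 <- GEInfo(E, G, Y, family = 'gaussian', S_G = S_G,
               S_GE = S_GE, kappa1 = 0.2, kappa2 = 0.2, lam1 = 0.2, lam2 = 0.2, tau = 0.5)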
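The example relies on matW(E, G) to build W. For the coefficient vector b built in the example to line up with W, the columns of W would have to be ordered G1, G1*E1, ..., G1*Eq, G2, and so on. The helper build_W below is hypothetical, written in base R only to illustrate that assumed ordering; it is not the package's implementation and may differ from matW in details such as scaling.

```r
# Hypothetical stand-in for a design matrix of G main effects and G-E interactions.
build_W <- function(E, G) {
  E <- as.matrix(E); G <- as.matrix(G)
  blocks <- lapply(seq_len(ncol(G)), function(j) {
    # Block for G_j: the main effect, then its product with every E variable.
    cbind(G[, j], G[, j] * E)
  })
  do.call(cbind, blocks)
}

set.seed(1)
n <- 6; p <- 3; q <- 2
E <- matrix(rnorm(n * q), n, q)
G <- matrix(rnorm(n * p), n, p)
W <- build_W(E, G)
dim(W)  # n rows and (q + 1) * p columns
```

With this ordering, W %*% b adds each G main effect and each G-E interaction weighted by the matching entry of b.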