inv.prior.cov {BFI}    R Documentation

Creates an inverse covariance matrix for a Gaussian prior

Description

inv.prior.cov builds a diagonal inverse covariance matrix for the Gaussian prior distribution based on the design matrix of covariates. It takes into account the additional regression parameters that arise from categorical covariates. For a linear model, the matrix also includes a row and column for the variance of the measurement errors.

Usage

inv.prior.cov(X, lambda = 1, L = 2, family = gaussian, intercept = TRUE,
              stratified = FALSE, strat_par = NULL, center_spec = NULL)

Arguments

X

design matrix of dimension n × p, where n is the number of samples observed and p is the number of predictors/variables, excluding the intercept.

lambda

the vector used as the diagonal of the (inverse covariance) matrix that will be created by inv.prior.cov(). The length of the vector depends on the number of columns of X, the type of the covariates (continuous/dichotomous or categorical), family, whether an intercept is included in the model, and whether a stratified analysis is desired. When stratified = FALSE, lambda can be a single positive number (if all values in the vector are equal), a vector of two elements (the first is used for the regression parameters including “intercept” and the second for “sigma2”), or a vector of length equal to the number of model parameters. The required length of lambda is different when stratified = TRUE; see ‘Details’ for more information. Default is lambda = 1.

L

the number of centers. This argument is used only when stratified = TRUE. Default is L = 2. See ‘Details’ and ‘Examples’.

family

a description of the error distribution and link function used to specify the model. This can be a character string naming a family function or the result of a call to a family function (see family for details). In the current version, the family can be gaussian (with identity link function) or binomial (with logit link function). By default the gaussian family is used. In case of a linear regression model, family = gaussian, there is an extra model parameter for the variance of the measurement error.

intercept

logical flag for having an intercept. By changing the intercept the dimension of the inverse covariance matrix changes. If intercept = TRUE (the default), the output matrix created by inv.prior.cov() has one row and one column related to intercept, while if intercept = FALSE, the resulting matrix does not have the row and column called intercept.

stratified

logical flag for performing the stratified analysis. If stratified = TRUE, the parameter(s) selected in the strat_par argument are allowed to be different across centers. This argument should only be used when designing the inverse covariance matrix for the (fictive) combined data, i.e., the last matrix for the Lambda argument in bfi(). If inv.prior.cov() is used for the analysis in the local centers (to build the first L matrices for the Lambda argument in bfi()), this argument should be FALSE, even if the BFI analysis is stratified. Default is stratified = FALSE. See ‘Details’ and ‘Examples’.

strat_par

a one- or two-element integer vector indicating the stratification parameter(s). The value 1 indicates that the “intercept” is allowed to vary across centers, and the value 2 indicates that “sigma2” is allowed to vary. This argument is used only when stratified = TRUE. Default is strat_par = NULL, but if stratified = TRUE and center_spec is not supplied, strat_par cannot be NULL. For the binomial family the vector should have length one and its value should be 1, which refers to the “intercept”. For gaussian this vector can be 1 for the “intercept” only, 2 for “sigma2” only, or c(1, 2) for both “intercept” and “sigma2”. See ‘Examples’.

center_spec

a vector of L elements representing the center-specific variable. This argument is used only when stratified = TRUE and strat_par = NULL. Each element represents a specific feature of the corresponding center, and there must be exactly one value or attribute for each center. The vector can be a numeric, character or factor vector, but its values are treated as categories. Note that the order of the centers in center_spec must be the same as in the list given to the argument theta_hats in the function bfi(). Default is center_spec = NULL. See also ‘Details’ and ‘Examples’.

Details

inv.prior.cov creates a diagonal matrix with the vector lambda as its diagonal. The argument stratified = TRUE should only be used to construct a matrix for the prior density in case of stratification in the fictive combined data. It should never be used to construct the matrix for the analysis in the centers.

When stratified = FALSE, the length of the vector lambda depends on the covariate matrix X, family, and whether an “intercept” is included in the model. For example, if the design matrix X has p columns with continuous or dichotomous covariates, family = gaussian, and intercept = TRUE, then lambda should have p+2 elements. If, in addition, X contains a categorical covariate with q>2 categories, the length of lambda increases by q-2, because that covariate contributes q-1 regression parameters instead of one. All values of lambda should be non-negative, as they represent the inverse of the variance of the Gaussian prior. If all values in the vector lambda are equal, a single value is sufficient: when lambda is a scalar, inv.prior.cov sets every diagonal element equal to lambda. In the linear regression model, the last element of lambda is assumed to be the inverse of the variance of the prior distribution for the measurement error. If lambda has two elements, the first value is used for the prior of the regression parameters and the second for the inverse of the variance of the prior distribution for the measurement error.
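
The counting rule above can be sketched in base R. This is only an illustration under stated assumptions, not the internal code of inv.prior.cov: the helpers n_params() and expand_lambda() are hypothetical names introduced here, and the sketch covers the non-stratified case only.

```r
# Hypothetical helper (not part of BFI): number of model parameters implied by
# a data frame of covariates, following the rule described in the text.
n_params <- function(X, family = c("gaussian", "binomial"), intercept = TRUE) {
  family <- match.arg(family)
  # model.matrix() expands a factor with q levels into q - 1 dummy columns and
  # adds an "(Intercept)" column; that column is subtracted before counting.
  p_expanded <- ncol(model.matrix(~ ., data = X)) - 1
  # one extra parameter for the intercept, one for sigma2 in the gaussian case
  p_expanded + as.integer(intercept) + as.integer(family == "gaussian")
}

# Hypothetical helper: turn a scalar, two-element, or full-length lambda into
# a diagonal matrix (non-stratified case only).
expand_lambda <- function(lambda, n_par, family = c("gaussian", "binomial")) {
  family <- match.arg(family)
  stopifnot(is.numeric(lambda), all(lambda >= 0))
  if (length(lambda) == 1) {
    lambda <- rep(lambda, n_par)                       # same value everywhere
  } else if (length(lambda) == 2 && family == "gaussian") {
    lambda <- c(rep(lambda[1], n_par - 1), lambda[2])  # last entry: sigma2
  }
  stopifnot(length(lambda) == n_par)
  diag(lambda, nrow = n_par)
}

set.seed(1)
X <- data.frame(x1 = rnorm(50),
                x2 = factor(sample(0:2, 50, replace = TRUE)),  # q = 3 categories
                x3 = factor(sample(0:1, 50, replace = TRUE)))  # dichotomous
k <- n_params(X, family = "gaussian", intercept = TRUE)
Lambda <- expand_lambda(c(0.05, 1), k, family = "gaussian")
dim(Lambda)
```

Here X has p = 3 columns and one categorical covariate with q = 3 categories, so the count is p + 2 + (q - 2) = 6, in line with the 6-by-6 matrix reported in ‘Examples’.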

If stratified = TRUE the length of the vector lambda should be equal to the number of parameters in the combined model.

For the binomial family, a stratified analysis is not possible when intercept = FALSE; in that case stratified cannot be TRUE.

If stratified = FALSE, both strat_par and center_spec must be NULL (the defaults), while if stratified = TRUE exactly one of the two must be NULL.
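
The argument rules above can be summarized as a small validity check. The function check_strat_args() below is a hypothetical base R sketch written for this illustration; it is not part of BFI and only encodes the constraints stated in this help page.

```r
# Hypothetical checker (not BFI code): encodes the documented constraints on
# stratified, strat_par, center_spec, family and intercept.
check_strat_args <- function(family = c("gaussian", "binomial"),
                             intercept = TRUE, stratified = FALSE,
                             strat_par = NULL, center_spec = NULL, L = 2) {
  family <- match.arg(family)
  if (!stratified) {
    if (!is.null(strat_par) || !is.null(center_spec))
      stop("strat_par and center_spec must be NULL when stratified = FALSE")
    return(invisible(TRUE))
  }
  # stratified = TRUE: exactly one of strat_par / center_spec must be given
  if (is.null(strat_par) == is.null(center_spec))
    stop("supply exactly one of strat_par and center_spec")
  if (family == "binomial" && !intercept)
    stop("binomial family without intercept cannot be stratified")
  if (!is.null(strat_par)) {
    # 1 = intercept, 2 = sigma2 (sigma2 exists only for the gaussian family)
    allowed <- if (family == "binomial") 1 else c(1, 2)
    if (!all(strat_par %in% allowed) || anyDuplicated(strat_par) > 0)
      stop("invalid strat_par for this family")
  } else if (length(center_spec) != L) {
    stop("center_spec must have one element per center (length L)")
  }
  invisible(TRUE)
}

# A valid combination: binomial, intercept varies across 3 centers
check_strat_args("binomial", stratified = TRUE, strat_par = 1, L = 3)

# An invalid combination: sigma2 does not exist for the binomial family
msg <- tryCatch(check_strat_args("binomial", stratified = TRUE, strat_par = 2),
                error = function(e) conditionMessage(e))
msg
```

Running such a check before calling inv.prior.cov() is optional; it only makes the constraints explicit in one place.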

The output of inv.prior.cov() can be used in the main functions MAP.estimation() and bfi().

Value

inv.prior.cov returns a diagonal matrix. The dimension of the matrix depends on the number of columns of X, type of the covariates (continuous/dichotomous or categorical), family, and intercept.

Author(s)

Hassan Pazira
Maintainer: Hassan Pazira hassan.pazira@radboudumc.nl

References

Jonker M.A., Pazira H. and Coolen A.C.C. (2024). Bayesian federated inference for estimating statistical models based on non-shared multicenter data sets, Statistics in Medicine, 1-18. <https://doi.org/10.1002/sim.10072>

See Also

MAP.estimation

Examples

#----------------
# Data Simulation
#----------------
X <- data.frame(x1=rnorm(50),                     # standard normal variable
                x2=sample(0:2, 50, replace=TRUE), # categorical variable
                x3=sample(0:1, 50, replace=TRUE)) # dichotomous variable
X$x2 <- as.factor(X$x2)
X$x3 <- as.factor(X$x3)

#---------------------
# Load the BFI package
#---------------------
library(BFI)

# The same inverse variance value (lambda=0.05) is assumed for the
# Gaussian prior of all parameters (non-stratified case)

#-------------------------------------------------
# Inverse Covariance Matrix for the Gaussian prior
#-------------------------------------------------
# y ~ Binomial with 'intercept'
inv.prior.cov(X, lambda=0.05, family=binomial) # returns a 5-by-5 matrix

# y ~ Binomial without 'intercept'
inv.prior.cov(X, lambda=0.05, family="binomial", intercept = FALSE) # a 4-by-4 matrix

# y ~ Gaussian with 'intercept'
inv.prior.cov(X, lambda=0.05, family=gaussian) # returns a 6-by-6 matrix

#--------------------
# Stratified analysis
#--------------------
# y ~ Binomial when 'intercept' varies across 3 centers:
inv.prior.cov(X, lambda=c(.2, 1), family=binomial, stratified=TRUE, strat_par = 1, L = 3)

# y ~ Gaussian when 'intercept' and 'sigma2' vary across 2 centers
inv.prior.cov(X, lambda=c(1, 2, 3), family=gaussian, stratified=TRUE, strat_par = c(1, 2))

# y ~ Gaussian when 'sigma2' varies across 2 centers (with 'intercept')
inv.prior.cov(X, lambda=c(1, 2, 3), family=gaussian, stratified=TRUE, strat_par = 2)

# y ~ Gaussian when 'sigma2' varies across 2 centers (without 'intercept')
inv.prior.cov(X, lambda=c(2, 3), family=gaussian, intercept = FALSE, stratified=TRUE,
              strat_par = 2)

#--------------------------
# Center specific covariate
#--------------------------
# center specific covariate has K=2 categories across 4 centers; y ~ Binomial
inv.prior.cov(X, lambda=c(0.1, 1.1), family=binomial, stratified=TRUE,
              center_spec = c("Iran","Netherlands","Netherlands","Iran"), L=4)

# center specific covariate has K=3 categories across 5 centers; y ~ Gaussian
inv.prior.cov(X, lambda=c(0.5, 1.5, 2.5), family=gaussian, stratified=TRUE,
              center_spec = c("Medium","Big","Small","Big","Small"), L=5)

# center specific covariate has K=4 categories across 5 centers; y ~ Gaussian
inv.prior.cov(X, lambda=1, family=gaussian, stratified=TRUE, center_spec = c(3,1:4), L=5)


[Package BFI version 1.1.4 Index]