BMTrees_prediction {SBMTrees} | R Documentation |
Bayesian Trees Mixed-Effects Models for Predicting Longitudinal Outcomes
Description
Provides predictions for outcomes in longitudinal data using Bayesian Trees Mixed-Effects Models (BMTrees) and its semiparametric variants. The function predicts values for test data while accounting for random effects, complex relationships, and potential model misspecification.
Usage
BMTrees_prediction(
X_train,
Y_train,
Z_train,
subject_id_train,
X_test,
Z_test,
subject_id_test,
model = c("BMTrees", "BMTrees_R", "BMTrees_RE", "mixedBART"),
binary = FALSE,
nburn = 3000L,
npost = 4000L,
skip = 1L,
verbose = TRUE,
seed = NULL,
tol = 1e-20,
resample = 5,
ntrees = 200,
pi_CDP = 0.99
)
Arguments
X_train |
A matrix of covariates in the training set. |
Y_train |
A numeric or logical vector of outcomes in the training set. |
Z_train |
A matrix of random predictors in the training set. |
subject_id_train |
A character vector of subject IDs in the training set. |
X_test |
A matrix of covariates in the testing set. |
Z_test |
A matrix of random predictors in the testing set. |
subject_id_test |
A character vector of subject IDs in the testing set. |
model |
A character string specifying the predictive model. Options are "BMTrees", "BMTrees_R", "BMTrees_RE", and "mixedBART". |
binary |
Logical. Indicates whether the outcome is binary (TRUE) or continuous (FALSE). Default: FALSE. |
nburn |
An integer specifying the number of burn-in iterations for the Gibbs sampler. Default: 3000L. |
npost |
An integer specifying the number of posterior samples to collect. Default: 4000L. |
skip |
An integer indicating the thinning interval for MCMC samples. Default: 1L. |
verbose |
Logical. If TRUE, progress information is printed while the sampler runs. Default: TRUE. |
seed |
An optional integer for setting the random seed to ensure reproducibility. Default: NULL. |
tol |
A numeric tolerance value to prevent numerical overflow and underflow in the model. Default: 1e-20. |
resample |
An integer specifying the number of resampling steps for the CDP prior. Default: 5. |
ntrees |
An integer specifying the number of trees in BART. Default: 200. |
pi_CDP |
A value between 0 and 1 for calculating the empirical prior in the CDP prior. Default: 0.99. |
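The arguments above expect one row (or element) per observation, with the training and testing sets passed separately. The following is a minimal base R sketch of how such inputs might be assembled from toy longitudinal data. All variable names, the random-intercept-plus-slope layout of Z, and the choice to hold out each subject's last visit are illustrative assumptions, not requirements stated by the package.

```r
# Toy longitudinal data: 6 subjects with 4 visits each (illustrative only).
set.seed(1)
n_subject <- 6
n_visit <- 4
subject_id <- rep(paste0("S", seq_len(n_subject)), each = n_visit)
time <- rep(seq_len(n_visit), times = n_subject)

# Covariate matrix: one row per observation.
X <- cbind(x1 = rnorm(n_subject * n_visit), time = time)

# Random predictors: an intercept and time column (an assumed layout).
Z <- cbind(intercept = 1, time = time)

# Numeric outcome with a subject-specific shift plus noise.
b <- rnorm(n_subject, sd = 0.5)
Y <- 1 + 0.8 * X[, "x1"] + 0.3 * time +
  b[rep(seq_len(n_subject), each = n_visit)] +
  rnorm(n_subject * n_visit, sd = 0.2)

# Hold out each subject's last visit as the testing set (an assumed split).
is_test <- time == n_visit
X_train <- X[!is_test, , drop = FALSE]
Y_train <- Y[!is_test]
Z_train <- Z[!is_test, , drop = FALSE]
subject_id_train <- subject_id[!is_test]
X_test <- X[is_test, , drop = FALSE]
Z_test <- Z[is_test, , drop = FALSE]
subject_id_test <- subject_id[is_test]

# These objects follow the argument order in Usage, for example:
# fit <- BMTrees_prediction(X_train, Y_train, Z_train, subject_id_train,
#                           X_test, Z_test, subject_id_test, model = "BMTrees")
```

Using drop = FALSE keeps X and Z as matrices even if a split leaves a single row, which matches the matrix inputs described above.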
Value
A list containing posterior samples and predictions:
- post_tree_train
Posterior samples of the fixed-effects from BART on training data.
- post_Sigma
Posterior samples of covariance matrices in random effects.
- post_lambda_F
Posterior samples of lambda parameter in CDP normal mixture on random errors.
- post_lambda_G
Posterior samples of lambda parameter in CDP normal mixture on random-effects.
- post_B
Posterior samples of the coefficients in random effects.
- post_random_effect_train
Posterior samples of random effects for training data.
- post_sigma
Posterior samples of the error standard deviation.
- post_expectation_y_train
Posterior expectations of training data outcomes, equal to fixed-effects + random effects.
- post_expectation_y_test
Posterior expectations of testing data outcomes, equal to fixed-effects + random effects.
- post_predictive_y_train
Posterior predictive distributions for training outcomes, equal to fixed-effects + random effects + predictive residual.
- post_predictive_y_test
Posterior predictive distributions for testing outcomes, equal to fixed-effects + random effects + predictive residual.
- post_eta
Posterior samples of location parameters in CDP normal mixture on random errors.
- post_mu
Posterior samples of location parameters in CDP normal mixture on random effects.
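The difference between the posterior expectation and posterior predictive elements can be illustrated with stand-in draws. The sketch below does not call the package. It assumes, without confirmation from this page, that these elements are matrices with one row per posterior draw and one column per observation, and it uses a Gaussian residual purely for simplicity. The helper summarise_draws and all simulated objects are hypothetical.

```r
set.seed(42)
n_draw <- 1000
n_obs <- 5
true_mean <- c(-1, 0, 0.5, 2, 3)

# Stand-in for post_expectation_y_test: fixed effects + random effects.
# Assumed layout: rows are posterior draws, columns are observations.
post_expectation <- matrix(
  rnorm(n_draw * n_obs, mean = rep(true_mean, each = n_draw), sd = 0.1),
  nrow = n_draw, ncol = n_obs
)

# Stand-in for post_sigma: one error standard deviation per draw.
post_sigma <- abs(rnorm(n_draw, mean = 0.5, sd = 0.02))

# Stand-in for post_predictive_y_test: expectation + predictive residual.
# The residual is Gaussian here for simplicity; sd recycles down the rows,
# so every draw uses its own sigma.
post_predictive <- post_expectation + rnorm(n_draw * n_obs, sd = post_sigma)

# Hypothetical helper: posterior mean and equal-tailed interval per column.
summarise_draws <- function(draws, level = 0.95) {
  alpha <- (1 - level) / 2
  data.frame(
    mean = colMeans(draws),
    lower = apply(draws, 2, quantile, probs = alpha),
    upper = apply(draws, 2, quantile, probs = 1 - alpha)
  )
}

expectation_summary <- summarise_draws(post_expectation)
predictive_summary <- summarise_draws(post_predictive)
```

Under these assumptions the predictive intervals are wider than the expectation intervals, because they also carry the residual variation.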
Note
This function utilizes modified C++ code originally derived from the BART3 package (Bayesian Additive Regression Trees). The original package was developed by Rodney Sparapani and is licensed under GPL-2. Modifications were made by Jungang Zou, 2024.
References
For more information about the original BART3 package, see: https://github.com/rsparapa/bnptools/tree/master/BART3
Examples
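library(SBMTrees)

# Simulate longitudinal data with training and testing sets
data = simulation_prediction(n_subject = 800, seed = 1234, nonlinear = TRUE,
                             nonrandeff = TRUE, nonresidual = TRUE)

# Fit BMTrees on the training set and predict for the testing set
model = BMTrees_prediction(data$X_train, data$Y_train, data$Z_train,
                           data$subject_id_train, data$X_test, data$Z_test,
                           data$subject_id_test, model = "BMTrees",
                           binary = FALSE, nburn = 3000L, npost = 4000L,
                           skip = 1L, verbose = TRUE, seed = 1234)

# Posterior predictive draws for the testing outcomes and error deviation draws
model$post_predictive_y_test
model$post_sigma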
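Once predictive draws such as model$post_predictive_y_test are available, a natural next step is to compare them with held-out outcomes. The sketch below uses simulated stand-ins rather than package output. It assumes the draws form a matrix with rows as posterior draws and columns as test observations, which this page does not confirm. The helper evaluate_predictions and the objects mu, y_true, and draws are hypothetical.

```r
# Hypothetical helper: mean absolute error of the posterior mean and
# empirical coverage of equal-tailed predictive intervals.
evaluate_predictions <- function(draws, y_true, level = 0.95) {
  stopifnot(is.matrix(draws), ncol(draws) == length(y_true))
  alpha <- (1 - level) / 2
  point <- colMeans(draws)
  lower <- apply(draws, 2, quantile, probs = alpha)
  upper <- apply(draws, 2, quantile, probs = 1 - alpha)
  c(
    mae = mean(abs(point - y_true)),
    coverage = mean(y_true >= lower & y_true <= upper)
  )
}

# Stand-in data: 50 test observations and 2000 predictive draws each.
set.seed(7)
n_obs <- 50
n_draw <- 2000
mu <- rnorm(n_obs)                   # true conditional means
y_true <- mu + rnorm(n_obs, sd = 1)  # observed held-out outcomes
draws <- matrix(
  rnorm(n_draw * n_obs, mean = rep(mu, each = n_draw), sd = 1),
  nrow = n_draw, ncol = n_obs
)

result <- evaluate_predictions(draws, y_true)
print(result)
```

With well-calibrated draws the coverage should sit near the nominal level. A value far below it would suggest the predictive intervals are too narrow.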