cross_validation {catalytic} | R Documentation |
Perform Cross-Validation for Model Estimation
Description
This function performs cross-validation for estimating risk over a sequence
of tuning parameters (tau_seq) by fitting a Generalized Linear Model (GLM) to the data.
It evaluates model performance by splitting the dataset into multiple folds, training
the model on a subset of the data, and testing it on the remaining portion.
Usage
cross_validation(
formula,
cat_init,
tau_seq,
discrepancy_method,
cross_validation_fold_num,
...
)
Arguments
formula |
A formula specifying the GLM. It must include at least the response variable. |
cat_init |
A list generated from |
tau_seq |
A sequence of tuning parameter values ( |
discrepancy_method |
A function used to calculate the discrepancy (error) between model predictions and actual values. |
cross_validation_fold_num |
The number of folds to use in cross-validation. The dataset will be randomly split into this number of subsets, and the model will be trained and tested on different combinations of these subsets. |
... |
Other arguments passed to other internal functions. |
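To make the role of discrepancy_method concrete, here is a minimal sketch of two discrepancy functions in base R. The function names and the (observed, predicted) argument order are assumptions made for illustration; this page does not state the signature that cross_validation expects.

```r
# Hypothetical discrepancy functions; the (y, y_hat) signature is an
# assumption, not the documented interface of the catalytic package.

# Mean squared error, suited to a continuous response.
mse_discrepancy <- function(y, y_hat) {
  mean((y - y_hat)^2)
}

# Negative mean log-likelihood for a binary (0/1) response. Predicted
# probabilities are clipped away from 0 and 1 so that log() stays finite.
log_discrepancy <- function(y, p_hat, eps = 1e-12) {
  p_hat <- pmin(pmax(p_hat, eps), 1 - eps)
  -mean(y * log(p_hat) + (1 - y) * log(1 - p_hat))
}
```

Either function returns a single non-negative number, with smaller values indicating predictions closer to the observed responses.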
Details
- Randomization of the Data: The data is randomly shuffled into cross_validation_fold_num subsets to ensure that the model is evaluated across different splits of the dataset.
- Model Training and Prediction: For each fold, a training set is used to fit a GLM with varying values of tau (from tau_seq), and the model is evaluated on a test set. The training data consists of both the observed and synthetic data, with synthetic data weighted by tau.
- Risk Estimation: After fitting the model, the discrepancy_method is used to calculate the prediction error for each combination of fold and tau. These errors are accumulated for each tau.
- Average Risk Estimate: After completing all folds, the accumulated prediction errors are averaged over the number of folds to provide a final risk estimate for each value of tau.
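The four steps above can be sketched in base R as follows. This is an illustrative outline, not the package's implementation: cv_risk_sketch, its arguments, and the choice to spread the total weight tau evenly over the synthetic rows (tau divided by the number of synthetic rows) are all assumptions.

```r
# Hypothetical helper, not part of the catalytic package: a base-R sketch of
# K-fold cross-validation over a sequence of tau values.
cv_risk_sketch <- function(formula, obs_data, syn_data, tau_seq,
                           discrepancy, fold_num = 5, family = gaussian()) {
  # Let glm() find the local weight vector `w` when it builds the model frame.
  environment(formula) <- environment()

  n <- nrow(obs_data)
  m <- nrow(syn_data)
  response <- all.vars(formula)[1]

  # Step 1: randomly assign each observed row to one of fold_num folds.
  fold_id <- sample(rep(seq_len(fold_num), length.out = n))

  risk_sum <- numeric(length(tau_seq))
  for (k in seq_len(fold_num)) {
    train <- obs_data[fold_id != k, , drop = FALSE]
    test  <- obs_data[fold_id == k, , drop = FALSE]
    fit_data <- rbind(train, syn_data)

    for (j in seq_along(tau_seq)) {
      # Step 2: observed rows get weight 1; synthetic rows share weight tau.
      # Assumption: tau is split evenly across the m synthetic rows.
      w <- c(rep(1, nrow(train)), rep(tau_seq[j] / m, m))
      fit <- glm(formula, family = family, data = fit_data, weights = w)

      # Step 3: accumulate the held-out prediction error for this tau.
      pred <- predict(fit, newdata = test, type = "response")
      risk_sum[j] <- risk_sum[j] + discrepancy(test[[response]], pred)
    }
  }

  # Step 4: average the accumulated errors over the folds.
  risk_sum / fold_num
}
```

The observed and synthetic data frames must share the same columns. With a non-Gaussian family such as binomial, fractional weights make glm() emit a warning about non-integer counts, which is expected in this kind of weighted fit.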
Value
A numeric vector of averaged risk estimates, one for each value of tau in tau_seq.
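Because the returned vector is aligned with tau_seq, a natural follow-up is to pick the tau with the smallest estimated risk. The snippet below is a small sketch with made-up numbers standing in for a real result; risk_estimates and best_tau are illustrative names.

```r
# Illustrative values only: risk_estimates stands in for the vector that
# cross-validation would return for this tau_seq.
tau_seq <- c(0.5, 1, 2, 4)
risk_estimates <- c(1.30, 1.12, 1.18, 1.41)

# The i-th risk estimate belongs to the i-th tau, so which.min() indexes both.
best_tau <- tau_seq[which.min(risk_estimates)]
```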