lmps {CIpostSelect} | R Documentation |
Function that handles storing our estimation and variable selection matrices during the different splits.
Description
Stores the variable selection and coefficient estimation matrices produced over the repeated data splits.
Usage
lmps(
formula,
data,
method,
N,
p_split = 0.5,
cores = NULL,
direction = "backward",
forced_var = NULL
)
Arguments
formula |
Regression model to use, specified as a formula. |
data |
Data set to be used for regression modeling. |
method |
Method for variable selection. Should be one of |
N |
Number of splits. |
p_split |
Probability used to divide each shuffled data set into its two parts (model selection and calibration). Defaults to 0.5. |
cores |
Number of cores for parallel processing. |
direction |
It can take two values: |
forced_var |
Character string naming a predictor variable to force into the selection. Defaults to NULL, meaning no variable is forced. If provided, this variable is selected in every one of the N splits. |
Details
The data are shuffled and split N times. At each split, the shuffled data are divided into two parts according to the splitting probability p_split. Model selection is performed on the first part, and the selected model is then calibrated on the second part. The result is a pair of matrices of dimensions N*p: one records the model selected at each split, the other holds the estimated coefficients associated with that model.
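The procedure described above can be sketched in base R. This is an illustration only, not the CIpostSelect source code: the helper name split_select_calibrate, the use of backward stepwise selection by AIC (stats::step) as a stand-in for the package's selection methods, the mtcars data, and the choice to store NA for coefficients of unselected variables are all assumptions made for the example.

```r
# Illustrative sketch only; this is not the CIpostSelect source code.
# split_select_calibrate() is a hypothetical helper written for this example.
split_select_calibrate <- function(formula, data, N, p_split = 0.5) {
  # Predictor names fix the column layout (assumes numeric predictors only)
  vars <- attr(terms(formula, data = data), "term.labels")
  selected  <- matrix(0L, N, length(vars), dimnames = list(NULL, vars))
  estimates <- matrix(NA_real_, N, length(vars), dimnames = list(NULL, vars))

  n <- nrow(data)
  n_sel <- floor(p_split * n)
  for (i in seq_len(N)) {
    idx <- sample(n)  # shuffle the row indices
    sel_part <- data[idx[seq_len(n_sel)], , drop = FALSE]
    cal_part <- data[idx[-seq_len(n_sel)], , drop = FALSE]

    # Selection on the first part (stand-in: backward stepwise by AIC)
    full <- lm(formula, data = sel_part)
    sel_fit <- step(full, direction = "backward", trace = 0)
    kept <- attr(terms(sel_fit), "term.labels")

    # Calibration: refit the selected model on the second part
    cal_fit <- lm(formula(sel_fit), data = cal_part)
    if (length(kept) > 0) {
      selected[i, kept]  <- 1L
      estimates[i, kept] <- coef(cal_fit)[kept]
    }
  }
  list(selected = selected, estimates = estimates)
}

set.seed(1)
res <- split_select_calibrate(mpg ~ wt + hp + qsec + drat, mtcars, N = 5)
dim(res$selected)  # N rows, one column per candidate predictor
```

Each row of the two matrices corresponds to one split, so column means of the selection matrix give a rough selection frequency per predictor. The object returned by lmps may be organized differently.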
Value
An object of class lmps.
Examples
library(CIpostSelect)
library(mlbench)
data("BostonHousing")
# Build an lmps object using Lasso selection over 50 splits
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50)
# The same call, parallelized over 2 cores
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50, cores = 2)