lmps {CIpostSelect} | R Documentation |
Stores the estimation and variable selection matrices obtained across the different data splits.
lmps(
formula,
data,
method,
N,
p_split = 0.5,
cores = NULL,
direction = "backward",
forced_var = NULL
)
formula |
Regression model to use, specified as a formula. |
data |
Data set to be used for regression modeling. |
method |
Method for variable selection. Should be one of |
N |
Number of splits. |
p_split |
Probability used to divide each shuffled data set into two parts. Defaults to 0.5. |
cores |
Number of cores for parallel processing. |
direction |
It can take two values: |
forced_var |
A character string specifying a predictor variable to be forced into selection. By default, it is NULL, allowing for no forced selection. If provided, this variable will be consistently selected during the N splits. |
The data are shuffled and split N times. In each split, the shuffled data are divided into two parts according to the splitting probability p_split. Model selection is performed on the first part, and the selected model is then calibrated (its coefficients are estimated) on the second part. The result is a pair of N × p matrices: one records the selected models and the other holds the estimated coefficients associated with those models.
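The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it uses the built-in mtcars data, substitutes backward stepwise selection via stats::step() for the package's selection methods, and assumes that p_split is the share of rows assigned to the selection part. The names dat, selected and coefs are hypothetical.

```r
# Illustrative sketch only; not the CIpostSelect implementation.
set.seed(1)

# Small built-in data set with continuous predictors (assumption for the demo)
dat <- mtcars[, c("mpg", "wt", "hp", "disp", "qsec", "drat")]
predictors <- setdiff(names(dat), "mpg")

N <- 20          # number of splits
p_split <- 0.5   # assumed: share of rows used for the selection part

# N x p matrices: selection indicators and refitted coefficients
selected <- matrix(0L, nrow = N, ncol = length(predictors),
                   dimnames = list(NULL, predictors))
coefs <- matrix(NA_real_, nrow = N, ncol = length(predictors),
                dimnames = list(NULL, predictors))

for (i in seq_len(N)) {
  # Shuffle the rows, then divide them into two parts
  idx <- sample(nrow(dat))
  n_sel <- floor(p_split * nrow(dat))
  sel_data <- dat[idx[seq_len(n_sel)], ]
  est_data <- dat[idx[-seq_len(n_sel)], ]

  # Selection step on the first part (backward stepwise by AIC as a stand-in)
  full <- lm(mpg ~ ., data = sel_data)
  chosen <- step(full, direction = "backward", trace = 0)
  vars <- attr(terms(chosen), "term.labels")

  # Calibration step: refit the selected model on the second part
  if (length(vars) > 0) {
    refit <- lm(reformulate(vars, response = "mpg"), data = est_data)
    selected[i, vars] <- 1L
    coefs[i, vars] <- coef(refit)[vars]
  }
}

# Share of the N splits in which each predictor was selected
colMeans(selected)
```

Each row of selected and coefs corresponds to one split; a coefficient stays NA when the predictor was not selected in that split.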
An object of class lmps.
library(CIpostSelect)
library(mlbench)
data("BostonHousing")
# Fit an lmps object using 50 splits
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50)
# A parallelized example: the same call run on 2 cores
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50, cores = 2)