lmps {CIpostSelect}    R Documentation

Function that handles storing our estimation and variable selection matrices during the different splits.

Description

Repeatedly shuffles and splits the data, performs variable selection on one part and coefficient estimation on the other, and stores the resulting selection and estimation matrices across the splits.

Usage

lmps(
  formula,
  data,
  method,
  N,
  p_split = 0.5,
  cores = NULL,
  direction = "backward",
  forced_var = NULL
)

Arguments

formula

Regression model to use, specified as a formula.

data

Data set to be used for regression modeling.

method

Method for variable selection. Should be one of "Lasso" or "BIC".

N

Number of splits.

p_split

Probability (proportion) used to divide the data into two parts at each split. Defaults to 0.5.

cores

Number of cores for parallel processing.

direction

Direction of the stepwise selection, used when method is "BIC". One of "backward" (the default) or "forward".

forced_var

A character string naming a predictor variable to be forced into the selection. Defaults to NULL, meaning no variable is forced. If provided, this variable is selected in every one of the N splits.

Details

The data are shuffled and split N times. At each split, the shuffled data are divided into two parts according to the splitting probability p_split. Variable selection is performed on the first part, and the coefficients of the selected model are then estimated (calibrated) on the second part. The result is a pair of matrices of dimension N x p, with one row per split, that record the selected models and the estimated coefficients associated with these models.
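The procedure above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it uses the built-in mtcars data, BIC-based backward selection through stats::step(), and it assumes that p_split is the share of rows given to the selection part. The names dat, preds, selected and coefs are hypothetical.

```r
# Illustrative sketch only: base R with BIC-based backward selection.
# This is NOT the CIpostSelect implementation.
set.seed(1)

preds   <- c("wt", "hp", "disp", "qsec", "drat")  # candidate predictors
dat     <- mtcars[, c("mpg", preds)]              # built-in data, response: mpg
N       <- 20                                     # number of random splits
p_split <- 0.5                                    # assumed: share of rows used for selection

# N x p matrices: one row per split, one column per candidate predictor
selected <- matrix(0L, nrow = N, ncol = length(preds),
                   dimnames = list(NULL, preds))
coefs    <- matrix(NA_real_, nrow = N, ncol = length(preds),
                   dimnames = list(NULL, preds))

for (i in seq_len(N)) {
  # Shuffle the rows, then cut them into a selection part and an estimation part
  idx     <- sample(nrow(dat))
  n_sel   <- floor(p_split * nrow(dat))
  sel_dat <- dat[idx[seq_len(n_sel)], ]
  est_dat <- dat[idx[-seq_len(n_sel)], ]

  # Step 1: variable selection on the first part (BIC penalty: k = log(n))
  full   <- lm(mpg ~ ., data = sel_dat)
  chosen <- step(full, direction = "backward",
                 k = log(nrow(sel_dat)), trace = 0)
  vars   <- attr(terms(chosen), "term.labels")

  # Step 2: re-estimate the selected model on the second part
  if (length(vars) > 0) {
    fit <- lm(reformulate(vars, response = "mpg"), data = est_dat)
    selected[i, vars] <- 1L
    coefs[i, vars]    <- coef(fit)[vars]
  }
}

# Selection frequency of each predictor across the N splits
colMeans(selected)
```

Entries of coefs stay NA for predictors that were not selected in a given split, so each row of the two matrices describes one split.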

Value

An object of class lmps.

Examples


library(mlbench)
data("BostonHousing")
# Fit an lmps object with Lasso selection over 50 splits
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50)


# A parallelized example using two cores
model <- lmps(medv ~ ., data = BostonHousing, method = "Lasso", N = 50, cores = 2)
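The examples here only use Lasso selection. As a hedged sketch, a BIC-based call with forward selection might look as follows. It relies only on the arguments listed under Usage, has not been run here, and is guarded so that it does nothing when the packages are not installed. The name model_bic is hypothetical.

```r
# Sketch of a BIC-based call; assumes CIpostSelect and mlbench are installed.
if (requireNamespace("CIpostSelect", quietly = TRUE) &&
    requireNamespace("mlbench", quietly = TRUE)) {
  library(CIpostSelect)
  data("BostonHousing", package = "mlbench")

  # BIC selection in the forward direction, 20 splits, default p_split
  model_bic <- lmps(medv ~ ., data = BostonHousing, method = "BIC",
                    N = 20, p_split = 0.5, direction = "forward")
  class(model_bic)
}
```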



[Package CIpostSelect version 0.2.1 Index]