vif_filter {scorecardModelUtils} | R Documentation |
Removing multicollinearity from a model using the VIF test
Description
The function takes a dataset containing only the starting variables and the target. It computes the variance inflation factor (VIF) of each variable; if the maximum VIF exceeds the threshold, the variable with that maximum VIF is dropped from the model and the VIFs are recomputed. Computing the VIFs and dropping a variable is repeated until the maximum VIF is less than or equal to the threshold.
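The drop-and-recompute loop described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: `compute_vif` and `vif_filter_sketch` are hypothetical names, all predictors are assumed to be numeric, and the VIF is taken from auxiliary linear regressions (VIF_j = 1 / (1 - R^2_j)), which may differ from how `vif_filter` derives it from its fitted model.

```r
# Hypothetical helper (not part of scorecardModelUtils): VIF of each numeric
# predictor, VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all remaining predictors.
compute_vif <- function(df) {
  vars <- names(df)
  vapply(vars, function(v) {
    others <- setdiff(vars, v)
    fit <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  }, numeric(1))
}

# Hypothetical sketch of the iteration: drop the variable with the largest
# VIF and recompute until the largest VIF is <= threshold.
vif_filter_sketch <- function(base, target, threshold = 2) {
  predictors <- base[, setdiff(names(base), target), drop = FALSE]
  dropped <- character(0)
  # A VIF needs at least two predictors, so stop when only one is left.
  while (ncol(predictors) > 1) {
    vifs <- compute_vif(predictors)
    if (max(vifs) <= threshold) break
    worst <- names(vifs)[which.max(vifs)]
    dropped <- c(dropped, worst)
    predictors <- predictors[, setdiff(names(predictors), worst), drop = FALSE]
  }
  list(retained = names(predictors), dropped = dropped)
}

# Usage on the four numeric columns of iris with a random 0/1 target.
dat <- iris[, 1:4]
set.seed(11)
dat$Y <- sample(0:1, size = nrow(dat), replace = TRUE)
res <- vif_filter_sketch(dat, target = "Y", threshold = 2)
res$retained
res$dropped
```

Dropping one variable per pass, rather than every variable above the threshold at once, matters because removing a single collinear variable usually lowers the VIFs of the variables it was correlated with.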
Usage
vif_filter(base, target, threshold = 2)
Arguments
base |
input data frame containing only the final set of variables and the target |
target |
name of the target column, passed as a string (the target must be binary, coded 0/1) |
threshold |
VIF threshold above which a variable is dropped (default: 2) |
Value
An object of class "vif_filter" is a list containing the following components:
vif_table |
VIF table after filtering |
model |
the model used for the VIF calculation |
retain_var_list |
variables remaining in the model after the VIF filter, as an array |
dropped_var_list |
variables dropped from the model during the VIF filter step |
threshold |
the VIF threshold that was used |
Author(s)
Arya Poddar <aryapoddar290990@gmail.com>
Examples
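data <- iris
# Use the R 3.5.0 sampling behaviour so the random target is reproducible
suppressWarnings(RNGversion('3.5.0'))
set.seed(11)
# Add a random binary (0/1) target column
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
vif_data_list <- vif_filter(base = data, target = "Y")
# Inspect each component of the returned list
vif_data_list$vif_table
vif_data_list$model
vif_data_list$retain_var_list
vif_data_list$dropped_var_list
vif_data_list$threshold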