fastml {fastml} | R Documentation |
Fast Machine Learning Function
Description
Trains and evaluates multiple classification or regression models, automatically detecting the task from the type of the target variable.
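As a rough illustration of what detecting the task from the target type could look like, the sketch below uses a hypothetical helper, detect_task, which is not part of fastml. It assumes that factor, character, and logical targets mean classification and that numeric targets mean regression; the rule the package actually applies may differ.

```r
# Hypothetical helper, not part of fastml: guess the task from the target column.
# Assumption: factor/character/logical -> classification, numeric -> regression.
detect_task <- function(data, label) {
  stopifnot(is.data.frame(data), label %in% names(data))
  target <- data[[label]]
  if (is.factor(target) || is.character(target) || is.logical(target)) {
    "classification"
  } else if (is.numeric(target)) {
    "regression"
  } else {
    stop("Unsupported target type: ", class(target)[1])
  }
}

detect_task(iris, "Species")  # factor target
detect_task(mtcars, "mpg")    # numeric target
```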
Usage
fastml(
data,
label,
algorithms = "all",
test_size = 0.2,
resampling_method = "cv",
folds = ifelse(grepl("cv", resampling_method), 10, 25),
repeats = ifelse(resampling_method == "repeatedcv", 1, NA),
tune_params = NULL,
metric = NULL,
n_cores = 1,
stratify = NULL,
impute_method = "error",
encode_categoricals = TRUE,
scaling_methods = c("center", "scale"),
summaryFunction = NULL,
use_default_tuning = FALSE,
seed = 123
)
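The defaults for folds and repeats are expressions that depend on resampling_method. The base R sketch below evaluates those same expressions for a few method names to show what they resolve to. The wrapper functions default_folds and default_repeats are hypothetical names used only here, and "boot" is an illustrative string, since this page does not list the resampling methods that fastml accepts.

```r
# Evaluate the default expressions from the Usage block for a few method names.
# default_folds() and default_repeats() are hypothetical wrappers, not fastml functions.
default_folds <- function(resampling_method) {
  ifelse(grepl("cv", resampling_method), 10, 25)
}
default_repeats <- function(resampling_method) {
  ifelse(resampling_method == "repeatedcv", 1, NA)
}

# "boot" is only an example of a name that does not contain "cv".
methods <- c("cv", "repeatedcv", "boot")
defaults <- data.frame(
  resampling_method = methods,
  folds = default_folds(methods),
  repeats = default_repeats(methods)
)
print(defaults)
```

Any method name containing "cv" resolves to 10 folds and any other name to 25, while repeats is 1 only for "repeatedcv" and NA otherwise.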
Arguments
data |
A data frame containing the features and target variable. |
label |
A string specifying the name of the target variable. |
algorithms |
A vector of algorithm names to use. Default is "all". |
test_size |
A numeric value between 0 and 1 indicating the proportion of the data to use for testing. Default is 0.2. |
resampling_method |
A string specifying the resampling method, such as "cv" or "repeatedcv". Default is "cv". |
folds |
An integer specifying the number of folds for cross-validation. Default is 10 when resampling_method contains "cv" and 25 otherwise. |
repeats |
Number of times to repeat cross-validation (only applicable for methods like "repeatedcv"). Default is 1 for "repeatedcv" and NA otherwise. |
tune_params |
A list specifying hyperparameter tuning ranges. Default is NULL. |
metric |
The performance metric to optimize during training. Default depends on the task. |
n_cores |
An integer specifying the number of CPU cores to use for parallel processing. Default is 1. |
stratify |
Logical indicating whether to use stratified sampling when splitting the data. Default is NULL. |
impute_method |
Method for handling missing values. Options include:
Default is "error". |
encode_categoricals |
Logical indicating whether to encode categorical variables. Default is TRUE. |
scaling_methods |
Vector of scaling methods to apply. Default is c("center", "scale"). |
summaryFunction |
A custom summary function for model evaluation. Default is NULL. |
use_default_tuning |
Logical indicating whether to use default tuning grids when tune_params is NULL. Default is FALSE. |
seed |
An integer value specifying the random seed for reproducibility. Default is 123. |
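To make the roles of test_size, stratify, and seed concrete, here is a minimal base R sketch of a hold-out split. The helper split_data is hypothetical and is not how fastml necessarily performs its split; it only assumes that stratification means sampling the test rows separately within each class of the target.

```r
# Hypothetical helper, not part of fastml: hold out a test set.
# test_size, stratify, and seed mirror the argument names documented above.
split_data <- function(data, label, test_size = 0.2, stratify = TRUE, seed = 123) {
  stopifnot(test_size > 0, test_size < 1, label %in% names(data))
  set.seed(seed)  # the same seed gives the same split
  if (stratify) {
    # Sample the test rows separately within each class of the target.
    groups <- split(seq_len(nrow(data)), data[[label]])
    test_idx <- unlist(lapply(groups, function(idx) {
      idx[sample.int(length(idx), size = round(length(idx) * test_size))]
    }), use.names = FALSE)
  } else {
    # Sample the test rows from the whole data set at once.
    test_idx <- sample.int(nrow(data), size = round(nrow(data) * test_size))
  }
  list(
    train = data[-test_idx, , drop = FALSE],
    test = data[test_idx, , drop = FALSE]
  )
}

parts <- split_data(iris, "Species", test_size = 0.2)
table(parts$test$Species)  # class counts in the held-out set
```

With stratification each class contributes the same share of its rows to the test set, which keeps class proportions comparable between the training and test data.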
Value
An object of class fastml_model containing the best model, performance metrics, and other information.
Examples
# Example 1: Using the iris dataset for binary classification (excluding 'setosa')
data(iris)
iris <- iris[iris$Species != "setosa", ] # Binary classification
iris$Species <- factor(iris$Species) # Drop the now-unused "setosa" level
# Train models
model <- fastml(
data = iris,
label = "Species",
algorithms = c("random_forest", "xgboost", "svm_radial")
)
# View model summary
summary(model)
# Example 2: Using the mtcars dataset for regression
data(mtcars)
# Train models
model <- fastml(
data = mtcars,
label = "mpg",
algorithms = c("random_forest", "xgboost", "svm_radial")
)
# View model summary
summary(model)
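A further sketch shows how the split and resampling arguments from the Usage block might be set explicitly. Every argument name comes from this page, but the values (a 25% test set, 5 folds repeated 3 times, seed 42) are illustrative choices and not recommendations. The call is guarded so that it only runs when the fastml package is installed.

```r
# Example 3 (sketch): setting the split and resampling arguments explicitly.
# Argument names come from the Usage block; the values are illustrative only.
data(mtcars)

if (requireNamespace("fastml", quietly = TRUE)) {
  model <- fastml::fastml(
    data = mtcars,
    label = "mpg",
    algorithms = c("random_forest", "svm_radial"),
    test_size = 0.25,                  # hold out a quarter of the rows
    resampling_method = "repeatedcv",  # method name taken from the repeats default
    folds = 5,
    repeats = 3,
    seed = 42
  )
  # View model summary
  summary(model)
}
```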