fastml {fastml} | R Documentation
Fast Machine Learning Function
Description
Trains and evaluates multiple classification models.
Usage
fastml(
  data,
  label,
  algorithms = c("xgboost", "random_forest", "svm_radial"),
  test_size = 0.2,
  resampling_method = "cv",
  folds = 5,
  tune_params = NULL,
  metric = "Accuracy",
  n_cores = 1,
  stratify = TRUE,
  impute_method = NULL,
  encode_categoricals = TRUE,
  scaling_methods = c("center", "scale"),
  summaryFunction = NULL,
  seed = 123
)
Arguments
data
A data frame containing the features and the target variable.
label
A string specifying the name of the target variable.
algorithms
A character vector of algorithm names to use. Default is c("xgboost", "random_forest", "svm_radial").
test_size
A numeric value between 0 and 1 indicating the proportion of the data to use for testing. Default is 0.2.
resampling_method
A string specifying the resampling method used during training, such as "cv" or "repeatedcv". Default is "cv".
folds
An integer specifying the number of folds for cross-validation. Default is 5.
tune_params
A named list of hyperparameter tuning grids, with one element per algorithm (see Example 4). Default is NULL.
metric
The performance metric to optimize during training. Default is "Accuracy".
n_cores
An integer specifying the number of CPU cores to use for parallel processing. Default is 1.
stratify
Logical indicating whether to use stratified sampling when splitting the data. Default is TRUE.
impute_method
Method for missing value imputation. Default is NULL.
encode_categoricals
Logical indicating whether to encode categorical variables. Default is TRUE.
scaling_methods
A character vector of scaling methods to apply. Default is c("center", "scale").
summaryFunction
A custom summary function for model evaluation. Default is NULL.
seed
An integer specifying the random seed for reproducibility. Default is 123.
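The arguments test_size, stratify, seed, and scaling_methods describe a stratified holdout split followed by centering and scaling. The base-R sketch below shows one way such a pipeline can work. It is an illustration only, not the code fastml runs internally, and the helper name stratified_split is hypothetical.

```r
# Illustrative sketch only: NOT fastml's internal implementation.
# stratified_split() is a hypothetical helper written for this example.
stratified_split <- function(data, label, test_size = 0.2, seed = 123) {
  set.seed(seed)
  # Group row indices by the class of the target column
  idx_by_class <- split(seq_len(nrow(data)), data[[label]])
  # Draw the same proportion of rows from every class for the test set
  test_idx <- unlist(lapply(idx_by_class, function(idx) {
    n_test <- round(length(idx) * test_size)
    idx[sample.int(length(idx), n_test)]
  }), use.names = FALSE)
  train_idx <- setdiff(seq_len(nrow(data)), test_idx)
  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[test_idx, , drop = FALSE]
  )
}

data(iris)
binary_iris <- iris[iris$Species != "setosa", ]
binary_iris$Species <- factor(binary_iris$Species)

parts <- stratified_split(binary_iris, "Species", test_size = 0.2, seed = 123)

# Center and scale the numeric columns using training-set statistics only,
# then apply the same shift and scale to the test set
num_cols <- names(parts$train)[sapply(parts$train, is.numeric)]
train_scaled <- scale(parts$train[num_cols], center = TRUE, scale = TRUE)
test_scaled <- scale(
  parts$test[num_cols],
  center = attr(train_scaled, "scaled:center"),
  scale  = attr(train_scaled, "scaled:scale")
)

# Class counts in the test set are balanced because the split is stratified
table(parts$test$Species)
```

Computing the centering and scaling statistics on the training rows alone keeps information from the test rows out of the preprocessing step.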
Value
An object of class fastml_model containing the best model, performance metrics, and other information.
Examples
# Example 1: Using the iris dataset for binary classification (excluding 'setosa')
data(iris)
iris <- iris[iris$Species != "setosa", ] # Binary classification
iris$Species <- factor(iris$Species)
# Train models
model <- fastml(
  data = iris,
  label = "Species"
)
# View model summary
summary(model)
# Example 2: Using the mtcars dataset for binary classification
data(mtcars)
mtcars$am <- factor(mtcars$am) # Convert transmission (0 = automatic, 1 = manual) to a factor
# Train models with a different resampling method and specific algorithms
model2 <- fastml(
  data = mtcars,
  label = "am",
  algorithms = c("random_forest", "svm_radial"),
  resampling_method = "repeatedcv",
  folds = 3,
  test_size = 0.25
)
# View model performance
summary(model2)
# Example 3: Using the airquality dataset with missing values
data(airquality)
airquality <- na.omit(airquality) # Drop rows with missing values (simplest approach, for demonstration)
airquality$Month <- factor(airquality$Month)
# Train models with categorical encoding and scaling
model3 <- fastml(
  data = airquality,
  label = "Month",
  encode_categoricals = TRUE,
  scaling_methods = c("center", "scale")
)
# Evaluate and compare models
summary(model3)
# Example 4: Custom hyperparameter tuning for a random forest
data(iris)
iris <- iris[iris$Species != "setosa", ] # Filter out 'setosa' for binary classification
iris$Species <- factor(iris$Species)
custom_tuning <- list(
  # iris has only 4 predictors, so mtry values above 4 are not valid
  random_forest = expand.grid(mtry = 1:4)
)
model4 <- fastml(
  data = iris,
  label = "Species",
  algorithms = c("random_forest"),
  tune_params = custom_tuning,
  metric = "Accuracy"
)
# View the results
summary(model4)
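Example 3 drops incomplete rows with na.omit. As an alternative, missing values can be filled in before calling fastml. The sketch below uses plain base R for median imputation. It does not rely on the impute_method argument, whose accepted values are not listed on this page, and the variable name aq is chosen for illustration.

```r
# Illustrative sketch only: manual median imputation in base R.
# This does not use fastml's impute_method argument.
data(airquality)
aq <- airquality

# Replace each missing numeric value with the median of its column
for (col in names(aq)) {
  if (is.numeric(aq[[col]]) && anyNA(aq[[col]])) {
    aq[[col]][is.na(aq[[col]])] <- median(aq[[col]], na.rm = TRUE)
  }
}

# Convert the target to a factor, as in Example 3
aq$Month <- factor(aq$Month)

# All 153 rows are kept and no missing values remain
nrow(aq)
anyNA(aq)
```

The resulting data frame aq could then be passed as the data argument with label = "Month". For a stricter evaluation, the medians would be computed on the training rows only.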