get_folds {aifeducation}	R Documentation

Create cross-validation samples

Description

Creates cross-validation samples such that the relative frequency of every category/label within a fold matches its relative frequency in the initial data.
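The stratification idea can be illustrated in base R. This is a minimal sketch, not the package's implementation: the helper name stratified_fold_ids, its seed argument, and the round-robin assignment are assumptions made for illustration only.

```r
# Hypothetical helper, not part of aifeducation: assigns every labeled case
# to one of k_folds folds so that each category is spread evenly over folds.
stratified_fold_ids <- function(target, k_folds, seed = 1L) {
  set.seed(seed)
  labeled <- target[!is.na(target)]          # drop cases without a label
  fold_id <- integer(length(labeled))
  names(fold_id) <- names(labeled)
  for (lvl in levels(droplevels(labeled))) {
    idx <- which(labeled == lvl)             # positions of this category
    idx <- idx[sample.int(length(idx))]      # shuffle within the category
    fold_id[idx] <- rep_len(seq_len(k_folds), length(idx))  # deal out folds
  }
  fold_id
}

# Example data: 20 cases of category "a" and 10 cases of category "b".
target <- factor(rep(c("a", "b"), times = c(20, 10)))
names(target) <- paste0("case_", seq_along(target))

folds <- stratified_fold_ids(target, k_folds = 5)

# Cross-tabulate fold membership against category.
fold_table <- table(folds, target[names(folds)])
print(fold_table)
```

With these counts every fold receives four cases of "a" and two of "b", so the 2:1 ratio of the full data is preserved in each fold. When the counts are not divisible by the number of folds, the ratio can only be approximated.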

Usage

get_folds(target, k_folds)

Arguments

target

Named factor containing the relevant labels/categories. Missing cases should be declared with NA.

k_folds

Integer. Number of folds.

Value

Returns a list with the following components:

Note

The parameter target allows cases with missing categories/labels. These should be declared with NA. All such cases are ignored when the folds are created, and their names are saved in the component unlabeled_cases. These cases can be used for pseudo-labeling.
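The handling of missing labels can be sketched in base R as follows. The example only shows how a named factor with NA entries separates into labeled and unlabeled cases. The case names and the variable labeled_target are made up for illustration, and get_folds itself is not called.

```r
# A named factor in which two cases have no label (NA).
target <- factor(c("pos", "neg", NA, "pos", NA, "neg"))
names(target) <- paste0("id_", seq_along(target))

# Names of the cases without a label; the Note above says that get_folds
# stores such names in the component unlabeled_cases.
unlabeled_cases <- names(target)[is.na(target)]

# Only the labeled cases would take part in building the folds.
labeled_target <- target[!is.na(target)]

print(unlabeled_cases)
print(labeled_target)
```

Here unlabeled_cases holds "id_3" and "id_5", and the remaining four cases keep their names and labels.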

The function checks the absolute frequency of every category/label. If a frequency is not sufficient to ensure at least four cases in every fold, the number of folds is adjusted. In these cases, a warning is printed to the console. At least four cases per fold are necessary to ensure that the training of TextEmbeddingClassifierNeuralNet works well with all options turned on.
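One way such a check could work is sketched below. The exact rule used by get_folds is not stated here, so the helper adjust_k_folds, its min_per_fold argument, and the formula floor(min frequency / 4) are assumptions chosen to match the description, not the package's code.

```r
# Hypothetical helper, not part of aifeducation: returns the largest fold
# count (at most k_folds) that leaves min_per_fold cases per category in
# every fold, and warns when the requested count had to be reduced.
adjust_k_folds <- function(target, k_folds, min_per_fold = 4L) {
  freq <- table(droplevels(target[!is.na(target)]))  # absolute frequencies
  k_max <- as.integer(floor(min(freq) / min_per_fold))
  if (k_max < 1L) {
    stop("The rarest category has fewer than ", min_per_fold, " cases.")
  }
  if (k_max < k_folds) {
    warning("Reducing k_folds from ", k_folds, " to ", k_max,
            " so that every fold has at least ", min_per_fold,
            " cases per category.")
    k_folds <- k_max
  }
  as.integer(k_folds)
}

# Example data: category "b" has only 12 cases, enough for 3 folds of 4.
target <- factor(rep(c("a", "b"), times = c(40, 12)))
k <- adjust_k_folds(target, k_folds = 5)
print(k)
```

In this example the rarest category has 12 cases, so the requested five folds are reduced to three and a warning is raised.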

See Also

Other Auxiliary Functions: array_to_matrix(), calc_standard_classification_measures(), check_embedding_models(), clean_pytorch_log_transformers(), create_iota2_mean_object(), create_synthetic_units(), generate_id(), get_coder_metrics(), get_n_chunks(), get_stratified_train_test_split(), get_synthetic_cases(), get_train_test_split(), is.null_or_na(), matrix_to_array_c(), split_labeled_unlabeled(), summarize_tracked_sustainability(), to_categorical_c()


[Package aifeducation version 0.3.3 Index]