oversample {mlr} | R Documentation
Over- or undersample a binary classification task to handle class imbalance.
Description
Oversampling: For a given class (usually the smaller one), all existing observations are
kept, and extra observations are added by randomly sampling with replacement from this class.
Undersampling: For a given class (usually the larger one), the number of observations is
reduced (downsampled) by randomly sampling without replacement from this class.
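The two strategies described above can be sketched in base R at the level of row indices. This is only an illustration of the idea, not mlr's implementation; the helper name `resample_class` and the toy factor `y` are hypothetical, and rounding the new class size with `round()` is an assumption.

```r
# Sketch only: return row indices after over- or undersampling class `cl` of factor `y`.
# `resample_class` is a hypothetical helper, not part of mlr.
resample_class <- function(y, cl, rate) {
  idx_cl <- which(y == cl)      # rows of the class to resample
  idx_other <- which(y != cl)   # rows of the other class, left untouched
  n_new <- round(length(idx_cl) * rate)  # assumed rounding of the target size
  if (rate >= 1) {
    # Oversampling: keep every original row, then draw the extras with replacement.
    extra <- idx_cl[sample.int(length(idx_cl), n_new - length(idx_cl), replace = TRUE)]
    keep <- c(idx_cl, extra)
  } else {
    # Undersampling: draw a subset without replacement.
    keep <- idx_cl[sample.int(length(idx_cl), n_new, replace = FALSE)]
  }
  c(keep, idx_other)
}

set.seed(42)
y <- factor(rep(c("a", "b"), times = c(8, 2)))  # toy labels: 8 "a", 2 "b"
over <- resample_class(y, cl = "b", rate = 2)     # double the smaller class
under <- resample_class(y, cl = "a", rate = 0.5)  # halve the larger class
table(y[over])
table(y[under])
```

Using `sample.int()` on positions rather than `sample()` on the index vector avoids the base R pitfall where `sample(x, ...)` treats a length-one numeric `x` as `1:x`.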
Usage
oversample(task, rate, cl = NULL)
undersample(task, rate, cl = NULL)
Arguments
task
(Task)
The task.
rate
(numeric(1))
Factor to upsample or downsample a class.
For undersampling: Must be between 0 and 1,
where 1 means no downsampling, 0.5 implies reduction to 50 percent
and 0 would imply reduction to 0 observations.
For oversampling: Must be between 1 and Inf,
where 1 means no oversampling and 2 would mean doubling the class size.
cl
(character(1))
Which class should be over- or undersampled. If NULL, oversample
will select the smaller and undersample the larger class.
Value
Task.
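A minimal usage sketch follows, assuming the mlr package is installed. The data frame `df` and its column names are hypothetical toy data; `makeClassifTask()` and `getTaskTargets()` are mlr functions used here only to build a task and inspect the resulting class counts.

```r
library(mlr)

set.seed(1)
# Hypothetical toy data: 90 "neg" and 10 "pos" observations.
df <- data.frame(
  x = rnorm(100),
  y = factor(rep(c("neg", "pos"), times = c(90, 10)))
)
task <- makeClassifTask(data = df, target = "y")

# cl = NULL (the default): oversample picks the smaller class ("pos").
task_over <- oversample(task, rate = 3)

# cl = NULL (the default): undersample picks the larger class ("neg").
task_under <- undersample(task, rate = 0.5)

# Compare class counts before and after resampling.
table(getTaskTargets(task))
table(getTaskTargets(task_over))
table(getTaskTargets(task_under))
```

Both calls return a new task and leave `task` unchanged, so the resampled tasks can be passed to training functions in place of the original.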
See Also
Other imbalancy: makeOverBaggingWrapper(), makeUndersampleWrapper(), smote()
[Package mlr version 2.19.2 Index]