ale {effectplots} | R Documentation |
Calculates ALE for one or multiple X variables.
The concept of ALE was introduced in Apley and Zhu (2020) as an alternative to partial dependence (PD). The ceteris paribus clause behind PD is a blessing and a curse at the same time:
Blessing: The interpretation is easy and similar to what we know from linear regression (just averaging out interaction effects).
Curse: The model is applied to very unlikely or even impossible feature combinations, especially with strongly dependent features.
ALE fixes the curse as follows: Partial dependence is calculated for the lower and
upper endpoint of a bin, using all observations falling into this bin (or a sample
of them). Its slope provides the local effect over the bin.
This is repeated for all bins, and the values are accumulated. Since the resulting
sum starts at 0, one typically shifts the result vertically, e.g., to the average
prediction. This is not done by ale(), however.
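The steps above can be sketched in base R as follows. This is a simplified illustration, not the implementation used by the package: ale_sketch() is a hypothetical helper, the quantile-based bins are an assumption, and the local effect is taken as the difference in predictions between the two bin endpoints rather than as a slope.

```r
# Minimal ALE sketch for one numeric feature (base R only).
# `ale_sketch` is a hypothetical helper and NOT part of effectplots.
ale_sketch <- function(fit, data, v, n_bins = 5) {
  x <- data[[v]]
  # Assumption: quantile-based bin edges, so that bins are rarely empty
  br <- unique(
    quantile(x, probs = seq(0, 1, length.out = n_bins + 1), names = FALSE)
  )
  bin <- cut(x, breaks = br, include.lowest = TRUE, labels = FALSE)

  local_effect <- numeric(length(br) - 1)
  for (k in seq_along(local_effect)) {
    d <- data[bin == k, , drop = FALSE]  # observations falling into bin k
    if (nrow(d) == 0) next               # empty bin: local effect stays 0
    lo <- hi <- d
    lo[[v]] <- br[k]                     # move feature to lower endpoint
    hi[[v]] <- br[k + 1]                 # move feature to upper endpoint
    local_effect[k] <- mean(predict(fit, hi) - predict(fit, lo))
  }

  # Accumulate the local effects; the curve starts at 0 (no vertical shift)
  data.frame(x = br, ale = c(0, cumsum(local_effect)))
}

fit <- lm(Sepal.Length ~ ., data = iris)
res <- ale_sketch(fit, iris, "Petal.Length")
res
```

For an additive linear model such as this one, the accumulated values grow linearly in the bin edges with the slope of the fitted coefficient, which makes the sketch easy to check by eye.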
The function is a convenience wrapper around feature_effects(), which calls
the barebone implementation .ale() to calculate ALE. The ALE values calculated
by feature_effects() are vertically shifted to the same (weighted) average as the
partial dependence curve, for optimal comparability.
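The vertical shift can be illustrated with a few lines of base R. All numbers below are made up for illustration, and using bin counts as weights is an assumption; the exact weighting used by feature_effects() is not spelled out here.

```r
# Hypothetical values, for illustration only
ale_raw <- c(0, 0.4, 0.9, 1.5)    # accumulated local effects, starting at 0
pd      <- c(5.1, 5.6, 6.0, 6.7)  # partial dependence on the same grid
w       <- c(50, 30, 40, 30)      # assumed weights, e.g., bin counts

# Shift ALE so that its weighted average equals that of the PD curve
shift <- weighted.mean(pd, w) - weighted.mean(ale_raw, w)
ale_shifted <- ale_raw + shift

ale_shifted
```

The shift is a single constant, so the shape of the ALE curve (the differences between neighboring values) is unchanged; only its vertical position moves.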
ale(object, ...)
## Default S3 method:
ale(
object,
v,
data,
pred_fun = stats::predict,
trafo = NULL,
which_pred = NULL,
w = NULL,
breaks = "Sturges",
right = TRUE,
discrete_m = 5L,
outlier_iqr = 2,
ale_n = 50000L,
ale_bin_size = 200L,
seed = NULL,
...
)
## S3 method for class 'ranger'
ale(
object,
v,
data,
pred_fun = NULL,
trafo = NULL,
which_pred = NULL,
w = NULL,
breaks = "Sturges",
right = TRUE,
discrete_m = 5L,
outlier_iqr = 2,
ale_n = 50000L,
ale_bin_size = 200L,
seed = NULL,
...
)
## S3 method for class 'explainer'
ale(
object,
v = colnames(data),
data = object$data,
pred_fun = object$predict_function,
trafo = NULL,
which_pred = NULL,
w = object$weights,
breaks = "Sturges",
right = TRUE,
discrete_m = 5L,
outlier_iqr = 2,
ale_n = 50000L,
ale_bin_size = 200L,
seed = NULL,
...
)
object |
Fitted model. |
... |
Further arguments passed to |
v |
Vector of variable names to calculate statistics. |
data |
Matrix or data.frame. |
pred_fun |
Prediction function, by default |
trafo |
How should predictions be transformed?
A function or |
which_pred |
If the predictions are multivariate: which column to pick
(integer or column name). By default |
w |
Optional vector with case weights. Can also be a column name in |
breaks |
An integer, vector, string or function specifying the bins
of the numeric X variables as in |
right |
Should bins be right-closed? The default is |
discrete_m |
Numeric X variables with up to this number of unique values
are not binned but treated as a factor (after calculating partial dependence).
The default is 5. Vectorized over |
outlier_iqr |
Outliers of a numeric X are capped via the boxplot rule, i.e.,
outside |
ale_n |
Size of the data used for calculating ALE.
The default is 50000. For larger |
ale_bin_size |
Maximal number of observations used per bin for ALE calculations.
If there are more observations in a bin, |
seed |
Optional random seed (an integer) used for:
|
A list (of class "EffectData") with a data.frame of statistics per feature. Use single bracket subsetting to select part of the output.
ale(default): Default method.
ale(ranger): Method for class 'ranger'.
ale(explainer): Method for class 'explainer'.
Apley, Daniel W., and Jingyu Zhu. 2020. Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82 (4): 1059–1086. doi:10.1111/rssb.12377.
fit <- lm(Sepal.Length ~ ., data = iris)
M <- ale(fit, v = "Petal.Length", data = iris)
M |> plot()
M2 <- ale(fit, v = colnames(iris)[-1], data = iris, breaks = 5)
plot(M2, share_y = "all") # Only numeric variables shown