average_predicted {effectplots}    R Documentation

Average Predictions

Description

Calculates average predictions over the values of one or more X variables. The averages reflect the combined effect of a feature and of other features correlated with it.

Usage

average_predicted(
  X,
  pred,
  w = NULL,
  x_name = "x",
  breaks = "Sturges",
  right = TRUE,
  discrete_m = 5L,
  outlier_iqr = 2,
  seed = NULL,
  ...
)

Arguments

X

A vector, matrix, or data.frame with variable(s) to be shown on the x axis.

pred

A numeric vector of predictions.

w

An optional numeric vector of weights.

x_name

If X is a vector: what is the name of the variable? By default "x".

breaks

An integer, vector, string or function specifying the bins of the numeric X variables as in graphics::hist(). The default is "Sturges". To allow different breaks per variable, it can be a list with one element per X variable, or a named list with breaks for selected variables only.

right

Should bins be right-closed? The default is TRUE. Vectorized over the X variables. Only relevant for numeric X.

discrete_m

Numeric X variables with up to this number of unique values are not binned but treated as a factor (after calculating partial dependence). The default is 5. Vectorized over the X variables.

outlier_iqr

Outliers of a numeric X are capped via the boxplot rule, i.e., values further than outlier_iqr * IQR from the quartiles are capped. The default of 2 is more conservative than the usual rule of 1.5 to account for right-skewed distributions. Set to 0 or Inf for no capping. Note that at most 10k observations are sampled to calculate quartiles. Vectorized over the X variables.

seed

Optional random seed (an integer) used for capping X based on quantiles calculated from a subsample of 10k observations.

...

Currently unused.
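
The capping rule described under outlier_iqr can be sketched in base R as follows. This is an illustration only, not the package's internal code: the helper name cap_outliers() and its max_n argument are hypothetical, and the package may handle details such as missing values or quantile type differently.

```r
# Hypothetical helper illustrating the boxplot capping rule.
# Not part of effectplots; names and defaults are assumptions.
cap_outliers <- function(x, outlier_iqr = 2, max_n = 10000L, seed = NULL) {
  # 0 or Inf switches capping off
  if (outlier_iqr <= 0 || !is.finite(outlier_iqr)) {
    return(x)
  }
  if (!is.null(seed)) set.seed(seed)
  # Quartiles are estimated from a subsample when x is long
  xs <- if (length(x) > max_n) sample(x, max_n) else x
  q <- stats::quantile(xs, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2L] - q[1L]
  lower <- q[1L] - outlier_iqr * iqr
  upper <- q[2L] + outlier_iqr * iqr
  # Values outside [lower, upper] are moved to the nearest bound
  pmin(pmax(x, lower), upper)
}

x <- c(1:20, 1000)
range(x)
range(cap_outliers(x))
```

In this sketch, the seed only matters when x has more than max_n values, because only then is a random subsample drawn.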

Details

The function is a convenience wrapper around feature_effects().
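
To make the idea concrete, the following base R sketch computes (weighted) average predictions per bin of a single variable. It is a simplified illustration under stated assumptions, not the implementation used by feature_effects(): the helper avg_pred_by_bin() and its output column names are hypothetical, and outlier capping is left out.

```r
# Hypothetical helper: weighted average prediction per bin of one variable.
# Not part of effectplots; a simplified sketch of the idea.
avg_pred_by_bin <- function(x, pred, w = NULL, breaks = "Sturges",
                            right = TRUE, discrete_m = 5L) {
  if (is.null(w)) w <- rep(1, length(x))
  if (is.numeric(x) && length(unique(x)) > discrete_m) {
    # Bin edges are chosen the way graphics::hist() would choose them
    edges <- graphics::hist(x, breaks = breaks, plot = FALSE)$breaks
    bin <- cut(x, breaks = edges, right = right, include.lowest = TRUE)
  } else {
    # Non-numeric or few unique values: no binning
    bin <- factor(x)
  }
  bin <- droplevels(bin)  # drop empty bins
  w_sum <- tapply(w, bin, sum)
  data.frame(
    bin = levels(bin),
    weight = as.numeric(w_sum),
    pred_mean = as.numeric(tapply(w * pred, bin, sum) / w_sum)
  )
}

fit <- lm(Sepal.Length ~ ., data = iris)
pred <- predict(fit, iris)
by_width <- avg_pred_by_bin(iris$Petal.Width, pred, breaks = 5)
by_species <- avg_pred_by_bin(iris$Species, pred)
by_width
by_species
```

Because the averages are taken over the observed data, each bin mean also carries the influence of the other features that vary together with the binned variable.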

Value

A list (of class "EffectData") with a data.frame of statistics per feature. Use single bracket subsetting to select part of the output.

References

Apley, Daniel W., and Jingyu Zhu. 2020. Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82 (4): 1059–1086. doi:10.1111/rssb.12377.

See Also

feature_effects()

Examples

fit <- lm(Sepal.Length ~ ., data = iris)
M <- average_predicted(iris[2:5], pred = predict(fit, iris), breaks = 5)
M
M |> plot()
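
A further sketch, based on the argument and value descriptions above, combines a named list for breaks with single bracket subsetting. It assumes the effectplots package is installed and is skipped otherwise; the chosen number of breaks is arbitrary.

```r
# Runs only if effectplots is available
if (requireNamespace("effectplots", quietly = TRUE)) {
  library(effectplots)
  fit <- lm(Sepal.Length ~ ., data = iris)
  # Named list: custom breaks for Petal.Length, default breaks elsewhere
  M <- average_predicted(
    iris[2:5],
    pred = predict(fit, iris),
    breaks = list(Petal.Length = 4)
  )
  # Single bracket subsetting keeps only the selected features
  M_sub <- M[c("Petal.Length", "Species")]
  print(names(M_sub))
}
```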

[Package effectplots version 0.1.0 Index]