sa_diff {staccuracy} | R Documentation |
Statistical tests for the differences between standardized accuracies (staccuracies)
Description
Because the distribution of staccuracies is uncertain (and indeed, different staccuracies likely have different distributions), bootstrapping is used to empirically estimate the distributions and calculate the p-values. See the return value description for details on what the function provides.
Usage
sa_diff(
actual,
preds,
...,
na.rm = FALSE,
sa = NULL,
pct = c(0.01, 0.02, 0.03, 0.04, 0.05),
boot_alpha = 0.05,
boot_it = 1000,
seed = 0
)
Arguments
actual |
numeric vector. The actual (true) labels. |
preds |
named list of at least two numeric vectors. Each element is a vector of the same length as actual with predictions for each row corresponding to each element of actual. The names of the list elements should be the names of the models that produced each respective prediction; these names will be used to distinguish the results. |
... |
not used. Forces explicit naming of subsequent arguments. |
na.rm |
See documentation for |
sa |
list of functions. Each element is the unquoted name of a valid staccuracy function (see |
pct |
numeric vector with values in the open interval (0, 1). The percentage values at which the difference in staccuracies will be tested. |
boot_alpha |
numeric(1) from 0 to 1. Alpha for percentile-based confidence interval range for the bootstrapped means; the bootstrap confidence intervals will be the lowest and highest |
boot_it |
positive integer(1). The number of bootstrap iterations. |
seed |
integer(1). Random seed for the bootstrap sampling. Supply the same value between runs to ensure identical results. |
Value
tibble with staccuracy difference results:
- staccuracy: name of the staccuracy measure.
- pred, type: When type is 'pred', the pred column gives the named element in the input preds. The row values give the staccuracy for that prediction. When type is 'diff', the pred column is of the form 'model1-model2', where 'model1' and 'model2' are names from the input preds, which should be the names of each model that provided the predictions. The row values give the difference between the staccuracies of model1 and model2.
- lo, mean, hi: The lower bound, mean, and upper bound of the bootstrapped staccuracy. The lower and upper bounds are the limits of the confidence interval specified by the input boot_alpha.
- p__: p-values that the staccuracies differ by at least the specified percentage. E.g., for the default input pct = c(0.01, 0.02, 0.03, 0.04, 0.05), these columns would be p01, p02, p03, p04, and p05. As they apply only to differences between staccuracies, they are NA for rows of type 'pred'. As an example of their meaning, if the mean difference for 'model1-model2' is 0.0832 with p01 of 0.012 and p02 of 0.035, then 1.2% of bootstrapped staccuracies had a difference of model1 - model2 less than 0.01 and 3.5% had a difference less than 0.02. (That is, 98.8% of differences were greater than 0.01 and 96.5% were greater than 0.02.)
Examples
lm_attitude_all <- lm(rating ~ ., data = attitude)
lm_attitude__a <- lm(rating ~ . - advance, data = attitude)
lm_attitude__c <- lm(rating ~ . - complaints, data = attitude)
sdf <- sa_diff(
attitude$rating,
list(
all = predict(lm_attitude_all),
madv = predict(lm_attitude__a),
mcmp = predict(lm_attitude__c)
),
boot_it = 10
)
sdf