rapidsplit {rapidsplithalf} | R Documentation |
A very fast algorithm for computing stratified permuted split-half reliability.
rapidsplit(
data,
subjvar,
diffvars = NULL,
stratvars = NULL,
subscorevar = NULL,
aggvar,
splits = 6000,
aggfunc = c("means", "medians"),
errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL, fixedpenalty =
600, blockvar = NULL),
standardize = FALSE,
include.scores = TRUE,
verbose = TRUE,
check = TRUE
)
## S3 method for class 'rapidsplit'
print(x, ...)
## S3 method for class 'rapidsplit'
plot(
x,
type = c("average", "minimum", "maximum", "random", "all"),
show.labels = TRUE,
...
)
rapidsplit.chunks(
data,
subjvar,
diffvars = NULL,
stratvars = NULL,
subscorevar = NULL,
aggvar,
splits = 6000,
aggfunc = c("means", "medians"),
errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL, fixedpenalty =
600, blockvar = NULL),
standardize = FALSE,
include.scores = TRUE,
verbose = TRUE,
check = TRUE,
chunks = 4,
cluster = NULL
)
data |
Dataset, a |
subjvar |
Subject ID variable name, a |
diffvars |
Names of variables that determine which conditions
need to be subtracted from each other, |
stratvars |
Additional variables that the splits should
be stratified by; a |
subscorevar |
A |
aggvar |
Name of variable whose values to aggregate, a |
splits |
Number of split-halves to average, an |
aggfunc |
The function by which to aggregate the variable
defined in |
errorhandling |
A list with 4 named items, to be used to replace error trials
with the block mean of correct responses plus a fixed penalty, as in the IAT D-score.
The 4 items are |
standardize |
Whether to divide scores by the subject's SD; a |
include.scores |
Include all individual split-half scores? |
verbose |
Display progress bars? Defaults to |
check |
Check input for possible problems? |
x |
|
... |
Ignored. |
type |
Character argument indicating what should be plotted.
By default, this plots the random split whose correlation is closest to the average.
However, this can also plot the random split with
the |
show.labels |
Should participant IDs be shown above their points in the scatterplot?
Defaults to |
chunks |
Number of chunks to divide the splits into, for more memory-efficient computation and, if requested, for distribution over multiple cores. |
cluster |
Chunks will be run on separate cores if a cluster is provided,
or an |
The order of operations (with optional steps between brackets) is:
Splitting
(Replacing error trials within block within split)
Computing aggregates per condition (per subscore) per person
Subtracting conditions from each other
(Dividing the resulting (sub)score by the SD of the data used to compute that (sub)score)
(Averaging subscores together into a single score per person)
Correlating scores from one half with scores from the other half
Computing the average split-half reliability using cormean()
Applying the Spearman-Brown formula to the absolute correlation using spearmanBrown(), and restoring the original sign afterwards
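The steps above can be illustrated with a minimal base-R sketch on simulated data. This is not the package's implementation: the toy data, the helper names `one_split_cor` and `sb_signed`, and the use of Fisher z averaging in place of cormean() are all assumptions made for illustration. Error replacement, standardization, and subscores are omitted.

```r
# Illustrative sketch only; not how rapidsplit() is implemented internally.
set.seed(1)

# Hypothetical toy data: 20 subjects, 40 trials each, 2 conditions
n_subj <- 20
toy <- expand.grid(subj = seq_len(n_subj), trial = 1:40)
toy$cond <- toy$trial %% 2                          # 0/1 condition indicator
subj_effect <- rnorm(n_subj, mean = 50, sd = 30)    # true per-subject effect
toy$RT <- 600 + toy$cond * subj_effect[toy$subj] +
  rnorm(nrow(toy), sd = 80)

# One random split, stratified by subject and condition (hypothetical helper)
one_split_cor <- function(d) {
  # Splitting: assign trials to half 1 or 2 within each subject x condition cell
  d$half <- ave(d$RT, d$subj, d$cond,
                FUN = function(x) sample(rep_len(1:2, length(x))))
  # Computing aggregates per condition per person (per half)
  agg <- aggregate(RT ~ subj + cond + half, data = d, FUN = mean)
  agg <- agg[order(agg$half, agg$subj, agg$cond), ]
  # Subtracting conditions from each other
  diffs <- agg$RT[agg$cond == 1] - agg$RT[agg$cond == 0]
  halves <- agg$half[agg$cond == 1]
  # Correlating scores from one half with scores from the other half
  cor(diffs[halves == 1], diffs[halves == 2])
}

# Spearman-Brown on the absolute correlation, with the sign restored afterwards
sb_signed <- function(r) sign(r) * 2 * abs(r) / (1 + abs(r))

allcors <- replicate(50, one_split_cor(toy))

# Assumption: Fisher z averaging stands in for cormean() here
mean_r <- tanh(mean(atanh(allcors)))
reliability <- sb_signed(mean_r)
```

Applying the correction to the absolute value keeps a negative average correlation negative after the Spearman-Brown step.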
A list containing
r: the averaged reliability.
allcors: a vector with the reliability of each iteration.
nobs: the number of participants.
scores: the individual participants' scores in each split-half, contained in a list with two matrices (only included if requested with include.scores).
This function can use a lot of memory in one go.
If you are computing the reliability of a large dataset or you have little RAM,
it may pay off to use the sequential version of this function instead: rapidsplit.chunks()
It is currently unclear whether it is better to pre-process your data before or after splitting it.
If you are computing the IAT D-score, you can therefore use errorhandling and standardize
to perform these two actions after splitting, or you can process your data before splitting and forgo these two options.
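As a rough illustration of the pre-processing route, the sketch below replaces error trials with the block mean of correct responses plus a fixed penalty, then standardizes a difference score, all before any splitting. The toy data frame, its column names, and the variable names are hypothetical, and the code is an assumption about how one might do this by hand rather than the package's own routine.

```r
# Illustrative sketch only; hypothetical trials for a single participant.
trials <- data.frame(
  block   = c(1, 1, 1, 2, 2, 2),
  latency = c(500, 700, 900, 400, 600, 1000),
  error   = c(0, 0, 1, 0, 1, 0)
)
penalty <- 600  # same value as the default fixedpenalty in the usage section

# Mean latency of correct trials within each block
correct_mean <- ave(ifelse(trials$error == 0, trials$latency, NA),
                    trials$block,
                    FUN = function(x) mean(x, na.rm = TRUE))

# Error trials get the block mean of correct responses plus the penalty
trials$latency_fixed <- ifelse(trials$error == 1,
                               correct_mean + penalty,
                               trials$latency)

# Standardized difference score: block difference divided by the SD
# of the data used to compute it
d_score <- (mean(trials$latency_fixed[trials$block == 2]) -
              mean(trials$latency_fixed[trials$block == 1])) /
  sd(trials$latency_fixed)
```

Data processed this way would then be passed on with the default errorhandling and standardize settings, so that neither step is applied a second time.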
data(foodAAT)
# Reliability of the double difference score:
# [RT(push food)-RT(pull food)] - [RT(push object)-RT(pull object)]
frel <- rapidsplit(data = foodAAT,
                   subjvar = "subjectid",
                   diffvars = c("is_pull", "is_target"),
                   stratvars = "stimid",
                   aggvar = "RT",
                   splits = 100)
print(frel)
plot(frel,type="all")
# Compute a single random split-half reliability of the error rate
rapidsplit(data = foodAAT,
           subjvar = "subjectid",
           aggvar = "error",
           splits = 1,
           aggfunc = "means")
# Compute the reliability of an IAT D-score
data(raceIAT)
rapidsplit(data = raceIAT,
           subjvar = "session_id",
           diffvars = "congruent",
           subscorevar = "blocktype",
           aggvar = "latency",
           errorhandling = list(type = "fixedpenalty", errorvar = "error",
                                fixedpenalty = 600, blockvar = "block_number"),
           splits = 100,
           standardize = TRUE)
# Unstratified reliability of the median RT
rapidsplit.chunks(data = foodAAT,
                  subjvar = "subjectid",
                  aggvar = "RT",
                  splits = 100,
                  aggfunc = "medians",
                  chunks = 8)
# Compute the reliability of Tukey's trimean of the RT
# on 2 CPU cores
trimean <- function(x) {
  sum(quantile(x, c(.25, .5, .75)) * c(1, 2, 1)) / 4
}
rapidsplit.chunks(data = foodAAT,
                  subjvar = "subjectid",
                  aggvar = "RT",
                  splits = 200,
                  aggfunc = trimean,
                  cluster = 2)