calcregions {MorphoRegions} | R Documentation |
Fit segmented regression models for all combinations of breakpoints
Description
calcregions() enumerates all possible combinations of breakpoints to fit multivariate segmented regression models. addregions() adds models with additional numbers of regions to the resulting output object. ncombos() computes an upper bound on the number of breakpoint combinations that will be tested.
Usage
calcregions(
  pco,
  scores,
  noregions,
  minvert = 3,
  cont = TRUE,
  exhaus = TRUE,
  includebp = NULL,
  omitbp = NULL,
  ncombos_file_trigger = 1e+07,
  temp_file_dir = tempdir(TRUE),
  cl = NULL,
  verbose = TRUE
)
addregions(
  regions_results,
  noregions,
  exhaus = TRUE,
  ncombos_file_trigger = 1e+07,
  temp_file_dir = tempdir(TRUE),
  cl = NULL,
  verbose = TRUE
)
## S3 method for class 'regions_results'
summary(object, ...)
ncombos(pco, noregions, minvert = 3, includebp = NULL, omitbp = NULL)
Arguments
pco |
a |
scores |
|
noregions |
|
minvert |
|
cont |
|
exhaus |
|
includebp |
an optional vector of vertebrae that must be included in any tested set of breakpoints, e.g., if it is known that two regions are divided at that vertebra. |
omitbp |
an optional vector of vertebrae to be omitted from the list of possible breakpoints, e.g., if it is known that two adjacent vertebrae belong to the same region. |
ncombos_file_trigger |
|
temp_file_dir |
string; the directory where the temporary files will be saved (and then deleted) when the number of breakpoint combinations exceeds |
cl |
a cluster object created by |
verbose |
|
regions_results , object |
a |
... |
ignored. |
Details
calcregions() enumerates all possible combinations of breakpoints that satisfy the constraint imposed by minvert (i.e., that breakpoints need to be at least minvert vertebrae apart) and fits the segmented regression models implied by each combination. These are multivariate regression models with the PCO scores specified by scores as the outcomes. When cont = TRUE, these regression models are continuous; i.e., the regression lines for each region connect at the breakpoints. Otherwise, the models are discontinuous so that each region has its own intercept and slope. The models are fit using .lm.fit(), which efficiently implements ordinary least squares regression.
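As a rough illustration of the difference between the two model types, the sketch below builds continuous and discontinuous design matrices for one hypothetical set of breakpoints and fits each with .lm.fit(). The data are simulated, the helper names (design_cont, design_disc, rss) are invented for this example, and the hinge-term parameterization is an assumption rather than a description of the package's internals.

```r
# Illustrative sketch only; these helpers are hypothetical, not MorphoRegions code.
set.seed(1)
pos <- 1:20            # vertebra positions
bps <- c(7, 13)        # assumed convention: a breakpoint is the last vertebra of its region

# Two simulated score columns standing in for PCO scores (multivariate outcome)
Y <- cbind(
  ifelse(pos <= 7, pos, 7) + rnorm(20, sd = 0.1),
  ifelse(pos > 13, pos - 13, 0) + rnorm(20, sd = 0.1)
)

# Continuous model: intercept, slope, and one hinge term per breakpoint,
# so the fitted lines join at each breakpoint
design_cont <- function(pos, bps) {
  hinges <- vapply(bps, function(b) pmax(pos - b, 0), numeric(length(pos)))
  cbind(1, pos, hinges)
}

# Discontinuous model: a separate intercept and slope for each region
design_disc <- function(pos, bps) {
  region <- findInterval(pos, bps + 1) + 1   # region index 1..(length(bps) + 1)
  k <- length(bps) + 1
  ind <- vapply(seq_len(k), function(r) as.numeric(region == r),
                numeric(length(pos)))
  cbind(ind, ind * pos)
}

# Residual sum of squares across all outcome columns
rss <- function(X, Y) sum(.lm.fit(X, Y)$residuals^2)

rss_cont <- rss(design_cont(pos, bps), Y)
rss_disc <- rss(design_disc(pos, bps), Y)
c(continuous = rss_cont, discontinuous = rss_disc)
```

For a fixed set of breakpoints, the continuous model in this sketch is a constrained version of the discontinuous one, so its RSS cannot be smaller.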
When exhaus = FALSE, heuristics are used to reduce the number of models to fit, which can be useful for keeping the size of the resulting object down by avoiding fitting models corresponding to breakpoint combinations that yield a poor fit to the data. Only breakpoint combinations that correspond to the breakpoints of the best fitting model with a smaller number of regions +/- 3 vertebrae are used, and only models that have an RSS smaller than half a standard deviation more than the smallest RSS are kept.
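The two filters described above can be pictured with a small sketch. All values are made up for illustration, and the reading of the RSS rule (the standard deviation taken over the RSS values of the models being compared) is an assumption; the package may compute these quantities differently.

```r
# Hypothetical illustration of the two filters; not the package's code.
best_prev <- c(8, 15)   # assumed best breakpoints from a model with fewer regions
window <- 3

# Positions within +/- 3 vertebrae of each previous best breakpoint
near_prev <- sort(unique(unlist(
  lapply(best_prev, function(b) (b - window):(b + window))
)))

# RSS filter: keep models whose RSS is below min(RSS) + 0.5 * sd(RSS)
rss <- c(1.20, 1.25, 1.90, 3.40, 1.22)   # made-up RSS values for five models
keep <- rss < min(rss) + 0.5 * sd(rss)
which(keep)
```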
addregions() should be used on an existing regions_results object to add models with more regions. Internally, it works just the same as calcregions().
ncombos() computes an upper bound on the number of possible breakpoint combinations. When exhaus = FALSE or includebp is specified, the actual number of combinations will be smaller than that produced by ncombos().
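To give a feel for how quickly the number of combinations grows, the sketch below counts breakpoint combinations by brute force and with a closed-form expression. It assumes one particular reading of minvert: every region, including the first and last, contains at least minvert vertebrae, and a breakpoint is the last vertebra of its region. The function names are hypothetical, and ncombos() itself may count differently (for example, when omitbp is supplied).

```r
# Sketch under the stated assumptions; not the implementation of ncombos().
enumerate_bps <- function(n, noregions, minvert = 3) {
  nbp <- noregions - 1
  if (nbp == 0) return(matrix(integer(0), nrow = 1, ncol = 0))
  candidates <- seq_len(n - 1)
  if (length(candidates) < nbp) return(matrix(integer(0), nrow = 0, ncol = nbp))
  combos <- t(combn(candidates, nbp))
  # Keep combinations where every implied region has at least minvert vertebrae
  ok <- apply(combos, 1, function(b) all(diff(c(0, b, n)) >= minvert))
  combos[ok, , drop = FALSE]
}

# Closed form: distribute the n - noregions * minvert "spare" vertebrae
# among the regions (a stars-and-bars count)
count_bps <- function(n, noregions, minvert = 3) {
  slack <- n - noregions * minvert
  if (slack < 0) return(0)
  choose(slack + noregions - 1, noregions - 1)
}

n_vert <- 12
brute  <- vapply(1:5, function(k) nrow(enumerate_bps(n_vert, k)), integer(1))
closed <- vapply(1:5, function(k) count_bps(n_vert, k), numeric(1))
rbind(brute, closed)
```

With 12 vertebrae and minvert = 3, both approaches give 1, 7, 10, 1, and 0 combinations for 1 to 5 regions.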
When the number of possible combinations of breakpoints for a given number of regions (as computed by ncombos()) is larger than ncombos_file_trigger, the problem will be split into smaller problems, with the results of each stored in temporary files that are deleted when the function completes. These temporary files will be stored in the directory supplied to temp_file_dir. By default, this is the temporary directory produced by tempdir(). However, this directory can be deleted by R at any time without warning, which will cause the function to crash, so it is a good idea to supply your own directory that will be preserved. You can use ncombos() to check whether the number of breakpoint combinations exceeds ncombos_file_trigger.
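One way to follow this advice is sketched below. The directory name regions_tmp and the variable names are arbitrary choices for illustration; the MorphoRegions calls mirror those in the Examples section and only run if the package is installed.

```r
# A directory under the working directory that R will not clean up on its own;
# the name "regions_tmp" is an arbitrary choice for this sketch
my_dir <- file.path(getwd(), "regions_tmp")
dir.create(my_dir, showWarnings = FALSE, recursive = TRUE)

if (requireNamespace("MorphoRegions", quietly = TRUE)) {
  library(MorphoRegions)
  data("alligator")
  alligator_data <- process_measurements(alligator, pos = "Vertebra")
  alligator_PCO <- svdPCO(alligator_data)

  # Check whether any number of regions would exceed the trigger
  trigger <- 1e7
  n <- ncombos(alligator_PCO, noregions = 1:5, minvert = 3)
  print(any(n > trigger))   # TRUE would mean temporary files are used

  # Pass the persistent directory explicitly
  res <- calcregions(alligator_PCO, scores = 1:4, noregions = 5,
                     minvert = 3, ncombos_file_trigger = trigger,
                     temp_file_dir = my_dir, verbose = FALSE)
}
```

The directory is left in place afterwards; remove it manually once it is no longer needed.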
Value
A regions_results object with the following components:
- results - the results of the fitting process for each combination of breakpoints
- stats - statistics summarizing the fitting process. Use summary() to view this information in a clean format.
ncombos() returns a numeric vector with the number of breakpoint combinations for each number of regions (which are stored as the names).
See Also
calcmodel() to fit a segmented regression model for a single set of breakpoints; modelselect() to select the best model for each number of regions based on RSS; modelsupport() to compute statistics that describe the support of the best models; calcBPvar() to compute the variability in the optimal breakpoints.
Examples
data("alligator")
alligator_data <- process_measurements(alligator,
                                       pos = "Vertebra")
# Compute PCOs
alligator_PCO <- svdPCO(alligator_data)
# Fit segmented regression models for 1 to 5 regions
# using PCOs 1 to 4 and a continuous model with a
# non-exhaustive search
regionresults <- calcregions(alligator_PCO,
                             scores = 1:4,
                             noregions = 5,
                             minvert = 3,
                             cont = TRUE,
                             exhaus = FALSE,
                             verbose = FALSE)
regionresults
# View model fitting summary
summary(regionresults)
# Add additional regions to existing results,
# exhaustive search this time
regionresults <- addregions(regionresults,
                            noregions = 6:7,
                            exhaus = TRUE,
                            verbose = FALSE)
regionresults
summary(regionresults)
# Fit segmented regression models for 1 to 5 regions
# using PCOs 1 to 4 and a discontinuous model with an
# exhaustive search, excluding breakpoints at vertebrae
# 10 and 15
regionresults <- calcregions(alligator_PCO,
                             scores = 1:4,
                             noregions = 5,
                             minvert = 3,
                             cont = FALSE,
                             omitbp = c(10, 15),
                             verbose = FALSE)
regionresults
summary(regionresults)
# Compute the number of breakpoint combinations for the given
# specification using `ncombos()`; if any number exceeds
# the value supplied to `ncombos_file_trigger`, results
# will temporarily be stored in files before being read in and
# deleted.
ncombos(alligator_PCO,
        noregions = 1:8,
        minvert = 3)