simregions {MorphoRegions} | R Documentation |
Simulate regions data
Description
simregions()
simulates vertebrae and PCOs that satisfy certain constraints.
Usage
simregions(
nvert,
nregions,
nvar = 1,
r2 = 0.95,
minvert = 3,
cont = TRUE,
sl.dif = 0
)
## S3 method for class 'regions_sim'
plot(x, scores = 1, lines = TRUE, ...)
Arguments
nvert |
|
nregions |
|
nvar |
|
r2 |
|
minvert |
|
cont |
|
sl.dif |
|
x |
a |
scores |
|
lines |
|
... |
ignored. |
Details
simregions()
generates PCO scores for each requested vertebra such that certain conditions are met. The slopes for each region are drawn from a uniform distribution with limits of -.5 and .5. If a set of slopes contains two adjacent slopes that have a difference less than sl.dif
, it is rejected and a new one is drawn. The scaling of the PCOs is determined by the slopes and the requested R^2
. The PCOs will not necessarily be in order from most variable to least variable as they are in a traditional PCO analysis.
Intercepts (the intercept of the first region when cont = TRUE
and the intercept of all regions when cont = FALSE
) are drawn from a uniform distribution with limits of -n/4
and n/4
, where n
is the number of breakpoints, one less than nregions
. Intercepts other than the first when cont = TRUE
are determined by the slopes.
The cont
, r2
, and sl.dif
arguments control how easy it is for fitted segmented regression models to capture the true structure. When cont = TRUE
, it can be harder to determine exactly where regions begin and end, especially if sl.dif
is 0. When r2
is high, there is little variation around the true line, so the fitted lines will be more precise and region boundaries clearer. When sl.dif
is high, slopes of adjacent regions are different from each other, so it is easier to detect region boundaries. Setting sl.dif
to between .5 and 1 ensures that the slopes in adjacent regions have different signs.
Value
simregions()
returns a regions_sim
object, which contains the vertebra indices in the Xvar
entry, the PCO scores in the Yvar
entry, the simulated breakpoints in the BPs
entry, the simulated model coefficients in the coefs
entry, and the simulated error standard deviation in the ersd
entry. The attribute "design"
contains the design matrix, which when multiplied by the coefficients and added to a random normal variate with standard deviation equal to the error standard deviation yields the observed PCO scores.
plot()
returns a ggplot
object that can be manipulated using ggplot2
syntax. The plot is similar to that produced by plot.regions_pco()
and to that produced by plotsegreg()
except that the displayed lines (if requested) are the true rather than fitted regression lines.
See Also
calcregions()
for fitting segmented regression models to the simulated data; calcmodel()
for fitting a single segmented regression model to the simulated data; plotsegreg()
for plotting estimated regression lines.
Examples
# Simulate 40 vertebra, 4 regions (3 breakpoints), 3 PCOs,
# true model R2 of .9, continuous
set.seed(11)
sim <- simregions(nvert = 30, nregions = 4, nvar = 3, r2 = .95,
minvert = 3, cont = TRUE)
sim
# Plot the true data-generating lines and breakpoints
plot(sim, scores = 1:3)
# Run segmented regression models on simulated data,
# up to 6 regions
simresults <- calcregions(sim, scores = 1:3, noregions = 6,
minvert = 3, cont = TRUE,
verbose = FALSE)
summary(simresults)
# Select best model for each number of regions
(simmodels <- modelselect(simresults))
# Evaluate support for each model and rank models
(simsupp <- modelsupport(simmodels))
# AICc supports 3-4 regions
# Evaluate model performance of best model
modelperf(sim, modelsupport = simsupp,
criterion = "aic", model = 1)
# Second best model (3 regions) does quite well, too
modelperf(sim, modelsupport = simsupp,
criterion = "aic", model = 2)
#Plot best model fit
plotsegreg(sim, scores = 1:3,
modelsupport = simsupp,
criterion = "aic", model = 1)
# Calculate variability of estimate breakpoints for
# 3-region model; high uncertainty for breakpoints
# 1 and 2. Note weighted value for breakpoint 2
# differs from that of best model
bpvar <- calcBPvar(simresults, noregions = 4,
pct = .05, criterion = "aic")
bpvar
# Plot estimated vertebral map with variability
plotvertmap(sim, modelsupport = simsupp, model = 1,
criterion = "aic", text = TRUE)
# True map; pretty close
plotvertmap(sim, bps = c(3, 7, 24),
text = TRUE)