dc_CA {douconca} | R Documentation |
Double constrained correspondence analysis (dc-CA) for analyzing
(multi-)trait (multi-)environment ecological data using library vegan
and native R code. It has a formula interface which allows one to assess,
for example, the importance of trait interactions in shaping ecological
communities. The function dc_CA has an option to divide the abundance
data of a site by the site total, giving equal site weights. This division
has the advantage that the multivariate analysis corresponds with an
unweighted (multi-trait) community-level analysis rather than a weighted
one (Kleyer et al. 2012).
dc_CA(
  formulaEnv = NULL,
  formulaTraits = NULL,
  response = NULL,
  dataEnv = NULL,
  dataTraits = NULL,
  divideBySiteTotals = TRUE,
  dc_CA_object = NULL,
  verbose = TRUE
)
formulaEnv | formula or one-sided formula for the rows (samples) with row predictors in |
formulaTraits | formula or one-sided formula for the columns (species) with column predictors in |
response | matrix, data frame of the abundance data (dimension n x m) or list with community weighted means (CWMs) from |
dataEnv | matrix or data frame of the row predictors, with rows corresponding to those in |
dataTraits | matrix or data frame of the column predictors, with rows corresponding to the columns in |
divideBySiteTotals | logical; default |
dc_CA_object | optional object from an earlier run of this function. Useful if the same formula for the columns ( |
verbose | logical for printing a simple summary (default: TRUE) |
Empty (all zero) rows and columns in response are removed from the
response, together with the corresponding rows of dataEnv and dataTraits.
Subsequently, any columns with missing values are removed from dataEnv
and dataTraits. As a consequence, the function gives an error
('name_of_variable' not found) if variables with missing entries are
specified in formulaEnv or formulaTraits.
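The clean-up described above can be sketched in base R. This is a minimal illustration on made-up data, not the package's internal code; the objects Y, env and traits and their contents are hypothetical.

```r
# Toy abundance table: 4 sites (rows) x 3 species (columns). All names are hypothetical.
Y <- matrix(c(2, 0, 1,
              0, 0, 0,   # site2 is empty
              1, 0, 3,
              4, 0, 0),  # spB (second column) is empty throughout
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("site", 1:4), c("spA", "spB", "spC")))
env <- data.frame(moist = c(1, 2, 3, 4), ph = c(5.1, 5.8, NA, 5.5),
                  row.names = rownames(Y))
traits <- data.frame(sla = c(10, 20, 30), height = c(0.5, NA, 1.2),
                     row.names = colnames(Y))

# Step 1: drop empty rows and columns, and the matching predictor rows.
keep_rows <- rowSums(Y) > 0
keep_cols <- colSums(Y) > 0
Y2 <- Y[keep_rows, keep_cols, drop = FALSE]
env2 <- env[keep_rows, , drop = FALSE]
traits2 <- traits[keep_cols, , drop = FALSE]

# Step 2: only then drop predictor columns that still contain missing values.
env2 <- env2[, colSums(is.na(env2)) == 0, drop = FALSE]
traits2 <- traits2[, colSums(is.na(traits2)) == 0, drop = FALSE]

print(dim(Y2))
print(names(env2))
print(names(traits2))
```

In this toy case the order of the two steps matters: height survives because its only missing value belongs to the empty species spB, whereas ph is dropped because its missing value belongs to a site that is kept. A formula that names ph would then fail with a "not found" error of the kind described above.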
Computationally, dc-CA can be carried out by a single singular value
decomposition (ter Braak et al. 2018), but it is here computed in two steps.
In the first step, the transpose of the response is regressed on to
the traits (the column predictors) using cca with formulaTraits. The column
scores of this analysis (in scaling 1) are community weighted means (CWMs)
of the orthonormalized traits. In the second step, these CWMs are
regressed on the environmental (row) predictors using wrda with formulaEnv
or, if site weights are equal, using rda.
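To make the quantities in this description concrete, the following base R sketch computes the closure (division by site totals), the resulting site and species weights, and community weighted means. It is an illustration on hypothetical toy data only: it uses raw trait values, whereas dc_CA works with orthonormalized traits, and it does not reproduce the cca and wrda/rda steps.

```r
# Toy data: 3 sites x 3 species and 2 traits. All names and values are hypothetical.
Y <- matrix(c(2, 0, 2,
              1, 3, 0,
              0, 5, 5),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), c("spA", "spB", "spC")))
traits <- cbind(sla = c(10, 20, 30), height = c(1, 2, 4))
rownames(traits) <- colnames(Y)

# Closure: divide each site's abundances by the site total (rows then sum to 1).
Yc <- prop.table(Y, margin = 1)

# Unit-sum weights implied by the closed table.
site_weights <- rowSums(Yc) / sum(Yc)      # equal for all sites after closure
species_weights <- colSums(Yc) / sum(Yc)

# Community weighted means: abundance-weighted average trait value per site.
CWM <- Yc %*% traits

print(site_weights)
print(species_weights)
print(CWM)
```

Because every row of the closed table sums to one, all sites receive the same weight, which is the situation in which the second step can use rda instead of wrda.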
A dc-CA can be carried out on what statisticians call the sufficient
statistics of the method. This is useful when the abundance data are not
available or cannot be made public in a paper attempting reproducible
research. In this case, response should be a list with as first element
community weighted means (CWMs) with respect to the traits, and the trait
data, and, optionally, further elements, for functions related to dc_CA.
The minimum is a list(CWM, weight = list(columns = species_weights))
with CWM a matrix or data.frame, but then formulaEnv, formulaTraits,
dataEnv and dataTraits must be specified in the call to dc_CA. The
function fCWM_SNC and its example show how to set the response for this
and help to create the response from abundance data in these non-standard
applications of dc-CA. Species and site weights, if not set in
response$weights, can be set by a variable weight in the data frames
dataTraits and dataEnv, respectively, but the formulas should then not
be "~.".
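As a sketch of the minimal list mentioned above, the following builds an object of the form list(CWM, weight = list(columns = species_weights)) from hypothetical published values. The numbers and names are made up, the element name weight follows the expression quoted in the text, and dc_CA is not called here; whether a real analysis needs further elements should be checked against fCWM_SNC and its example.

```r
# Hypothetical CWMs (3 sites x 2 traits), as they might be reported in a paper.
CWM <- matrix(c(20.0, 17.5, 25.0,    # trait "sla"
                2.50, 1.75, 3.00),   # trait "height"
              ncol = 2,
              dimnames = list(paste0("site", 1:3), c("sla", "height")))

# Hypothetical unit-sum species weights (one per species).
species_weights <- c(spA = 0.25, spB = 5 / 12, spC = 1 / 3)

# Minimal structure as quoted in the text: CWM first, then the column weights.
response <- list(CWM, weight = list(columns = species_weights))

str(response)
```

With such a response, the text above requires formulaEnv, formulaTraits, dataEnv and dataTraits to be given explicitly in the call.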
The statistics and scores in the example dune_dcCA.r have been
checked against the results in Canoco 5.15 (ter Braak & Šmilauer, 2018).
A list of class dcca; that is a list with elements
a cca.object from the cca analysis of the transpose of the closed response using formula formulaTraits.
the argument formulaTraits. If the formula was "~.", it was changed to explicit trait names.
a list of Y, dataEnv and dataTraits, after removing empty rows and columns in response and after closure if divideBySiteTotals = TRUE and with the corresponding rows in dataEnv and dataTraits removed.
a list of unit-sum weights of rows and columns. The names of the list are c("row", "columns"), in that order.
number of sites (rows).
Community weighted means w.r.t. orthonormalized traits.
a wrda object or cca.object from the wrda or, if with equal row weights, rda analysis, respectively, of the column scores of the cca, which are the CWMs of orthonormalized traits, using formula formulaEnv.
the argument formulaEnv. If the formula was "~.", it was changed to explicit environmental variable names.
the dc-CA eigenvalues (same as those of the rda analysis).
mean, sd, VIF and (regression) coefficients of the traits that define the dc-CA axes in terms of the traits, with missing t-ratios indicated by NAs for 'tval1'.
a one-column matrix with the following inertias (weighted variances):
  total: the total inertia.
  conditionT: the inertia explained by the condition in formulaTraits, if present (neglecting row constraints).
  traits_explain: the inertia explained by the traits (neglecting the row predictors and any condition in formulaTraits). This is the maximum that the row predictors could explain in dc-CA (the sum of the following two items is thus less than this value).
  conditionE: the trait-constrained inertia explained by the condition in formulaEnv.
  constraintsTE: the trait-constrained inertia explained by the predictors (without the row covariates).
If verbose is TRUE (or after out <- print(out) is invoked) there are the
following additional items.
c_traits_normed: mean, sd, VIF and (regression) coefficients of the traits that define the dc-CA trait axes (composite traits), and their optimistic t-ratio.
c_env_normed: mean, sd, VIF and (regression) coefficients of the environmental variables that define the dc-CA axes in terms of the environmental variables (composite gradients), and their optimistic t-ratio.
species_axes: a list with four items
  species_scores: a list with names c("species_scores_unconstrained", "lc_traits_scores") with the matrix with species niche centroids (SNCs) along the dc-CA axes (composite gradients) and the matrix with linear combinations of traits.
  correlation: a matrix with inter-set correlations of the traits with their SNCs.
  b_se: a matrix with (unstandardized) regression coefficients for traits and their optimistic standard errors.
  R2_traits: a vector with the coefficients of determination (R2) of the SNCs on to the traits. The square root thereof could be called the species-trait correlation in analogy with the species-environment correlation in CCA.
sites_axes: a list with four items
  site_scores: a list with names c("site_scores_unconstrained", "lc_env_scores") with the matrix with community weighted means (CWMs) along the dc-CA axes (composite gradients) and the matrix with linear combinations of environmental variables.
  correlation: a matrix with inter-set correlations of the environmental variables with their CWMs.
  b_se: a matrix with (unstandardized) regression coefficients for environmental variables and their optimistic standard errors.
  R2_env: a vector with the coefficients of determination (R2) of the CWMs on to the environmental variables. The square root thereof has been called the species-environment correlation in CCA.
All scores in the dcca object are in scaling "sites" (1):
the scaling with focus on case distances.
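The relation between R2_env and the species-environment correlation described above can be illustrated with an ordinary weighted regression in base R. This is a generic sketch with made-up numbers; the vectors cwm_axis1 and site_weights and the data frame env are hypothetical and are not taken from a dcca object.

```r
# Hypothetical CWM scores of 5 sites on one axis and two environmental variables.
cwm_axis1 <- c(-1.2, -0.4, 0.1, 0.6, 1.3)
env <- data.frame(moist = c(1, 2, 2, 4, 5),
                  manure = c(0, 1, 3, 2, 4))
site_weights <- rep(1 / 5, 5)   # equal site weights, as after division by site totals

# Regress the axis scores on the environmental variables.
fit <- lm(cwm_axis1 ~ moist + manure, data = env, weights = site_weights)

R2 <- summary(fit)$r.squared    # coefficient of determination
r_se <- sqrt(R2)                # analogue of the species-environment correlation

print(c(R2 = R2, correlation = r_se))
```

With equal weights, the square root of R2 equals the correlation between the fitted values (a linear combination of the environmental variables) and the scores themselves, which is why it can be read as a correlation.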
Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. doi:10.1111/j.1654-1103.2012.01402.x
ter Braak, C.J.F., Šmilauer, P. & Dray, S. (2018) Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. doi:10.1007/s10651-017-0395-x
ter Braak, C.J.F. & Šmilauer, P. (2018) Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
Oksanen, J., et al. (2024) vegan: Community Ecology Package. R package version 2.6-6.1. https://CRAN.R-project.org/package=vegan.
plot.dcca, scores.dcca, print.dcca and anova.dcca
library(douconca)
data("dune_trait_env")
# rownames are carried forward in results
rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites
mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure,
             formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan,
             response = dune_trait_env$comm[, -1], # must delete "Sites"
             dataEnv = dune_trait_env$envir,
             dataTraits = dune_trait_env$traits)
anova(mod, by = "axis")
# For more demo on testing, see demo dune_test.r
mod_scores <- scores(mod)
# correlation of axes with a variable that is not in the model
scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot"))
cat("head of unconstrained site scores, with meaning\n")
print(head(mod_scores$sites))
mod_scores_tidy <- scores(mod, tidy = TRUE)
print("names of the tidy scores")
print(names(mod_scores_tidy))
cat("\nThe levels of the tidy scores\n")
print(levels(mod_scores_tidy$score))
cat("\nFor illustration: a dc-CA model with a trait covariate\n")
mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure,
              formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
              response = dune_trait_env$comm[, -1], # must delete "Sites"
              dataEnv = dune_trait_env$envir,
              dataTraits = dune_trait_env$traits)
cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n")
mod3 <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag),
              formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
              response = dune_trait_env$comm[, -1], # must delete "Sites"
              dataEnv = dune_trait_env$envir,
              dataTraits = dune_trait_env$traits, verbose = FALSE)
cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ",
    "as the trait model and data did not change\n")
mod3B <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag),
               dataEnv = dune_trait_env$envir,
               dc_CA_object = mod2, verbose = FALSE)
cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n",
    "the expected difference is in the component 'call'\n ")
print(all.equal(mod3, mod3B)) # only the component call differs