dc_CA {douconca} | R Documentation |
Performs (weighted) double constrained correspondence analysis (dc-CA)
Description
Double constrained correspondence analysis (dc-CA) for analyzing
(multi-)trait (multi-)environment ecological data using library vegan
and native R code. It has a formula interface that makes it possible to
assess, for example, the importance of trait interactions in shaping
ecological communities. The function dc_CA has an option to divide the
abundance data of a site by the site total, giving equal site weights. This
division has the advantage that the multivariate analysis corresponds with an
unweighted (multi-trait) community-level analysis, instead of one weighted by
site totals (Kleyer et al. 2012).
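As a minimal illustration of what dividing by site totals does to the site weights, the sketch below uses base R only. The abundance values are made up for illustration, and the code is not taken from dc_CA itself; it only shows the arithmetic of the division.

```r
# Toy abundance table: 3 sites (rows) x 4 species (columns); values are made up.
Y <- matrix(c(10,  0,  5,  5,
               1,  1,  0,  2,
               0, 30, 30, 40),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

# Without division, unit-sum site weights are proportional to the site totals.
w_raw <- rowSums(Y) / sum(Y)

# Dividing each row by its total makes every site total equal to 1,
# so all sites receive the same weight.
Y_closed <- Y / rowSums(Y)
w_closed <- rowSums(Y_closed) / sum(Y_closed)

print(w_raw)
print(w_closed)
```

In the raw table the third site dominates the weights; after division each site counts equally.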
Usage
dc_CA(
  formulaEnv = NULL,
  formulaTraits = NULL,
  response = NULL,
  dataEnv = NULL,
  dataTraits = NULL,
  divideBySiteTotals = TRUE,
  dc_CA_object = NULL,
  verbose = TRUE
)
Arguments
formulaEnv |
formula or one-sided formula for the rows (samples) with
row predictors in |
formulaTraits |
formula or one-sided formula for the columns (species)
with column predictors in |
response |
matrix or data frame of the abundance data
(dimension n x m) or list with community weighted means (CWMs)
from |
dataEnv |
matrix or data frame of the row predictors, with rows
corresponding to those in |
dataTraits |
matrix or data frame of the column predictors, with rows
corresponding to the columns in |
divideBySiteTotals |
logical; default |
dc_CA_object |
optional object from an earlier run of this function.
Useful if the same formula for the columns ( |
verbose |
logical for printing a simple summary (default: TRUE) |
Details
Empty (all zero) rows and columns in response are removed from the response,
together with the corresponding rows from dataEnv and dataTraits.
Subsequently, any columns with missing values are removed from dataEnv and
dataTraits. The function gives an error ('name_of_variable' not found) if
variables with missing entries are specified in formulaEnv or formulaTraits.
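The cleaning steps described above can be sketched in base R as follows. This is an illustration under assumptions, not the code used inside dc_CA: the toy data and the object names (Y2, env2, traits2) are hypothetical.

```r
# Toy data: site2 is empty (all zero) and species sp2 never occurs.
Y <- matrix(c(2, 0, 1, 0,
              0, 0, 0, 0,
              3, 0, 0, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))
env <- data.frame(Moist = c(1, 2, 5), Depth = c(0.3, 0.4, NA),
                  row.names = rownames(Y))
traits <- data.frame(SLA = c(20, 25, NA, 18), Height = c(0.2, 0.5, 0.3, 1.1),
                     row.names = colnames(Y))

# Step 1: drop empty rows and columns, and the matching predictor rows.
keep_rows <- rowSums(Y) > 0
keep_cols <- colSums(Y) > 0
Y2      <- Y[keep_rows, keep_cols, drop = FALSE]
env2    <- env[keep_rows, , drop = FALSE]
traits2 <- traits[keep_cols, , drop = FALSE]

# Step 2: drop predictor columns that still contain missing values.
env2    <- env2[, !vapply(env2, anyNA, logical(1)), drop = FALSE]
traits2 <- traits2[, !vapply(traits2, anyNA, logical(1)), drop = FALSE]

print(dim(Y2))
print(names(env2))
print(names(traits2))
```

After these steps a formula that still refers to Depth or SLA can no longer find the variable, which is consistent with the 'not found' error mentioned above.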
Computationally, dc-CA can be carried out by a single singular value
decomposition (ter Braak et al. 2018), but it is here computed in two steps.
In the first step, the transpose of the response is regressed on to the
traits (the column predictors) using cca with formulaTraits. The column
scores of this analysis (in scaling 1) are community weighted means (CWM) of
the orthonormalized traits. In the second step, these are regressed on the
environmental (row) predictors using wrda with formulaEnv or, if site weights
are equal, using rda.
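The relation between the two-step route and the single singular value decomposition can be sketched with plain linear algebra. The code below is a simplified illustration under assumptions: it uses base R instead of cca, wrda or rda, it works on toy random data without dividing by site totals, it ignores covariates, and the helper w_orthonormalize is hypothetical. It is not the implementation used in dc_CA.

```r
set.seed(1)
# Toy data: 6 sites x 5 species, 2 traits, 2 environmental variables.
Y      <- matrix(rpois(6 * 5, lambda = 3) + 1, nrow = 6, ncol = 5)
traits <- cbind(t1 = rnorm(5), t2 = rnorm(5))
env    <- cbind(e1 = rnorm(6), e2 = rnorm(6))

P <- Y / sum(Y)      # proportions summing to 1
R <- rowSums(P)      # unit-sum site weights
K <- colSums(P)      # unit-sum species weights

# Hypothetical helper: centre X with weights w, then orthonormalize it
# so that t(X) %*% diag(w) %*% X is the identity matrix.
w_orthonormalize <- function(X, w) {
  Xc <- sweep(X, 2, colSums(X * w))   # subtract weighted column means
  Q  <- qr.Q(qr(sqrt(w) * Xc))        # orthonormal basis of the weighted data
  Q / sqrt(w)
}
Tstar <- w_orthonormalize(traits, K)
Estar <- w_orthonormalize(env, R)

# Step 1: community weighted means of the orthonormalized traits.
CWM <- (P / R) %*% Tstar

# Step 2: weighted regression of the CWMs on the environmental predictors.
B      <- crossprod(Estar, R * CWM)   # t(Estar) %*% diag(R) %*% CWM
fitted <- Estar %*% B
eig_two_step <- svd(sqrt(R) * fitted)$d^2

# Single SVD of the doubly constrained cross-product matrix.
eig_one_step <- svd(crossprod(Estar, P %*% Tstar))$d^2

print(all.equal(eig_two_step, eig_one_step))
```

In this sketch both routes give the same eigenvalues, because the weighted fitted values of step 2 are built from the same cross-product matrix that the single decomposition factorizes.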
A dc-CA can be carried out on what statisticians call the sufficient
statistics of the method. This is useful when the abundance data are not
available or cannot be made public in a paper attempting reproducible
research. In this case, response should be a list with as first element
community weighted means (CWMs) with respect to the traits, and the trait
data, and, optionally, further elements, for functions related to dc_CA. The
minimum is a list(CWM, weight = list(columns = species_weights)) with CWM a
matrix or data.frame, but then formulaEnv, formulaTraits, dataEnv and
dataTraits must be specified in the call to dc_CA. The function fCWM_SNC and
its example show how to set the response for this and help to create the
response from abundance data in these non-standard applications of dc-CA.
Species and site weights, if not set in response$weights, can be set by a
variable weight in the data frames dataTraits and dataEnv, respectively, but
the formulas should then not be of the form ~. (all variables).
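The minimal list described above can be assembled by hand, as in the base R sketch below. The toy data are made up, and computing the CWMs from abundances divided by site totals is an assumption made for illustration; the function fCWM_SNC remains the documented way to build this list, and the sketch does not call dc_CA.

```r
# Toy abundance table (3 sites x 4 species) and two numeric traits; values are made up.
Y <- matrix(c(10,  0,  5,  5,
               1,  1,  0,  2,
               0, 30, 30, 40),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))
traits <- cbind(SLA    = c(20, 25, 22, 18),
                Height = c(0.2, 0.5, 0.3, 1.1))
rownames(traits) <- colnames(Y)

# Assumption: abundances are divided by site totals before averaging.
Y_closed <- Y / rowSums(Y)

# Community weighted means: per site, the abundance-weighted mean of each trait.
CWM <- Y_closed %*% traits

# Unit-sum species weights derived from the same closed table.
species_weights <- colSums(Y_closed) / sum(Y_closed)

# Minimal response list in the form given in the text above.
response <- list(CWM, weight = list(columns = species_weights))
str(response)
```

With such a list as response, the text above states that formulaEnv, formulaTraits, dataEnv and dataTraits must all be supplied in the call to dc_CA.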
The statistics and scores in the example dune_dcCA.r have been checked
against the results in Canoco 5.15 (ter Braak & Šmilauer, 2018).
Value
A list of class dcca; that is, a list with elements
- CCAonTraits: a cca.object from the cca analysis of the transpose of the closed response using formula formulaTraits.
- formulaTraits: the argument formulaTraits. If the formula was ~., it was changed to explicit trait names.
- data: a list of Y, dataEnv and dataTraits, after removing empty rows and columns in response and after closure if divideBySiteTotals = TRUE and with the corresponding rows in dataEnv and dataTraits removed.
- weights: a list of unit-sum weights of rows and columns. The names of the list are c("row", "columns"), in that order.
- Nobs: number of sites (rows).
- CWMs_orthonormal_traits: community weighted means w.r.t. orthonormalized traits.
- RDAonEnv: a wrda object or cca.object from the wrda or, with equal row weights, rda analysis, respectively, of the column scores of the cca, which are the CWMs of orthonormalized traits, using formula formulaEnv.
- formulaEnv: the argument formulaEnv. If the formula was ~., it was changed to explicit environmental variable names.
- eigenvalues: the dc-CA eigenvalues (same as those of the rda analysis).
- c_traits_normed0: mean, sd, VIF and (regression) coefficients of the traits that define the dc-CA axes in terms of the traits, with missing t-ratios indicated by NAs for 'tval1'.
- inertia: a one-column matrix with the following inertias (weighted variances):
  - total: the total inertia.
  - conditionT: the inertia explained by the condition in formulaTraits if present (neglecting row constraints).
  - traits_explain: the inertia explained by the traits (neglecting the row predictors and any condition in formulaTraits). This is the maximum that the row predictors could explain in dc-CA (the sum of the following two items is thus less than this value).
  - conditionE: the trait-constrained inertia explained by the condition in formulaEnv.
  - constraintsTE: the trait-constrained inertia explained by the predictors (without the row covariates).
If verbose is TRUE (or after out <- print(out) is invoked) there are
additional items.
- c_traits_normed: mean, sd, VIF and (regression) coefficients of the traits that define the dc-CA trait axes (composite traits), and their optimistic t-ratio.
- c_env_normed: mean, sd, VIF and (regression) coefficients of the environmental variables that define the dc-CA axes in terms of the environmental variables (composite gradients), and their optimistic t-ratio.
- species_axes: a list with four items
  - species_scores: a list with names c("species_scores_unconstrained", "lc_traits_scores") with the matrix with species niche centroids along the dc-CA axes (composite gradients) and the matrix with linear combinations of traits.
  - correlation: a matrix with inter-set correlations of the traits with their SNCs.
  - b_se: a matrix with (unstandardized) regression coefficients for traits and their optimistic standard errors.
  - R2_traits: a vector with coefficient of determination (R2) of the SNCs on to the traits. The square-root thereof could be called the species-trait correlation in analogy with the species-environment correlation in CCA.
- sites_axes: a list with four items
  - site_scores: a list with names c("site_scores_unconstrained", "lc_env_scores") with the matrix with community weighted means (CWMs) along the dc-CA axes (composite gradients) and the matrix with linear combinations of environmental variables.
  - correlation: a matrix with inter-set correlations of the environmental variables with their CWMs.
  - b_se: a matrix with (unstandardized) regression coefficients for environmental variables and their optimistic standard errors.
  - R2_env: a vector with coefficient of determination (R2) of the CWMs on to the environmental variables. The square-root thereof has been called the species-environmental correlation in CCA.
All scores in the dcca object are in scaling "sites" (1): the scaling with
Focus on Case distances.
References
Kleyer, M., Dray, S., Bello, F., Lepš, J., Pakeman, R.J., Strauss, B., Thuiller, W. & Lavorel, S. (2012) Assessing species and community functional responses to environmental gradients: which multivariate methods? Journal of Vegetation Science, 23, 805-821. doi:10.1111/j.1654-1103.2012.01402.x
ter Braak, CJF, Šmilauer P, and Dray S. 2018. Algorithms and biplots for double constrained correspondence analysis. Environmental and Ecological Statistics, 25(2), 171-197. doi:10.1007/s10651-017-0395-x
ter Braak C.J.F. and P. Šmilauer (2018). Canoco reference manual and user's guide: software for ordination (version 5.1x). Microcomputer Power, Ithaca, USA, 536 pp.
Oksanen, J., et al. (2024) vegan: Community Ecology Package. R package version 2.6-6.1. https://CRAN.R-project.org/package=vegan.
See Also
plot.dcca, scores.dcca, print.dcca and anova.dcca
Examples
data("dune_trait_env")
# rownames are carried forward in results
rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites
mod <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure,
             formulaTraits = ~ SLA + Height + LDMC + Seedmass + Lifespan,
             response = dune_trait_env$comm[, -1],  # must delete "Sites"
             dataEnv = dune_trait_env$envir,
             dataTraits = dune_trait_env$traits)
anova(mod, by = "axis")
# For more demo on testing, see demo dune_test.r
mod_scores <- scores(mod)
# correlation of axes with a variable that is not in the model
scores(mod, display = "cor", scaling = "sym", which_cor = list(NULL, "X_lot"))
cat("head of unconstrained site scores, with meaning\n")
print(head(mod_scores$sites))
mod_scores_tidy <- scores(mod, tidy = TRUE)
print("names of the tidy scores")
print(names(mod_scores_tidy))
cat("\nThe levels of the tidy scores\n")
print(levels(mod_scores_tidy$score))
cat("\nFor illustration: a dc-CA model with a trait covariate\n")
mod2 <- dc_CA(formulaEnv = ~ A1 + Moist + Mag + Use + Manure,
              formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
              response = dune_trait_env$comm[, -1],  # must delete "Sites"
              dataEnv = dune_trait_env$envir,
              dataTraits = dune_trait_env$traits)
cat("\nFor illustration: a dc-CA model with both environmental and trait covariates\n")
mod3 <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag),
              formulaTraits = ~ SLA + Height + LDMC + Lifespan + Condition(Seedmass),
              response = dune_trait_env$comm[, -1],  # must delete "Sites"
              dataEnv = dune_trait_env$envir,
              dataTraits = dune_trait_env$traits, verbose = FALSE)
cat("\nFor illustration: same model but using dc_CA_object = mod2 for speed, ",
    "as the trait model and data did not change\n")
mod3B <- dc_CA(formulaEnv = ~ A1 + Moist + Use + Manure + Condition(Mag),
               dataEnv = dune_trait_env$envir,
               dc_CA_object = mod2, verbose = FALSE)
cat("\ncheck on equality of mod3 (from data) and mod3B (from a dc_CA_object)\n",
    "the expected difference is in the component 'call'\n ")
print(all.equal(mod3, mod3B)) # only the component call differs