Discussion:
[ORDNEWS:1835] CCA vs DistLM and autocorrelation among variables in a CCA
Sean Porter
2014-02-07 08:53:02 UTC
Dear Colleagues,



Two queries that I would really appreciate your thoughts on:



1. When it comes to correlating environmental data with multivariate
community data what is the preferred technique - canonical correspondence
analysis or distance-based linear modelling (DistLM in the PERMANOVA add-on
package to PRIMER)?



2. How important is it to remove one environmental variable from a
highly correlated pair of environmental variables before undertaking a
canonical correspondence analysis? Other methods such as DistLM require that
this is done. I read that including correlated variables in a CCA does not
compromise the analysis as the intra-set correlations are not affected (Ter
Braak 1986), but does need to be considered when interpreting the results
(Palmer 1993).





References:

Palmer, M. W. 1993. Putting things in even better order: the advantages of
canonical correspondence analysis. Ecology 74: 2215-2230.

Ter Braak, C. J. F. 1986. Canonical correspondence analysis: a new
eigenvector technique for multivariate direct gradient analysis. Ecology 67:
1167-1179.





Many thanks for your time!



Regards,



DR. SEAN PORTER

Scientist



South African Association for Marine Biological Research

Direct Tel: +27 (31) 328 8169 Fax: +27 (31) 328 8188

E-mail: sporter-***@public.gmane.org Web: www.saambr.org.za

1 King Shaka Avenue, Point, Durban 4001 KwaZulu-Natal South Africa

PO Box 10712, Marine Parade 4056 KwaZulu-Natal South Africa



Petr Šmilauer
2014-02-07 10:12:50 UTC
Dear Dr Porter,
Post by Sean Porter
1. When it comes to correlating environmental data with multivariate community data what is
the preferred technique - canonical correspondence analysis or distance-based linear modelling
(DistLM in the PERMANOVA add-on package to PRIMER)?
I do not think this question can be answered in a meaningful way. At the risk
of sounding rude: those who use constrained ordination (not just CCA)
obviously prefer it, and those who use the DistLM procedure have
a different preference.
Then there are also distance-based RDA, multivariate regression trees,
(generalized) linear mixed-effects models applied to community data,
and several other methods that relate community composition to environment:
all address similar or identical tasks, and each has specific assumptions and
features that make it preferred by some and a target of criticism for others.
Even if you ran a poll on frequency of use, you would not get the right
answer: it would be heavily biased by tradition in particular research
fields and geographical areas, and by available publications and software.
Post by Sean Porter
2. How important is it to remove one environmental variable from a highly
correlated pair of environmental variables before undertaking a canonical
correspondence analysis? Other methods such as DistLM require that this
is done. I read that including correlated variables in a CCA does not
compromise the analysis as the intra-set correlations are not affected (Ter Braak
1986), but does need to be considered when interpreting the results (Palmer 1993).
Constrained ordination methods (including CCA and RDA) are an extension of
regression models. If you have two explanatory variables A and B and they are
highly correlated, then their effects upon the response (community) data are
inevitably also highly correlated (i.e. identical to a large extent). This
affects some of the CCA/RDA outputs (canonical coefficients and their
t statistics), but not others (permutation test results for a test of the
joint effect of A and B, or the biplot scores of A and B, usually plotted in
ordination diagrams).
The question is whether using both variables brings you any advantage, and
my belief is that in most circumstances it does not.
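The contrast between unstable coefficients and a stable joint fit can be illustrated with a minimal sketch in Python with NumPy. It uses ordinary multivariate least squares as a stand-in for the regression step of RDA (not CCA itself, and not Canoco output). The simulated data, the species responses and the helper names `explained_fraction` and `coef_variance_factor` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
A = rng.normal(size=n)
B = A + 0.05 * rng.normal(size=n)          # B is almost a copy of A
# Hypothetical community table: 5 species responding linearly to A, plus noise
species_response = np.array([1.0, -0.8, 0.6, 0.4, -0.3])
Y = np.outer(A, species_response) + 0.5 * rng.normal(size=(n, 5))


def centre(M):
    """Subtract column means."""
    return M - M.mean(axis=0)


def explained_fraction(X, Y):
    """Fraction of the total variation in Y explained by regression on X."""
    Xc, Yc = centre(X), centre(Y)
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ coef
    return (fitted ** 2).sum() / (Yc ** 2).sum()


def coef_variance_factor(X, j=0):
    """Element (j, j) of (X'X)^-1: variance of coefficient j up to the residual variance."""
    Xc = centre(X)
    return np.linalg.inv(Xc.T @ Xc)[j, j]


X_a = A[:, None]
X_ab = np.column_stack([A, B])

r2_a = explained_fraction(X_a, Y)           # A alone
r2_ab = explained_fraction(X_ab, Y)         # A and B jointly
# How much the variance of A's coefficient grows once B is added
inflation = coef_variance_factor(X_ab) / coef_variance_factor(X_a)
r = np.corrcoef(A, B)[0, 1]

print(f"explained by A: {r2_a:.3f}, by A and B: {r2_ab:.3f}")
print(f"variance inflation of A's coefficient: {inflation:.0f}")
print(f"1 / (1 - r^2): {1 / (1 - r ** 2):.0f}")
```

Under these assumptions the jointly explained variation barely changes when B is added, while the variance of the coefficient of A is inflated by the factor 1 / (1 - r^2), which is why the individual coefficients and their t statistics become hard to interpret.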
If you want to illustrate that, say, A is a good predictor and B would be as good
in its place, I would prefer to do a constrained ordination using just A and then
project B into the resulting ordination space as a supplementary variable
(to use Canoco 5 terminology, with due apologies) ;-)
Otherwise, if A and B are highly, yet not perfectly, correlated,
you get an ordination space with two constrained axes, of which only the first
is important, while the second explains a negligible amount of variation. I would
therefore prefer the former solution, which in standard ordination diagrams
gives me one constrained axis, while the second axis represents the residual
variation, which I can explore to find interesting patterns in the community
not related to either A or B.
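A rough numerical sketch of the two alternatives, again in Python with NumPy and under stated assumptions: the constrained step is approximated by an RDA-like procedure (least-squares regression followed by a singular value decomposition of the fitted values), the data are simulated, and the function `constrained_axes` is a hypothetical helper rather than anything provided by Canoco or vegan. The passive projection of B is reduced here to its correlation with the sample scores on the single constrained axis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
A = rng.normal(size=n)
B = A + 0.05 * rng.normal(size=n)          # highly, but not perfectly, correlated with A
species_response = np.array([1.0, -0.8, 0.6, 0.4, -0.3])   # hypothetical responses
Y = np.outer(A, species_response) + 0.5 * rng.normal(size=(n, 5))


def centre(M):
    """Subtract column means."""
    return M - M.mean(axis=0)


def constrained_axes(X, Y):
    """RDA-like step: regress Y on X, then take principal axes of the fitted values.

    Returns the constrained eigenvalues as fractions of the total variation
    and the sample scores on the constrained axes.
    """
    Xc, Yc = centre(X), centre(Y)
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ coef
    U, s, _ = np.linalg.svd(fitted, full_matrices=False)
    k = X.shape[1]                          # at most one constrained axis per predictor
    eig = s[:k] ** 2 / (Yc ** 2).sum()
    return eig, U[:, :k] * s[:k]


# Option 1: both A and B as constraints gives two constrained axes
eig_ab, _ = constrained_axes(np.column_stack([A, B]), Y)

# Option 2: A alone as a constraint gives a single constrained axis
eig_a, scores_a = constrained_axes(A[:, None], Y)

# B projected passively: its correlation with the scores on that axis
r_b = np.corrcoef(B, scores_a[:, 0])[0, 1]

print("constrained eigenvalues with A and B:", np.round(eig_ab, 4))
print("constrained eigenvalue with A only:  ", np.round(eig_a, 4))
print(f"correlation of B with the A-only axis: {r_b:.3f}")
```

In this simulated setting the second constrained axis of the A-and-B model carries only a small share of the variation, and B, although not used as a constraint, is still strongly correlated (up to sign) with the axis constrained by A alone.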

Best regards from Petr
-------
Petr Smilauer
Ceske Budejovice, CZ
---------------------------------------------------------------
Canoco 5 http://www.canoco5.com and http://www.canoco.com
International course Multivariate Analysis of Ecological Data
(February 2014): http://regent.jcu.cz
Book by course lecturers:
"Multivariate analysis of ecological data using Canoco 5"
at <http://www.cambridge.org/9781107694408>
Martin Weiser
2014-02-07 13:35:34 UTC
Dear Dr. Porter,
Post by Sean Porter
1. When it comes to correlating environmental data with
multivariate community data what is the preferred technique –
canonical correspondence analysis or distance-based linear modelling
(DistLM in the PERMANOVA add-on package to PRIMER) ?
I know nothing (yet) about DistLM, but I am not sure whether a direct
ordination technique (like RDA or CCA) is what you are looking for:
these are analogous to regression (they "distort" the multivariate space
according to the given predictors), not correlation. For simply "looking at
the data" (without an a priori hypothesis), I suggest an indirect ordination
with "passive projection" of the environmental variables
(supplementary variables in CANOCO terms, envfit in vegan terms).
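The idea of an indirect ordination followed by passive projection can be sketched in Python with NumPy. This is only loosely analogous to what envfit does in vegan and is not a reimplementation of it: the ordination is a plain PCA, the data are simulated, no permutation test is run, and `passive_fit` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
gradient = rng.normal(size=n)               # hypothetical environmental variable
species_response = np.array([1.0, -0.8, 0.6, 0.4, -0.3, 0.0])
Y = np.outer(gradient, species_response) + 0.3 * rng.normal(size=(n, 6))

# Indirect (unconstrained) ordination: PCA of the centred community table.
# The environmental variable plays no role in building the axes.
Yc = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
site_scores = U[:, :2] * s[:2]              # sample scores on the first two axes


def passive_fit(env, scores):
    """Regress an environmental variable on ordination axes that are already fixed.

    Returns a unit vector giving the direction of the variable in the
    ordination plane and the R^2 of the fit.
    """
    e = env - env.mean()
    coef, *_ = np.linalg.lstsq(scores, e, rcond=None)
    fitted = scores @ coef
    r2 = (fitted ** 2).sum() / (e ** 2).sum()
    return coef / np.linalg.norm(coef), r2


direction, r2 = passive_fit(gradient, site_scores)
print("direction in ordination plane:", np.round(direction, 3))
print(f"R^2 of the passive fit: {r2:.3f}")
```

Because the axes are computed from the community data alone, the environmental variable only describes the ordination afterwards and does not shape it, which is the difference from RDA or CCA noted above.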

HTH.

Best,
Martin Weiser
