4.3. Principal Coordinates Analysis
- The Shepard diagram and stress values can be obtained from:
Overview
0. Intro
Methods for scientific research
Follow along
Required packages
Intro
Ordination in reduced space
4, 5, 6 + Dimensions
Famous 1D Example
2 Dimensions
3 Dimensions
Ordination in reduced space
> install.packages("vegan")
> install.packages("gclus")
> install.packages("ape")
> require(vegan)
> require(gclus)
> require(ape)
> source(file.choose()) #coldiss.R
Ordination?
...
All the cool kids are doing it, what does it mean?
Transformation / Distance
- Have you ever "ordinated"?
- What kind of data?
- Community composition?
- Genetic?
- What software?
- Canoco?
- R? vegan? ecodist? ape?
Recommendations:
- create your own new script
- refer to provided code only if needed
- avoid copy pasting or running the code directly from script
Data and R code available at:
qcbs.ca/wiki/r/r_workshop9
Navigate to the wiki page and download:
- Rscript
- DoubsEnv.csv
- DoubsSpe.csv
- coldiss.R
- Matrix algebra is complex and hard to understand
- A global understanding is enough in order to use ordination methods adequately
# Remember the directory!
What if we are interested in this response for different species of algae involved in the algal bloom density?
1. Exploring Data
2.2 Transforming community data
decostand()
Doubs Environmental Data
Doubs River Fish Dataset
Explore Doubs Dataset
Species Frequencies
Total Species Richness
Species Frequencies
Transforming your community data - considerations
Transforming your community data - examples
Understand your data to choose the appropriate transformation and distance
Visualize how many species are present at each site:
Explore the content of the environmental data:
> ?decostand
Load the Doubs River species data (DoubsSpe.csv):
Take a look at the distribution of species frequencies:
- Transforming counts into presence-absence
Explore the content of the fish community dataset:
> sum(spe==0)
> spe <- read.csv(file.choose(), row.names=1)
> spe <- spe[-8,] #remove site with no data
> site.pres <- rowSums(spe>0)
> barplot(site.pres, main="Species richness",
las=1, xlab="Sites",
ylab="Number of species", col="grey")
Standardizing environmental variables is crucial as you cannot compare the effects of variables with different units:
> (ab=table(unlist(spe)))
> barplot(ab, las=1, xlab=”Abundance class”,
ylab=”Frequency”, col=grey(5:0/5))
Are there many zeros ?
What do they mean ?
> spe.pa <- decostand(spe, method="pa")
It is important that you understand the strengths and limitations of your community data prior to computing ordinations, and standardize and/or transform accordingly:
> names(env) #names of objects
> dim(env) #dimensions
> str(env) #structure of objects
> head(env) #first 5 rows
> summary(env) #summary statistics
> env.z <- decostand(env, method="standardize")
- Reducing the weight of rare species:
Load the Doubs River environmental data (DoubsEnv.csv):
> names(spe) #names of objects
> dim(spe) #dimensions
> str(spe) #structure of objects
> head(spe) #first 5 rows
> summary(spe) #summary statistics
What proportion of zeros?
> spe.hel <- decostand(spe, method="hellinger")
> spe.chi <- decostand(spe, method="chi.square")
Explore collinearity by visualizing correlations between variables:
> env <- read.csv(file.choose(), row.names=1)
> env <- env[-8,] #remove site with no data
Verneaux (1973) dataset:
- characterization of fish communities
- 27 different species
- 30 different sites
- 11 environmental variables
This centers and scales the variables to make your downstream analysis more appropriate:
A measured 0 (i) is not the same than a 0 representing an absence (ii).
(i): 0 mg/L, 0°C ...
(ii): real absence, phenology disturbance, season ...
> sum(spe==0)/(nrow(spe)*ncol(spe))
- relative abundances/counts/presence-absence?
- asymmetrical distributions?
- many rare species?
- overabundance of dominant species?
- double zero problem?
> pairs(env, main="Bivariate Plots of the Environmental Data" )
- Reducing the weight of very abundant species:
> apply(env.z, 2, mean)
> apply(env.z, 2, sd)
# PROCEED WITH CAUTION ONLY EXECUTE ONCE
Distance metrics are very useful for exploring your data, but ordinations preserve more information contained in your dataset
> spe.pa <- decostand(spe, method="log")
2. Similarity/Dissimilarity/Clustering
Schematic diagram of multivariate analysis
Breaking Out of 1D
Similarity/Dissimilarity
Solution #1
2.1 Association measures
Challenge #1
Comparing Doubs Sites
Community distance measures
Matrix X
(envi var.)
Matrix Y
(communauty)
And what about ordination ?
Comparing Doubs Sites
Visualization of distance matrices
standardized / transformed / normalized data
- Matrix algebra is at the heart of all ordinations
- Useful to understand your dataset
- Appropriate measure required by some types of ordinations
The vegdist() function contains all common distances:
- Hellinger
- Chi-square
- Bray-Curtis
[Workshop 10]
[Workshop 9]
Discuss with your neighbour: How can we tell how similar objects are when we have multivariate data?
Make a list of all your suggestions.
Indirect
(unconstrained)
-exploratory-
Direct
(constrained)
-analytic-
> ?vegdist
Similarity: S = 1 - D
Distance: D = 1 - S
- As you have seen, ecological datasets can sometimes be very be large matrices
- Ordinations compute the relationships between species or between sites.
- We can simplify these relationships using measures of dissimilarity
# Each of these will be useful in different situations
How different is the community composition across the 30 sites of the Doubs River?
With ordination methods, we order our objects (site) according to their similarity.
The more the sites are similar, the closer they are in the ordination space (small distance).
In ecology, we usually calculate the similarity between sites according to their species composition or their environmental conditions.
- Exploring various measures of distance between objects provides some understanding of the engine under the hood
> spe.db.pa <- vegdist(spe,method="bray")
Quebec Centre for Biodiversity Science
R Workshop Series
3. Clustering
Workshop 9: Multivariate analyses
How to choose the right method?
Comparison
Ward's method
Complete linkage clustering
Single linkage clustering
Comparison
Hierarchical methods
Clustering
Ward's minimum variance method
Overview of 3 hierarchical methods
- A distance matrix is first sorted in increasing distance order
- Compute the Ward's minimum variance clustering:
- Create a distance matrix from Hellinger transformed Doubs river data
- the two closest objets merge
- the next two objects/cluster will agglomerate when linked to the furthest element of the group
- the two closest objects merge
- the next two closest objects/clusters merge
- and so on.
> spe.dhel.ward <- hclust(spe.dhel, method="ward.D2")
- Elements of lower clusters become members of larger, higher ranking clusters
- e.g. species, genus, family, order...
> spe.dhel <- vegdist(spe.hel, method="euclidean")
- Single linkage agglomerative clustering
- Complete linkage agglomerative clustering
- Ward's minimum variance clustering
- To highlight structures in the data by partitioning either objects or the descriptors
- Results are represented as dendrograms (trees)
- Not a statistical method!
- Uses the criterion of least squares to cluster objects into groups
- at each step, the pair of clusters merging is the one leading to the minimum increase in total within-group sum of squares
- Plot the dendrogram by using the square root of the distances:
- Depends on the objective:
- highlight gradients? contrasts?
- If more than one method seems appropriate --> compare dendrograms
- Again: this is not a statistical method
- But! It is possible to:
- determine the optimal number of interpretable clusters
- compute clustering statistics
- combine clustering to ordination to distinguish groups of sites
- Compute the single linkage clustering:
> spe.dhel.single <- hclust(spe.dhel, method="single")
> plot(spe.dhel.single)
> spe.dhel.ward$height <- sqrt(spe.dhel.ward$height)
> plot(spe.dhel.ward, hang=-1)
Complete linkage
Contrasted groups are formed
- Compute the complete linkage clustering:
hang=-1 aligns objects at the same level (at distance 0)
Single linkage
Chains of objects occur (e.g. 19, 29, 30, 20, 26, etc.)
Clusters generated using this method tend to be more spherical and to contain similar numbers of objects
> spe.dhel.complete <- hclust(spe.dhel, method="complete")
> plot(spe.dhel.complete)
Website: http://qcbs.ca/wiki/r_workshop9
4. Unconstrained ordination
Définitions
Schematic diagram of multivariate analysis
More definitions
Unconstrained Ordination
Matrix X
(envi var.)
Matrix Y
(communauty)
Eigenvalues : Proportion of variance (dispersion) represented by one ordination axe.
standardized / transformed / normalized data
Variance : measure of a variable yj dispersion from its mean
Orthogonality = right angle between 2 axis or 2 arrows which means that these 2 are independent = non correlated.
[Workshop 10]
[Workshop 9]
Indirect
(unconstrained)
-exploratory-
Direct
(constrained)
-analytic-
Covariance : measure of co-dispersion of 2 variables yj et yk from their means
Score = position of a dot on an axis. All the scores of a dot give its coordinates in the multidimensional space. They can be used as new variable for other analyses (e.g. linear combination of measured variables).
- Assess relationships within a set of variables (species or environmental variables, not between sets, i.e. constrained analysis).
- Find key components of variation between samples, sites, species, etc...
- Reduce the number dimensions in multivariate data without substantial loss of information.
- Create new variables for use in subsequent analysis (such as regression)
Correlation : measure of the link strength between 2 variables : rij = (σij / σj . σk)
Dispersion (inertia) : Measure of the total variability of the scatter plot (descriptors) in the multidimensional space with regards to its center of gravity.
QUIZ TIME !!
4.1. Principal Components Analysis
Prime time 4 quiz time
1. What does PCA stand for?
a) Principal Coordinate Analysis
b) Principle Compound Analysis
c) Principal Component Analysis
d) Principle Component Analysis
3.1 Principal Component Analysis
PCA - What you need
PCA - Walkthrough
PCA - Let's try it on Fish Species!
PCA - Interpretation of Output
PCA - Multidimensional case
PCA - Walkthrough
PCA - Interpretation of Output
- A set of variables that are response variables (e.g. community composition) OR explanatory variables (e.g. environmental variables)
- For both PCA and RDA, we will be using the rda() function in the vegan package:
> ?rda
- PC1 --> axis that maximizes the variance of the points that are projected perpendicularly onto the axis.
- PC2 --> must be perpendicular to PC1, but the direction is again the one in which variance is maximized when points are perpendicularly projected.
- PC3 --> and so on: perpendicular to the first two axes.
- List of the eigenvalues associated to each Principal Component (in this output there are 27 PCs, as this is the number of dimensions in the data)
- Run a PCA on the Hellinger-transformed fish data
> spe.h.pca <- rda(spe.hel)
- Preserves, in 2D, the maximum amount of variation in the data
- The resulting synthetic variables are orthogonal (and therefore uncorrelated),
*An eigenvalue is the value of the change in the length of a vector, and for our purposes is the amount of variation explained by Principal Component.
- List of the proportion of variance explained by each Principal Component (as well as cumulative)
- Samples that are measured for the same set of variables
- Generally a dataset that is longer than it is wide is preferred
- Total variance explained by the descriptors (here the fish species)
- In PCA, note that the "Total" and "Unconstrained" portion of the explained variance is identical
When there are more than two dimensions, PCA produces a new space in which all PCA axes are orthogonal (i.e. correlation among any two axes is null) and where the PCA axes are ordered according to the percent of the variance of the original data they explain.
The final plot then is the two PC axes rotated where the axes are now principal components as opposed to species
0.258 + 0.064 + .... = 0.5025 Total explained variance
> summary(spe.h.pca)
In 2D, we would plot the sites like this... Notice the dispersion in the scatterplot
Our first component is essentially drawn through the maximum amount of observed variation... or the best fit line through the points
A second principal component is then added perpendicular (90 degrees in 2D) to the first axis...
Accessing Parts of the Output
Selecting Significant PCs
Kaiser-Guttman criterion
PCA - environmental variables
PCA - Interpretation of Output
PCA - Visualization
Prime time 4 quiz time
PCA - environmental variables
Kaiser-Guttman criterion -visualization
PCA - Interpretation of Output
- Extract the eigenvalues associated to the PCs:
2. Which of the following is the best way to visualize the distances between the community composition of many sites?
a) Principal Component Analysis
b) Principal Coordinate Analysis
c) Principal Response Curves
d) Redundancy Analysis
Selects principal components which capture more variance than the average of all PCs.
We can also run PCAs on standardized environmental variables, to compare sites for example, or how variables are correlated...
You may wish to extract the scores (either from species or sites) for use in subsequent analysis or for plotting purposes :
The output is very dense, but you can access specific information if needed:
> ev <- env.pca$CA$eig
The strength of PCA is that we can condense the variance contained in a huge dataset into a set of synthetic variables that is manageable.
The abundance of information produced by PCA is easier to understand and interpret using biplots to visualize patterns.
- Extract the eigenvalues associated to the PCs:
- Select the all eigenvalues above average:
> n <- length(ev)
> barplot(ev, main="Eigenvalues",
col="grey", las=2)
> abline(h=mean(ev), col="red")
> legend("topright", "Average eigenvalue",
lwd=1, col=2, bty="n")
- Access the species scores along the 1st and 2nd PC:
- Run a PCA on the standardized environmental variables
- Access the eigenvalues associated contribution to variance:
> ev <- spe.h.pca$CA$eig
> ev[ev>mean(ev)]
> env.pca <- rda(env.z)
- We can produce a quick biplot of the species data using function "plot()" in base R:
> spe.scores <- scores(spe.h.pca,
display="species",
choices=c(1,2))
> summary(spe.h.pca, display=NULL)
In our case there are still 27 Principal Components, but only the first few account for any significant amount of variance, while the rest can simply be discarded as noise...
- Select the all eigenvalues above average:
- There are two ways to represent an ordination in 2D, here the output is informing us that it used the default scaling, which is type 2...
- Species refers to your descriptors (i.e. the columns in your dataset), which here are the Fish species
- Scores refer to the position of every species in the PC, they are essentially the coordinates of each species along a principal component...
> ev[ev>mean(ev)]
- Calculate the eigenvalues from scratch:
> plot(spe.h.pca)
- Calculate the eigenvalues from scratch:
> n <- length(ev)
> barplot(ev, main="Eigenvalues",
col="grey", las=2)
> abline(h=mean(ev), col="red")
> legend("topright", "Average eigenvalue",
lwd=1, col=2, bty="n")
- Site refers to your the rows in your dataset, which here are the different sites along the Doubs river (but it can be points in time, etc...)
- Scores are essentially the coordinates of each species along a principal component...
> eigen(cov(spe.hel))
> summary(env.pca)
> summary(env.pca, scaling=2) #default
> site.scores <- scores(spe.h.pca,
display="sites",
choices=c(1,2))
PCA basic biplot with plot()
PCA basic biplot with biplot()
PCA scaling types
Scaling 2 - Scaling 1
Advanced "biplotting"
Solution #3
EASTER EGG: ggvegan
Solution #3
EASTER EGG: rgl and vegan3d
WARNINGS
Solution #3
Challenge #3
- A set of tools for producing biplots using ggplot2:
- Interactive 3D biplots using rgl
- Hellinger transform the species data:
- Using the function biplot() from base R, arrows are plotted to show the directionality and angle of the descriptors in the ordination:
- By extracting specific parts of the PCA output, we can build more detailed and aesthetic plots:
- Type 1 scaling: attempts to preserves the Euclidean distance (in multidimensional space) among objects (sites); the angles among descriptor (species) vectors are meaningless. Best for interpreting relationships among objects (sites)!
> mite.spe.hel <- decostand(mite.spe, method="hellinger")
> require(rgl)
> require(vegan3d)
> ordirgl(spe.h.pca)
Using everything you have learned to execute a PCA on the mite species abundance data:
> install.packages("devtools")
> require("devtools")
> install_github("ggvegan", "gavinsimpson")
> require(ggvegan)
> autoplot()
- PCA is a linear method and thus relies on a few assumptions:
- multinormal distribution of the data (only if you wish to make inferences)
- not too many zeros
- the gradient of interest is causing the majority of the variance in the dataset
> biplot(spe.h.pca)
> data(mite)
> mite.spe.h.pca <- rda(mite.spe.hel)
Prime time 4 quiz time
> plot(spe.h.pca, scaling=1, type="none",
xlab = c("PC1 (%)", round((spe.h.pca$CA$eig[1]/sum(spe.h.pca$CAeig))*100,2)),
ylab = c("PC2 (%)", round((spe.h.pca$CA$eig[2]/sum(spe.h.pca$CAeig))*100,2)))
> points(scores(spe.h.pca, display="sites", choices=c(1,2), scaling=1),
pch=21, col="black", bg="steelblue", cex=1.2)
> text(scores(spe.h.pca, display="species", choices=c(1), scaling=1),
scores(spe.h.pca, display="species", choices=c(2), scaling=1),
labels=rownames(scores(spe.h.pca, display="species", scaling=1)),
col="red", cex=0.8)
Prime time 4 quiz time
- Check significant axes using the Guttman-Kaiser criterion:
3. What does an eigenvalue represent in
PCA?
a) A measure of statistical significance
b) The proportion of variance explained by a principal component
c) Coordinates of objects in reduced space
d) The answer to life, the universe, and everything else (aka 42)
- THE BASIC PLOT FUNCTION IS QUICK BUT IT IS HARD TO INTERPRET THE ANGLES BETWEEN THE SPECIES
- Type 2 scaling (DEFAULT): distances among objects are not approximations of Euclidean distances; angles between descriptor (species) vectors reflect their correlations. Best for interpreting relationships among descriptors (species)!
- Descriptors at 180 degrees of each other are negatively correlated
- Descriptors at 90 degrees of each other have zero correlation
- Descriptors at 0 degrees of each other are positively correlated
Prime time 4 quiz time
# see the arrows() function in {graphics} to add vectors
> ev <- mite.spe.h.pca$CA$eig
> ev[ev>mean(ev)]
> n <- length(ev)
> barplot(ev, main="Eigenvalues", col="grey", las=2)
> abline(h=mean(ev), col="red")
> legend("topright", "Average eigenvalue",
lwd=1,col=2, bty="n")
# Violation of these can cause a horseshoe shape in your biplots, where opposite ends of the horseshoe are close together but in reality represent opposite ends of a gradient
> spe.sc<-scores(spe.h.pca,choices=1:2,scaling=,display="sp")
> arrows(0,0,spe.sc[,1],spe.sc[,2],length=0)
- What are the significant axes?
- Which groups of sites can you identify?
- Which groups of species are related to these groups of sites?
WARNINGS
Live Long and Ordinate
- We can avoid some of these problems in PCA by choosing appropriate transformations for your community composition data or environmental data
- In some cases, such as studies that cover very large environmental gradients, it may be appropriate to use other types of unconstrained ordinations
4.2. Correspondence Analysis
Challenge #4
Solution #4
CA: R output
CA: interpretation of results
CA: interpretation of biplots
CA: biplots
Solution #4
How to run a CA ?
Euclidean vs Chi2 distances
Principles of CA
- CA results are presented in the same way as PCA results and call be called using:
Using everything you have learned to execute a CA on the mite species abundance data:
> mite.spe.ca <- rda(mite.spe)
- CA is implemented in the vegan package using function cca:
> data(mite)
- In such cases, CA should be prefered compared to PCA as it preserves Chi2 distances between sites...and thus better represents unimodal relationships
- Check significant axes using the Guttman-Kaiser criterion:
- These biplots show that a group of sites located in the left part with similar fish community characterized by numerous species such as GAR, TAN, PER, ROT, PSO and CAR; in the upper right corner, an other site cluster characterized by the species LOC, VAI and TRU is identified; the last site cluster in the lower right corner of the biplot is characterized by the species BLA, CHA and OMB.
> spe.ca <- cca(spe)
> summary(spe.ca)
- PCA preserves euclidean distances between objects, and thus postulates linear relationships between species, and between species and environmental gradients
- ...but in some cases, species instead present unimodal responses to environmental gradients
> ev <- mite.spe.ca$CA$eig
> ev[ev>mean(ev)]
> n <- length(ev)
> barplot(ev, main="Eigenvalues", col="grey", las=2)
> abline(h=mean(ev), col="red")
> legend("topright", "Average eigenvalue",
lwd=1,col=2, bty="n")
- CA on fish species abundances
- What are the significant axes?
- Which groups of sites can you identify?
- Which groups of species are related to these groups of sites?
PCoA and non-metric distances
PCoA biplot with biplot.pcoa()
Solution #5
Challenge #5
Solution #5
Principal Coordinate Analysis
PCoA - Let's try it on Fish Species!
PCoA - Interpretation of Output
Solution #5
- Build a biplot to visualize the data:
- PCoA can also be used to capture information contained in non-metric distances, such as the popular Bray-Curtis distance. Let's give it a try:
- Now let's visualize this using a biplot without the species (more common approach for PCoA):
- We can display, in 2D, the distances between sites using the biplot.pcoa() function, as well as the species associated to each site (though this is less common):
- For computing PCoA, we can use the cmdscale() or the pcoa() functions from the {stats} and {ape} packages:
> biplot.pcoa(mite.spe.h.pcoa, mite.spe.hel)
- Hellinger transform the species data:
> biplot.pcoa(spe.bray.pcoa)
Execute a PCoA on the Hellinger-transformed mite species abundance data:
> spe.bray.pcoa <- pcoa(spe.db)
> spe.bray.pcoa
> biplot.pcoa(spe.h.pcoa, spe.hel)
> mite.spe.hel <- decostand(mite.spe, method="hellinger")
> ?cmdscale
> ?pcoa
- Run a PCoA on the Hellinger-distances of the fish dataset:
> mite.spe.h.pcoa<-pcoa(dist(mite.spe.hel))
- Vectors: the eigenvectors associated to each eigenvalue contain the coordinates, in Euclidean space, of each site.
- Examine the output and notice the negative eigenvalues. This is because non-metric distances cannot be represented in Euclidean space without corrections (see Legendre & Legendre 2012 for more details on this):
> spe.h.pcoa <- pcoa(dist(spe.hel))
- What are the significant axes?
- Which groups of sites can you identify?
- Which groups of species are related to these groups of sites?
- How do the PCoA results compare with the PCA results?
- In PCA, we preserve the maximum amount of variation in the data
- In PCoA, we preserve as best we can in 2D the (Euclidean) distances between each object in multidimensional space
# These are the most useful for subsequent analysis as they capture the distance among objects very well.
- Eigenvalues
- Relative Eigenvalues
- Broken stick model: to evaluate which axes are significant
- Cumulative eigenvalues: cumulative value of the relative eigenvalues
- Cumulative broken stick: cumulative value of the broken stick model
> spe.bray.pcoa <- pcoa(spe.db,
correction="cailliez")
> spe.bray.pcoa
> spe.h.pcoa
# PCoA CAN BE ESPECIALLY USEFUL WHEN
THE DATASET IS WIDER THAN IT IS LONG
(TYPICAL PROBLEM IN GENETICS)
4.4. Nonmetric Multidimensional Scaling
Challenge #6
Solution #6
Principles of NMDS
Solution #6
NMDS
NMDS on fish abundances
NMDS: goodness-of-fit
NMDS on fish abundances
- nMDS is implemented in the vegan package using function metaMDS:
> mite.spe.nmds<-metaMDS(mite.spe, distance='bray', k=2)
> spe.nmds<-metaMDS(spe, distance='bray', k=2)
- NMDS
- is the non-metric counterpart of PCoA
- uses an iterative optimization algorithm to find the best representation of distances in reduced space
- increasingly popular
- In NMDS, users can thus specify;
- the number of dimensions
- the distance measure
Run the NMDS of the mite species abundances in 2 dimensions based on a Bray-Curtis distance. Assess the goodness-of-fit of the ordination and interpret the biplot
> spe.nmds$stress
> stressplot(spe.nmds, main='Shepard plot')
- In PCA, CA and PCoA, objects are ordinated in a few number of dimensions (i.e. axis) generally > 2
- Consequently, 2D-biplots can fail to represents all the variation of the dataset
- In some cases, the objective is however to represent the data in a specified small number of dimensions.
- Then, how do you plot the ordination space to represent all the variation in the data ?
> spe.nmds <- (spe, distance='bray', k=2)
- The Shepard plot identifies a strong correlation between observed dissimilarity and ordination distance (R2 > 0.95), highlighting a high goodness-of-fit of the NMDS.
- The biplot of the NMDS shows a group of closed sites characterized by the species BLA, TRU, VAI, LOC, CHA and OMB, while the other species form a cluster of sites in the upper right part of the graph. Four sites in the lower part of the graph are strongly different from the others.
> spe.nmds
> windows()
> plot(spe.nmds, type="none", main=paste('NMDS/Bray - Stress=', round(spe.nmds$stress, 3)), xlab=c("NMDS1"),ylab=c("NMDS2"))
> points(scores(spe.nmds, display="sites", choices=c(1,2)),pch=21, col="black", bg="steelblue", cex=1.2)
> text(scores(spe.nmds, display="species", choices=c(1)), scores(spe.nmds, display="species", choices=c(2)), labels=rownames(scores(spe.nmds, display="species")),col="red", cex=0.8)
- Assess the goodness-of-fit
- The correlation between observed dissimilarity and ordination distance (R2 > 0.91) and the stress value relatively low, showing together a good accuracy of the NMDS ordination.
- No cluster of sites can be precisely defined from the NMDS biplot showing that most of the species occurred in most of the sites, i.e. a few sites shelter specific communities.
> mite.spe.nmds$stress
> stressplot(mite.spe.nmds, main='Shepard plot')>
- Assess the goodness-of-fit
- distance specifies the distance to use
- k specifies the number of dimensions
> spe.nmds$stress
> stressplot(spe.nmds, main='Shepard plot')
Conclusion
Many ordination techniques exist, but their specificity should guide your choices on which methods to use