**A Brief Introduction to Spatial Statistics**

**Andrew Sanchez Meador, Ph.D.**

Spatial Statistics: A Definition

Tools for understanding data distributed in a space where positions and distance have meaning.

Dale, M.R.T., P. Dixon, M.J. Fortin, P. Legendre, D.E. Myers and M.S. Rosenberg. 2002. Conceptual and mathematical relationships among methods for spatial analysis in ecology. Ecography 25: 558-577.

Map by Dr. John Snow of London, showing clusters of cholera cases in the 1854 Broad Street cholera outbreak. This is often considered the origin of spatial analysis and spatial statistics.

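All statistical procedures that take space into consideration are based on one simple principle: that there is a spatial pattern to the observations. This is Waldo Tobler's first law of geography: everything is related to everything else, but near things are more related than distant things.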

Applications of spatial statistics can be found at all levels of ecological inquiry

At coarse scales, data can be gathered by remote sensing and are used in fields such as landscape ecology, rangeland ecology, and ecosystem ecology

At fine scales data may be molecular and may be used in epidemiology, process modeling, and genetics

What are spatial statistics?

Spatial statistics take the spatial dependence of entities into consideration

They can be used to visualize spatial data

They can be used to explore spatial dependence and how entities change across space

They can be used to model relationships in an explicitly spatial framework

Why do we need them?

Classical statistics assume that observations are independent both in space and time

This assumption is violated in most ecological data

What types of data exist?

Point Patterns – locations of events, such as trees in a forest or bird nests in a tree (e.g., the spatstat package)

Continuous (Sampled) Data – quantities that change across space, such as gradients of precipitation, resource availability, or salinity (e.g., gstat)

Area (Lattice) Data – data that can be separated into zones that differ in intensity, such as density or number of species within an area, or voter preference by state (e.g., spdep)

Data for locations of maples and hickories in a 19.6 acre square plot in Lansing Woods, Clinton County, Michigan (Diggle 1983 [Gerrard 1969])

Data from Cressie and Chan (1989) depicting rates of Sudden Infant Death Syndrome (SIDS) for each North Carolina county between 1974 and 1978.

Data for locations and topsoil heavy metal concentrations (zinc, ppm), collected in a flood plain of the river Meuse, near the village of Stein (Burrough and McDonnell 1998).

Fortin, M.-J., James, P.M.A., MacKenzie, A., Melles, S.J., Rayfield, B. 2012. Spatial statistics, spatial regression, and graph theory in ecology. Spatial Statistics 1:100-109

Perry, J.N., Liebhold, A.M., Rosenberg, M.S., Dungan, J., Miriti, M., Jakomulska, A., and Citron Pousty, S. 2002. Illustrations and guidelines for selecting statistical methods for quantifying spatial pattern in ecological data. Ecography 25(5): 578-600

Questions?

Block and quadrat variance methods

These methods require data be collected as a complete census in strings or grids of contiguous quadrats.

The data can be counts of individuals (or such) or records of density such as estimates of cover.
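As a concrete illustration, here is a minimal sketch of one member of this family, two-term local quadrat variance (TTLQV), for a single string of contiguous quadrats. The slides do not name a specific method at this point, so the choice of TTLQV, the function name `ttlqv`, and the example counts are assumptions made for illustration.

```python
def ttlqv(counts, b):
    """Two-term local quadrat variance at block size b.

    counts: values from a string of contiguous quadrats (counts or cover).
    b: block size, in quadrats.
    """
    n = len(counts)
    positions = n + 1 - 2 * b  # number of places a pair of adjacent blocks fits
    if b < 1 or positions < 1:
        raise ValueError("block size does not fit the transect")
    total = 0.0
    for i in range(positions):
        first = sum(counts[i:i + b])           # total in the first block
        second = sum(counts[i + b:i + 2 * b])  # total in the adjacent block
        total += (first - second) ** 2
    return total / (2 * b * positions)


# Hypothetical transect: patches of 4 occupied quadrats separated by gaps of 4.
transect = [1, 1, 1, 1, 0, 0, 0, 0] * 4
for block_size in range(1, 7):
    print(block_size, round(ttlqv(transect, block_size), 3))
```

Plotting the variance against block size and looking for peaks is the usual way to read the result: a peak suggests a scale of pattern near that block size.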

Measures of spatial autocorrelation

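Covariance and correlation

The concepts of autocorrelation and autocovariance are derived from the familiar statistical concepts of covariance and correlation. For two variables, x and y, their covariance is related to the expected value of their product:

$$\mathrm{Cov}(x, y) = E[(x - \mu_x)(y - \mu_y)] = E[xy] - \mu_x \mu_y$$

Their correlation is:

$$\rho_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$$

Here $\mu_x$ and $\mu_y$ are the means of x and y, and $\sigma_x$ and $\sigma_y$ are their standard deviations.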

Semivariance

In geostatistics, similar techniques have been developed under different names (Matheron 1962, Rossi et al. 1992). One of the most commonly used geostatistical techniques is the calculation of a sample variogram, which quantifies autocorrelation over a range of lags, d, by estimating what is sometimes called the semivariance, γ(d).
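A common form of the sample semivariance is given below. The symbols are assumptions introduced here, not necessarily the slide's own notation: $z_i$ is the value observed at location $i$, $w_{ij}(d)$ equals 1 when locations $i$ and $j$ are separated by lag $d$ and 0 otherwise, and $W(d) = \sum_i \sum_j w_{ij}(d)$ is the number of such pairs.

$$\hat{\gamma}(d) = \frac{1}{2\,W(d)} \sum_{i} \sum_{j} w_{ij}(d)\,(z_i - z_j)^2$$

Small values of $\hat{\gamma}(d)$ mean that observations a distance $d$ apart tend to be similar. A plot of $\hat{\gamma}(d)$ against $d$ is the sample variogram.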

Autocorrelation, continued

Other measures of autocorrelation are also used, for example (in the same notation) Moran’s index of autocorrelation (Moran 1950) is:

Geary’s measure (Geary 1954) is:
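The standard textbook forms of the two indices are given below. The symbols are assumptions introduced here and may differ from the slide's own notation: $z_i$ is the value at location $i$, $\bar{z}$ is the mean of all $n$ values, $w_{ij}(d)$ equals 1 when locations $i$ and $j$ are separated by lag $d$ and 0 otherwise, and $W(d) = \sum_i \sum_j w_{ij}(d)$.

$$I(d) = \frac{n \sum_{i} \sum_{j} w_{ij}(d)\,(z_i - \bar{z})(z_j - \bar{z})}{W(d) \sum_{i} (z_i - \bar{z})^2}$$

$$c(d) = \frac{(n - 1) \sum_{i} \sum_{j} w_{ij}(d)\,(z_i - z_j)^2}{2\,W(d) \sum_{i} (z_i - \bar{z})^2}$$

Positive autocorrelation at lag $d$ shows up as $I(d) > 0$ and $c(d) < 1$. With no autocorrelation, $I(d)$ is expected to be near $-1/(n-1)$ and $c(d)$ near 1.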

Note: Only the denominator of c(d) makes it different from the equation for the variogram, and therefore there is also a close relationship with the quadrat method called PQV.
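The relationship in the note can be checked numerically. The sketch below assumes the standard forms of the semivariance, Moran's I, and Geary's c for a one-dimensional transect of equally spaced samples. The function names and the example values are hypothetical, and the lag `d` is assumed to satisfy 1 <= d < n.

```python
from statistics import variance


def lag_pairs(n, d):
    """Ordered index pairs (i, j) whose positions on the transect differ by lag d."""
    return [(i, j) for i in range(n) for j in range(n) if abs(i - j) == d]


def semivariance(z, d):
    pairs = lag_pairs(len(z), d)
    return sum((z[i] - z[j]) ** 2 for i, j in pairs) / (2 * len(pairs))


def morans_i(z, d):
    n = len(z)
    mean = sum(z) / n
    pairs = lag_pairs(n, d)
    cross = sum((z[i] - mean) * (z[j] - mean) for i, j in pairs)
    total_ss = sum((v - mean) ** 2 for v in z)
    return n * cross / (len(pairs) * total_ss)


def gearys_c(z, d):
    n = len(z)
    mean = sum(z) / n
    pairs = lag_pairs(n, d)
    squared_diffs = sum((z[i] - z[j]) ** 2 for i, j in pairs)
    total_ss = sum((v - mean) ** 2 for v in z)
    return (n - 1) * squared_diffs / (2 * len(pairs) * total_ss)


# Hypothetical transect values, for illustration only.
transect = [3, 5, 4, 7, 8, 6, 9, 12, 10, 11]
for d in (1, 2, 3):
    gamma = semivariance(transect, d)
    # Geary's c is the semivariance rescaled by the sample variance.
    print(d, gamma, morans_i(transect, d), gearys_c(transect, d),
          gamma / variance(transect))
```

Under these definitions the last two printed columns agree at every lag: dividing the semivariance by the sample variance (with n - 1 in its denominator) gives Geary's c.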

Spectral analysis and related techniques

Spectral analysis

Spectral analysis describes spatial pattern by fitting continuous wave forms to the data (Ripley 1978); the relative magnitudes of the coefficients of the fitted waves with different periods allow one to infer spatial scale.
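A minimal sketch of the idea, assuming a one-dimensional transect of equally spaced samples: the periodogram measures how strongly a sine and cosine pair of each period fits the data. The function name `periodogram` and the synthetic transect are assumptions made for illustration, and the direct sum is written for clarity rather than speed.

```python
import cmath
import math


def periodogram(y):
    """Return (period, power) pairs for the Fourier frequencies of a 1-D series."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]  # remove the mean so only pattern remains
    result = []
    for p in range(1, n // 2 + 1):    # p = number of full waves along the transect
        coeff = sum(c * cmath.exp(-2j * math.pi * p * k / n)
                    for k, c in enumerate(centered))
        result.append((n / p, abs(coeff) ** 2 / n))
    return result


# Synthetic transect with a repeating pattern every 8 sampling units.
transect = [math.sin(2 * math.pi * k / 8) for k in range(64)]
peak_period, peak_power = max(periodogram(transect), key=lambda item: item[1])
print(peak_period, peak_power)
```

The period with the largest power is the candidate scale of pattern. Real data usually show several smaller peaks as well, so the whole spectrum is inspected rather than only its maximum.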

Wavelets

Wavelet analysis is an approach to analyzing spatial data, related to spectral analysis, that uses a finite template or wavelet rather than sine and cosine functions, applied over the length of the data sequence. The analysis proceeds by providing measures of how well the wavelet template, of different sizes and at different positions, matches the data.
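A minimal sketch of the matching step, assuming the Haar wavelet as the template (a step from -1 to +1) and a one-dimensional transect. The slides do not say which wavelet was used, so the Haar choice, the function names, and the example data are all assumptions.

```python
def haar(u):
    """Haar template: -1 on [-1, 0), +1 on [0, 1), and 0 elsewhere."""
    if -1 <= u < 0:
        return -1.0
    if 0 <= u < 1:
        return 1.0
    return 0.0


def wavelet_transform(y, scale, position):
    """How well a Haar template of the given size, centered at position, matches y."""
    return sum(v * haar((j - position) / scale) for j, v in enumerate(y)) / scale


def transform_profile(y, scale):
    """Transform values at every position where the template fits inside the data."""
    return [(x, wavelet_transform(y, scale, x))
            for x in range(scale, len(y) - scale + 1)]


# Hypothetical transect with a single edge between a gap and a patch.
transect = [0] * 8 + [1] * 8
profile = transform_profile(transect, 4)
best_position, best_value = max(profile, key=lambda item: item[1])
print(best_position, best_value)
```

Because the Haar template is itself a step, it matches best where the data step from low to high, so the largest transform value marks the patch edge. Repeating the calculation over a range of scales gives the position-by-scale picture that wavelet analysis is usually summarized with.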

Ripley, B.D. 1978. Spectral analysis and the analysis of pattern in plant communities. Journal of Ecology 66:965-981.


Point pattern analyses

First-order effects: variation in the mean value of an event. Is there a trend?

Second-order effects: spatial dependence of those events. What is the pattern of pairs of points?

Univariate K
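For orientation, here is a minimal sketch of the naive estimator of Ripley's univariate K-function, which summarizes second-order pattern by counting pairs of points within distance r of each other. It applies no edge correction (see Haase 1995), and the function names, the normalization by n(n - 1), and the example points are assumptions. For real analyses, the spatstat package provides edge-corrected estimates through its `Kest` function.

```python
import math


def ripley_k(points, r, area):
    """Naive K estimate: area-scaled count of ordered pairs within distance r."""
    n = len(points)
    close_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))


def l_minus_r(points, r, area):
    """Transformed K: about 0 under complete spatial randomness."""
    return math.sqrt(ripley_k(points, r, area) / math.pi) - r


# Hypothetical regular pattern: a 5 x 5 grid of points in a 5 x 5 window.
grid = [(x, y) for x in range(5) for y in range(5)]
for radius in (0.5, 1.0, 1.5):
    print(radius, ripley_k(grid, radius, area=25.0), l_minus_r(grid, radius, area=25.0))
```

Under complete spatial randomness K(r) is expected to be close to πr², so values above that suggest clustering at scale r and values below it suggest regularity. Without edge correction the estimate is biased low at larger r, because neighbors that would fall outside the window are never counted.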

Bivariate K

Suggested Literature

Baddeley, A., and R. Turner. 2005. Spatstat: an R package for analyzing spatial point patterns. Journal of Statistical Software 12(6):1-42.

Bailey, T.C., and Gatrell, A.C. 1995. Interactive spatial data analysis. Harlow, Essex, England: Longman Group.

Clark, P.J. and F.C. Evans. 1954. Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology 35:445-453.

Dale, M.R.T. 1999. Spatial pattern analysis in plant ecology. Cambridge University Press, Cambridge.

Dale, M.R.T., P. Dixon, M.-J. Fortin, P. Legendre, D.E. Myers, and M.S. Rosenberg. 2002. Conceptual and mathematical relationships among methods for spatial analysis. Ecography 25:558-577.

Diggle, P.J. 1983. Statistical analysis of spatial point patterns. Academic Press, London, England.

Fortin, M.-J., and M.R.T. Dale. 2005. Spatial analysis: a guide for ecologists. Cambridge University Press, Cambridge.

Getis, A., and J. Franklin. 1987. Second-order neighborhood analysis of mapped point patterns. Ecology 68(3):473-477.

Haase, P. 1995. Spatial pattern analysis in ecology based on Ripley's K-function: introduction and methods of edge correction. Journal of Vegetation Science 6:575-582.

McDonald, R.I., R.K. Peet, and D.L. Urban. 2002. Environmental correlates of oak decline and red maple increase in the North Carolina Piedmont. Castanea 67:84-95.

Moeur, M. 1993. Characterizing spatial patterns of trees using stem-mapped data. Forest Science 39:756-775.

Perry, J.N., A.M. Liebhold, M.S. Rosenberg, J. Dungan, M. Miriti, A. Jakomulska, and S. Citron-Pousty. 2002. Illustrations and guidelines for selecting statistical methods for quantifying spatial pattern in ecological data. Ecography 25:578-600.

Rosenberg, M.S. 2004. Wavelet analysis for detecting anisotropy in point patterns. Journal of Vegetation Science 15:277-284.

Stoyan, D. and A. Penttinen. 2000. Recent Applications of Point Process Methods in Forestry Statistics. Statistical Science 15(1):61-78.

Upton, G., and B. Fingleton. 1985. Spatial data analysis by example: point pattern and quantitative data. Wiley and Sons, NY.

Deductive vs. Inductive

Spatial statistics can either remove spatial dependence from the data so that classical statistics can be used, or model relationships using an explicitly spatial approach