Title: | Nearest Neighbour Contingency Table Analysis |
---|---|
Description: | Function to test spatial segregation and association based on contingency table analysis of nearest neighbour counts, following Dixon (2002) <doi:10.1080/11956860.2002.11682700>. Some 'Fortran' code has been added to the original dixon2002() function of the 'ecespa' package to improve speed. |
Authors: | Marcelino de la Cruz Rot and Philip M. Dixon |
Maintainer: | Marcelino de la Cruz Rot <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.0-9 |
Built: | 2024-11-05 06:17:31 UTC |
Source: | CRAN |
dixon
is a wrapper to the functions of Dixon (2002) to test spatial segregation for several species by analyzing the
counts of the nearest neighbour contingency table for a marked point pattern.
dixon(datos, nsim = 99, fortran = TRUE)
datos |
data frame with three columns: x-coordinate, y-coordinate and species name, laid out as in swamp. |
nsim |
number of simulations for the randomization approximation of the p-values. |
fortran |
logical; should the Fortran implementation be used? |
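The nearest neighbour contingency table that the function analyses can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper name `nn_table` and the toy data are hypothetical, and ties in distance are broken by taking the first match, which may differ from what dixon does.

```r
# Build a nearest-neighbour contingency table from a data frame with
# columns x, y and sp (a factor), using only base R.
# nn_table is a hypothetical helper, not a function of the dixon package.
nn_table <- function(datos) {
  d <- as.matrix(dist(datos[, c("x", "y")]))  # pairwise Euclidean distances
  diag(d) <- Inf                              # a point is not its own neighbour
  nn <- apply(d, 1, which.min)                # index of each point's nearest neighbour
  # rows: species of the focal point; columns: species of its nearest neighbour
  table(from = datos$sp, to = datos$sp[nn])
}

# Hypothetical toy pattern: two tight clusters, one per species
toy <- data.frame(
  x  = c(0, 0.1, 0.2, 5, 5.1, 5.2),
  y  = c(0, 0.1, 0,   5, 5.1, 5),
  sp = factor(c("A", "A", "A", "B", "B", "B"))
)
ON <- nn_table(toy)
print(ON)
```

Because each cluster contains a single species, all counts fall on the diagonal of the table, which is the pattern expected under strong segregation.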
A measure of segregation describes the tendency of one species to be associated with itself or with other species. Dixon (2002) proposed a measure of the segregation of species i in a multispecies spatial pattern as:
$$S_i = \log\left[\frac{N_{ii}/(N_i - N_{ii})}{(N_i - 1)/(N - N_i)}\right]$$
where $N_i$ is the number of individuals of species i, $N_{ii}$ is the frequency of species i as neighbor of species i and $N$ is the total number of locations. Values of $S_i$ larger than 0 indicate that species i is segregated; the larger the value of $S_i$, the more extreme the segregation. Values of $S_i$ less than 0 indicate that species i is found as neighbor of itself less often than expected under random labelling. Values of $S_i$ close to 0 are consistent with random labelling of the neighbors of species i.
Dixon (2002) also proposed a pairwise segregation index for the off-diagonal elements of the contingency table:
$$S_{ij} = \log\left[\frac{N_{ij}/(N_i - N_{ij})}{N_j/(N - N_j - 1)}\right]$$
$S_{ij}$ is larger than 0 when $N_{ij}$, the frequency of neighbors of species j around points of species i, is larger than expected under random labelling, and less than 0 when $N_{ij}$ is smaller than expected under random labelling.
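As a rough illustration, both segregation indices can be computed from an observed contingency table with a few lines of base R. The sketch assumes the log odds-ratio form of the indices given in Dixon (2002); the function name `seg_index` and the counts in `ON` are hypothetical and chosen only for the example.

```r
# Segregation indices from a k x k nearest-neighbour contingency table.
# Diagonal:     S_i  = log[ (N_ii/(N_i - N_ii)) / ((N_i - 1)/(N - N_i)) ]
# Off-diagonal: S_ij = log[ (N_ij/(N_i - N_ij)) / (N_j/(N - N_j - 1)) ]
# seg_index is a hypothetical helper, not part of the dixon package.
seg_index <- function(ON) {
  ON <- as.matrix(ON)
  Ni <- rowSums(ON)   # N_i: individuals per species (one neighbour per point)
  N  <- sum(ON)       # N: total number of locations
  k  <- nrow(ON)
  S  <- matrix(NA_real_, k, k, dimnames = dimnames(ON))
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      obs_odds <- ON[i, j] / (Ni[i] - ON[i, j])
      exp_odds <- if (i == j) (Ni[i] - 1) / (N - Ni[i]) else Ni[j] / (N - Ni[j] - 1)
      S[i, j] <- log(obs_odds / exp_odds)
    }
  }
  S
}

# Hypothetical counts: rows are focal species, columns are neighbour species
ON <- matrix(c(30, 10,
               20, 40), nrow = 2, byrow = TRUE,
             dimnames = list(c("A", "B"), c("A", "B")))
S <- seg_index(ON)

# Under random labelling the expected counts give indices of exactly zero
Ni <- rowSums(ON); N <- sum(ON)
EN <- outer(Ni, Ni) / (N - 1)
diag(EN) <- Ni * (Ni - 1) / (N - 1)
S_null <- seg_index(EN)
```

In this toy table species A has far more conspecific neighbours than expected, so its diagonal index is positive and its off-diagonal index is negative.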
As a species/neighbor-specific test, Dixon (2002) proposed the statistic
$$Z_{ij} = \frac{N_{ij} - EN_{ij}}{\sqrt{\mathrm{Var}(N_{ij})}}$$
where j may be the same as i and $EN_{ij}$ is the expected count in the contingency table. It has an asymptotic normal distribution with mean 0 and variance 1, so its asymptotic p-value can be obtained by numerical evaluation of the cumulative normal distribution. When the sample size is small, a p-value for the observed count in each cell ($N_{ij}$) may be obtained by simulation, i.e., by conducting a randomization test.
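The two kinds of p-value described above can be sketched in base R. This is a simplified illustration, not the package's procedure: the helper names are hypothetical, the two-sided randomization p-value here is the plain proportion of counts at least as far from their mean as the observed one (the package instead follows the Agresti and Min approach implemented in p2colasr), and the toy pattern is simulated.

```r
# Index of each point's nearest neighbour; locations never change,
# so this is computed once and reused for every relabelling.
nn_index <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf
  apply(d, 1, which.min)
}

# Randomization p-value for the cell count N_ij under random labelling
# (hypothetical helper; simplified two-sided rule, see the text above).
cell_randomization_p <- function(x, y, sp, i, j, nsim = 99) {
  sp <- as.character(sp)
  nn <- nn_index(x, y)
  count <- function(lab) sum(lab == i & lab[nn] == j)
  obs  <- count(sp)
  sims <- replicate(nsim, count(sample(sp)))  # permute labels, keep locations
  all_counts <- c(obs, sims)                  # observed value counts as one draw
  centre <- mean(all_counts)
  mean(abs(all_counts - centre) >= abs(obs - centre))
}

# Two-sided asymptotic p-value of a Z-score from the standard normal
asymptotic_p <- function(z) 2 * pnorm(-abs(z))

# Simulated toy pattern: two well separated clusters, one per species
set.seed(1)
x  <- c(runif(10), 10 + runif(10))
y  <- c(runif(10), 10 + runif(10))
sp <- rep(c("A", "B"), each = 10)
p_rand <- cell_randomization_p(x, y, sp, "A", "A", nsim = 99)
p_asym <- asymptotic_p(1.96)
```

With nsim = 99 the smallest attainable randomization p-value is 1/100, because the observed count is included among the 100 values being compared.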
An overall test of random labelling (i.e. a test that all counts in the $k \times k$ nearest-neighbor contingency table, with $k$ the number of species, are equal to their expected counts) is based on the quadratic form
$$C = (\mathbf{N} - E\mathbf{N})' \, \Sigma^{-} \, (\mathbf{N} - E\mathbf{N})$$
where $\mathbf{N}$ is the vector of all cell counts in the contingency table, $\Sigma$ is the variance-covariance matrix of those counts and $\Sigma^{-}$ is a generalized inverse of $\Sigma$. Under the null hypothesis of random labelling of points, $C$ has an asymptotic Chi-square distribution with $k(k-1)$ degrees of freedom (if the sample sizes are small, its distribution should be estimated using Monte-Carlo simulation). P-values are computed from the probability of observing values equal to or larger than $C$.
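The quadratic form can be illustrated with a short base R sketch. Everything numeric here is hypothetical: the counts, the expected counts and the covariance matrix are invented for a two-species case and do not come from Dixon's variance formulas. The helpers `ginv_svd` and `overall_C` are likewise illustrative names, and the Moore-Penrose inverse is used as one possible generalized inverse.

```r
# Moore-Penrose generalized inverse via the singular value decomposition
ginv_svd <- function(M, tol = sqrt(.Machine$double.eps)) {
  s <- svd(M)
  keep <- s$d > tol * max(s$d)   # drop numerically zero singular values
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Overall statistic C and its asymptotic chi-square p-value, k(k-1) df
overall_C <- function(N_obs, EN, Sigma, k) {
  dev <- N_obs - EN
  C <- drop(t(dev) %*% ginv_svd(Sigma) %*% dev)
  df <- k * (k - 1)
  list(C = C, df = df, p = pchisq(C, df = df, lower.tail = FALSE))
}

# Hypothetical two-species example; cells ordered (N11, N12, N21, N22).
# Row totals are fixed, so counts within a row are perfectly negatively
# correlated and Sigma is singular (rank 2 = k(k-1)).
A     <- matrix(c(1, -1, 0,  0,
                  0,  0, 1, -1), nrow = 4)
Sigma <- 10 * A %*% t(A)
N_obs <- c(30, 10, 20, 40)
EN    <- c(25, 15, 23, 37)
res   <- overall_C(N_obs, EN, Sigma, k = 2)
```

A generalized inverse is needed because the covariance matrix of the cell counts is singular; an ordinary call to solve() would fail on it.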
The overall statistic $C$ can be partitioned into $k$ species-specific test statistics $C_i$. Each $C_i$ tests whether the frequencies of the neighbors of species i are similar to the expected frequencies if the points were randomly labelled. Because the $C_i$ are not independent Chi-square statistics, they do not sum to the overall $C$.
A list with the following components:
ON |
Observed nearest neighbor counts in table format. From row sp to column sp. |
EN |
Expected nearest neighbor counts in table format. |
Z |
Z-score for testing whether the observed count equals the expected count. |
S |
Segregation measure. |
pZas |
P-values based on the asymptotic normal distribution of the Z statistic. |
pNr |
If nsim !=0, p-values of the observed counts based on the randomization distribution. |
C |
Overall test of random labelling. |
Ci |
Species-specific test of random labelling. |
pCas |
P-value of the overall test from the asymptotic chi-square distribution with the appropriate degrees of freedom. |
pCias |
P-values of the species-specific tests from the asymptotic chi-square distribution with the appropriate degrees of freedom. |
pCr |
If nsim !=0, p-value of the overall test from the randomization distribution. |
pCir |
If nsim !=0, p-values of the species-specific tests from the randomization distribution. |
tablaZ |
Table with ON, EN, Z, S, pZas and pNr in pretty format, as in Table II of Dixon (2002). |
tablaC |
Table with C, Ci, pCas, pCias, pCr and pCir in pretty format, as in Table IV of Dixon (2002). |
The $S_i$ and $S_{ij}$ statistics assume that the spatial nearest-neighbor process is stationary, at least to second order, i.e., that they have the same sign in every part of the plot. A biologically heterogeneous process will violate this assumption.
This function is an improvement of the function dixon2002 of the package ecespa. It also includes a small typo correction of the original code.
Philip M. Dixon (Iowa State University). Marcelino de la Cruz Rot wrote the wrapper code and the fortran implementation for this package.
Dixon, P.M. 2002. Nearest-neighbor contingency table analysis of spatial segregation for several species. Ecoscience, 9 (2): 142-151.
De la Cruz, M. 2008. Métodos para analizar datos puntuales. In: Introducción al Análisis Espacial de Datos en Ecología y Ciencias Ambientales: Métodos y Aplicaciones (eds. Maestre, F. T., Escudero, A. and Bonet, A.), pp 76-127. Asociación Española de Ecología Terrestre, Universidad Rey Juan Carlos y Caja de Ahorros del Mediterráneo, Madrid. ISBN: 978-84-9849-308-5.
K012 in the package ecespa, for another segregation test based on the differences of univariate and bivariate K-functions.
data(swamp)
dixon(swamp, nsim = 99)
Computes the p-value for a two-sided hypothesis test following Dixon's (2002:145) description of the method of Agresti & Min (2001).
p2colasr(Z)
Z |
|
P-value of the two-sided hypothesis test
This function is not usually called by the user. It is used internally by dixon.
Marcelino de la Cruz Rot
Agresti, A. & Min, Y. 2001. On small-sample confidence intervals
for parameters in discrete distributions. Biometrics, 57: 963-971.
Dixon, P.M. 2002. Nearest-neighbor contingency table analysis
of spatial segregation for several species. Ecoscience, 9(2): 142-151.
Locations and botanical classification of trees in a plot at the Savannah River Site. Locations are given in metres, rounded to the nearest 0.1 metre. The data come from a 1-ha (200 m x 50 m) plot in the Savannah River Site, South Carolina, USA. The 734 mapped stems included 156 Carolina ash (Fraxinus caroliniana), 215 Water tupelo (Nyssa aquatica), 205 Swamp tupelo (Nyssa sylvatica), 98 Bald cypress (Taxodium distichum) and 60 stems of 8 additional species.
Although the plots were set up by Bill Good and their spatial patterns described in Good and Whipple (1982), the plots have been maintained and resampled by Rebecca Sharitz and her colleagues of the Savannah River Ecology Laboratory. There are slightly different versions of the Good plot data: every time the plots are resampled, some errors are corrected. This is mostly a concern for the biologists. The different versions are very similar, and all of them are very good examples of a marked spatial point pattern.
data(swamp)
A data frame with 734 observations on the following 3 variables.
x
Cartesian x-coordinate of tree
y
Cartesian y-coordinate of tree
sp
a factor with levels indicating the species of each tree:
FX | Carolina ash (Fraxinus caroliniana) |
NS | Swamp tupelo (Nyssa sylvatica) |
NX | Water tupelo (Nyssa aquatica) |
TD | Bald cypress (Taxodium distichum) |
OT | Other species |
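To prepare other data in the same layout, a data frame with numeric x and y columns and a factor sp is needed. The sketch below builds such a data frame from simulated coordinates and randomly assigned labels; the values are purely illustrative, are not part of the swamp data, and only the species codes are borrowed from the table above.

```r
# Simulated stand-in with the same three-column layout as swamp.
# Coordinates and labels are random; they carry no biological meaning.
set.seed(42)                          # reproducible toy example
n <- 50
codes <- c("FX", "NS", "NX", "TD", "OT")
toy <- data.frame(
  x  = round(runif(n, 0, 200), 1),    # metres, rounded to 0.1 m
  y  = round(runif(n, 0, 50), 1),
  sp = factor(sample(codes, n, replace = TRUE), levels = codes)
)
str(toy)
```

Storing sp as a factor with explicit levels keeps the row and column order of any contingency table stable even when a species happens to be absent from a subsample.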
Dixon, P.M. 2002. Nearest-neighbor contingency table analysis of spatial segregation for several species. Ecoscience, 9(2): 142-151.
Good, B. J. & Whipple, S.A. 1982. Tree spatial patterns: South Carolina bottomland and swamp forest. Bulletin of the Torrey Botanical Club, 109: 529-536.
Jones et al. 1994. Tree population dynamics in seven South Carolina mixed-species forests. Bulletin of the Torrey Botanical Club, 121:360-368.
data(swamp)
plot(swamp$x, swamp$y, col = as.numeric(swamp$sp), pch = 19,
     xlab = "", ylab = "", main = "Swamp forest")