Title: | Select Point Pattern Models Based on Minimum Contrast, AIC and Goodness of Fit |
---|---|
Description: | Fit and selects point pattern models based on minimum contrast, AIC and and goodness of fit. |
Authors: | Marcelino de la Cruz |
Maintainer: | Marcelino de la Cruz <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.6 |
Built: | 2024-10-31 06:50:50 UTC |
Source: | CRAN |
Performs the Loosmore and Ford (2006) test or the Maximum Absolute Deviation test for a spatial point pattern.
LF.gof(X, rmin=NULL, rmax=NULL, na.rm=TRUE)
LF.gof(X, rmin=NULL, rmax=NULL, na.rm=TRUE)
X |
An object resulting from the function |
rmin |
Minimum value of the function argument r over which the maximum absolute deviation, or the integral, will be computed for the test. |
rmax |
Maximum value of the function argument r over which the maximum absolute deviation, or the integral, will be computed for the test. |
na.rm |
Should NA's be removed to compute the integral? |
These function perform a tests for goodness-of-fit of a point pattern dataset to a point process model, based on Monte Carlo simulation from the model. The simulations should have been previously computed with the function envelope
, applied with the argument savefuns=TRUE
in order to save all the simulated functions, required for the computation of the test.
The test, popularized in the ecological field by Loosmore and Ford (2006) is also described in Diggle (2003, page 14), and according to Baddeley and Turner (2005) also in Diggle (1986) and Cressie (1991, page 667, equation (8.5.42)).
If the arguments rmin
and rmax
are set to NULL
, the integral of the GoF statistics will be computed over the complete range of r values.
A list with the following components:
The GoF statistic, i.e., the value of the integral over the range of r's
The p-value of the test
Number of NA
values for each r. It helps to evaluate the reliability of the computed u's, specially for small r's
Marcelino de la Cruz [email protected]
Cressie, N.A.C. (1991) Statistics for spatial data. John Wiley and Sons, 1991.
Diggle, P. J. (1986). Displaced amacrine cells in the retina of a rabbit : analysis of a bivariate spatial point pattern. J. Neuroscience Methods 18, 115-125.
Diggle, P.J. (2003) Statistical analysis of spatial point patterns, Second edition. Arnold.
Loosmore, N.B. and Ford, E.D. (2006) Statistical inference using the G or K point pattern spatial statistics. Ecology 87, 1925-1931.
dclf.test
for an alternative implementation of the test in spatstat.
Fits Poisson, Poisson cluster, several inhomogeneous Poisson and several inhomogeneous Poisson cluster processes to a spatial point pattern and select the best fitting based on goodness of fit.
select.model.gof(pp, sigmas, r, nlarge = 10000, q = 1/4, p = 2, correction = "trans", sigma2=NULL, rho=NULL, lower=NULL, upper=NULL, parscale=c(1,1), dimyx=c(128,128), nsim=99, seed=1, correct.lambda=10) ## S3 method for class 'selectedmodgof' plot(x,...) ## S3 method for class 'selectedmodgof' print(x,...) ## S3 method for class 'selectedmodgof' envelope(Y,fun=NULL,nrank=1,nsim=99,dimyx=c(128,128),...) ## S3 method for class 'selectedmodgof' simulate(object, nsim=99, seed=1, dimyx=c(128,128),...)
select.model.gof(pp, sigmas, r, nlarge = 10000, q = 1/4, p = 2, correction = "trans", sigma2=NULL, rho=NULL, lower=NULL, upper=NULL, parscale=c(1,1), dimyx=c(128,128), nsim=99, seed=1, correct.lambda=10) ## S3 method for class 'selectedmodgof' plot(x,...) ## S3 method for class 'selectedmodgof' print(x,...) ## S3 method for class 'selectedmodgof' envelope(Y,fun=NULL,nrank=1,nsim=99,dimyx=c(128,128),...) ## S3 method for class 'selectedmodgof' simulate(object, nsim=99, seed=1, dimyx=c(128,128),...)
pp |
Unmarked point pattern with the ppp format of spatstat |
sigmas |
Vector with the sigma values (standard deviations of the Gaussian kernel) for estimating different intensity surfaces with density.ppp. |
r |
Vector of values for the argument r at which the (in)homogeneous K function should be evaluated. First element should be 0. |
nlarge |
A number of points. In spatstat, if the number of points exceeds nlarge, then only the border correction will be computed (by default) for K(r). If you have a large number of points (n) and you want your border correction to be applied, set nlarge > n. |
q |
q parameter of the dtheta (i.e., minimum contrast) function. |
p |
p parameter of the dtheta (i.e., minimum contrast) function. |
correction |
Any selection of the options "border", "bord.modif", "isotropic", "Ripley", "translate", "translation", "none" or "best". It specifies the edge correction(s) to be applied. |
sigma2 |
Starting value in the optimization for the squared standard deviation of the Gaussian dispersion around parent points in the in(homogeneous) Poisson cluster process. |
rho |
Starting value in the optimization for the intensity of parent points in the in(homogeneous) Poisson cluster process. |
lower |
Vector of length two with the lowest allowed values for sigma2 and rho. |
upper |
Vector of length two with the largest allowed values for sigma2 and rho. |
parscale |
Initial values for sigma2 and rho for one of the optimization approaches . |
dimyx |
Pixel array dimensions to estimate the intensity of the point pattern when simulating from inhomogeneous models. |
nsim |
Number of simulated point patterns to be generated for the estimation of the GoF of each fitted model, or when using simulate.selectedmodgof or envelope.selectedmodgof. |
seed |
A single value to set the random generator. |
correct.lambda |
Fraction of the lowest positive intensity value that will be employed to replace intensity estimates == 0. |
... |
Additional arguments passed to the plot, print, envelope and simulate methods. |
x |
An object of class "selectedmodgof", i.e., the result of using function select.model.gof(). |
Y |
An object of class "selectedmodgof", i.e., the result of using function select.model.gof(). |
object |
An object of class "selectedmodgof", i.e., the result of using function select.model.gof(). |
fun |
Function to compute the desired summary statistic for the point pattern. |
nrank |
Integer. Rank of the envelope value amongst the nsim simulated values. A rank of 1 means that the minimum and maximum simulated values will be used. |
select.model.gof is a wrap to fit and select different point processes using standard tools in spatstat and in ecespa. The basic framework consists in choosing among the provided sigma values the bandwith which produces the best fitting of an inhomogeneous Poisson and the one which produces the best fitting of an inhomogeneous Poisson cluster process. The goodness of fit (GoF) of these models are compared to the GoF a homogeneous Poisson process and tho the GoF of a homogeneous Poisson cluster process, and the model with the best fit is returned as the final result. To avoid optimization problems (i.e., obtaining non-realistic parameters for the Poisson cluster models), arguments lower and upper alllow restricting the range of values that these parameters can attain. If these ranges are not set by the user, select.model.gof will select by default the most extreme and sensible values (e.g., from just one cluster to as many clusters as points in the pattern). As the experience shows that different optimization algorithms provide different results, the basic framework is repeated using both the "L-BFGS-B" and the "Nelder-Mead" algorithms of optim. In adition to the initial parameters provided with the argument parscale, an aditional otimization using as parscale the maximum and minimum values of the lower and upper parameters range is also tried with the "L-BFGS-B" algorithm.
The goodness of fit is based on the K function (estimated by Kest or Kinhom) and is evaluated following the approach of Loosmore and Ford (2006) implemented in LF.gof. This implies computing the sum of squared differences between the observed K function and the mean of the K functions of nsim simulations from the fitted models (the u statistic), which means lots of computations and makes the process time comsuming.
Whereas the model selection approach based on AIC implemented in select.model2 is more appropriated for inference purposes, the approach followed by select.model.gof in general is able to select models whose simulations resemble more closely the original pattern.
select.model.gof returns an object of class "selectedmodgof",i.e., a list with components:
gof.u |
vector with the u statistic of each model. |
best.gof |
the minimum u value. |
best.model |
The best of the fitted models. |
models |
vector with the names of the fitted models. |
gof |
A list the results of the LF.gof test applied to the selected model, i.e., the u value and the associated p-value of the GoF test. |
envelopes |
An envelope object with envelopes computed form the best model. |
pp |
The analyzed point pattern, with the ppp format of spatstat. |
best.sigma |
The value of the selected bandwith. |
envelope.selectedmod returns a an object of class "fv", see fv.object, which can be printed and plotted directly. Essentially a data frame containing columns
r |
the vector of values of the argument r at which the summary function fun has been estimated. |
obs |
values of the summary function for the data point pattern. |
lo |
lower envelope of simulations. |
hi |
upper envelope of simulations. |
mmean |
estimated theoretical value of the summary function, computed by averaging simulated values. |
simulate.selectedmod returns a list of nsim point patterns (with the ppp format of spatstat, simulated according to the best fitted model.
Marcelino de la Cruz
Chacon-Labella, J., De la Cruz, M. and Escudero, A. Beyond the classical nurse species effect: diversity assembly in a Mediterranean semiarid dwarf shrubland. Journal of Vegetaation Science. Accepted for publication, July 2015.
Loosmore, N.B. and Ford, E.D. (2006) Statistical inference using the G or K point pattern spatial statistics. Ecology 87, 1925-1931
## Not run: # Get the data data(teucrium) # Define the sequence of r's at which estimate K(r) r<-seq(0,1.5, by=0.01) # Define different standard deviations for the Gaussian kernel # to estimate different intensity surfaces sigmas <- seq(0.5, 3.5, by=0.25) # Fit 28 models (1 Poisson, 1 Poisson cluster, 13 inhomogeneous Poisson # and 13 inhomogeneous Poisson cluster) to teucrium and select the better ones teucrium.model <- select.model.gof(teucrium, sigmas=sigmas, r=r) teucrium.model # Show the empirical K function, # and the envelopes based on this model plot( teucrium.model, sqrt(./pi)-r~r, legend=F, ylab="L(r)", las=1) # Compute and plot envelopes for the pcf function according to the best fitted model. teucrium.env <- envelope(teucrium.model, fun=pcf, nsim=19) plot(teucrium.env, legend=F) # simulate 10 point patterns according to the best fitted model teucrium.simu <- simulate(teucrium.model, nsim=10) teucrium.simu ## End(Not run)
## Not run: # Get the data data(teucrium) # Define the sequence of r's at which estimate K(r) r<-seq(0,1.5, by=0.01) # Define different standard deviations for the Gaussian kernel # to estimate different intensity surfaces sigmas <- seq(0.5, 3.5, by=0.25) # Fit 28 models (1 Poisson, 1 Poisson cluster, 13 inhomogeneous Poisson # and 13 inhomogeneous Poisson cluster) to teucrium and select the better ones teucrium.model <- select.model.gof(teucrium, sigmas=sigmas, r=r) teucrium.model # Show the empirical K function, # and the envelopes based on this model plot( teucrium.model, sqrt(./pi)-r~r, legend=F, ylab="L(r)", las=1) # Compute and plot envelopes for the pcf function according to the best fitted model. teucrium.env <- envelope(teucrium.model, fun=pcf, nsim=19) plot(teucrium.env, legend=F) # simulate 10 point patterns according to the best fitted model teucrium.simu <- simulate(teucrium.model, nsim=10) teucrium.simu ## End(Not run)
Fits Poisson, Poisson cluster, several inhomogeneous Poisson and several inhomogeneous Poisson cluster processes to a spatial point pattern and select the best fitting based on AIC.
select.model2(pp, sigmas, r, nlarge = 10000, q = 1/4, p = 2, correction = "trans") ipc.estK2(mippp, lambda = NULL, correction = "iso", r = NULL, sigma2 = NULL, rho = NULL, q = 1/4, p = 2, nlarge = NULL, ...) aic.function(r, dtheta, npar) ## S3 method for class 'selectedmod' envelope(Y,fun=NULL,nrank=1,nsim=99,dimyx=c(128,128),...) ## S3 method for class 'selectedmod' simulate(object,nsim=99, seed=1,dimyx=c(128,128),...) ## S3 method for class 'selectedmod' plot(x,...) ## S3 method for class 'selectedmod' print(x,...)
select.model2(pp, sigmas, r, nlarge = 10000, q = 1/4, p = 2, correction = "trans") ipc.estK2(mippp, lambda = NULL, correction = "iso", r = NULL, sigma2 = NULL, rho = NULL, q = 1/4, p = 2, nlarge = NULL, ...) aic.function(r, dtheta, npar) ## S3 method for class 'selectedmod' envelope(Y,fun=NULL,nrank=1,nsim=99,dimyx=c(128,128),...) ## S3 method for class 'selectedmod' simulate(object,nsim=99, seed=1,dimyx=c(128,128),...) ## S3 method for class 'selectedmod' plot(x,...) ## S3 method for class 'selectedmod' print(x,...)
pp |
Unmarked point pattern with the ppp format of spatstat |
sigmas |
Vector with the sigma values (standard deviations of the Gaussian kernel) for estimating different intensity surfaces with density.ppp. |
r |
Vector of values for the argument r at which the (in)homogeneous K function should be evaluated. First element should be 0. |
nlarge |
A number of points. In spatstat, if the number of points exceeds nlarge, then only the border correction will be computed (by default) for K(r). If you have a large number of points (n) and you want your border correction to be applied, set nlarge > n. |
q |
q parameter of the dtheta (i.e., minimum contrast) function. |
p |
p parameter of the dtheta (i.e., minimum contrast) function. |
correction |
Any selection of the options "border", "bord.modif", "isotropic", "Ripley", "translate", "translation", "none" or "best". It specifies the edge correction(s) to be applied. |
mippp |
Unmarked point pattern with the ppp format of spatstat |
lambda |
Optional. Values of the estimated intensity function. Either a vector giving the intensity values at the points of the pattern mippp, a pixel image (object of class "im") giving the intensity values at all locations, a fitted point process model (object of class "ppm" or "kppm") or a function(x,y) which can be evaluated to give the intensity value at any location. |
sigma2 |
Starting value in the optimization for the squared standard deviation of the Gaussian dispersion around parent points in the in(homogeneous) Poisson cluster process. |
rho |
Starting value in the optimization for the intensity of parent points in the in(homogeneous) Poisson cluster process. |
... |
Additional arguments passed to the optim function (or to the plot, print, envelope and simulate methods). |
dtheta |
Minimum contrast discrepancy, i.e., sum of squared diferences between the normalized empirical and theoretical K functions. |
npar |
Number of parameters fitted in the model. |
x |
An object of class "selectedmod", i.e., the result of using function select.model2(). |
Y |
An object of class "selectedmod", i.e., the result of using function select.model2(). |
object |
An object of class "selectedmod", i.e., the result of using function select.model2(). |
fun |
Function to compute the desired summary statistic for the point pattern. |
nsim |
Number of simulated point patterns to be generated with simulate.selectedmod or when computing the envelopes with envelope.selectedmod. |
seed |
A single value to set the random generator. |
nrank |
Integer. Rank of the envelope value amongst the nsim simulated values. A rank of 1 means that the minimum and maximum simulated values will be used. |
dimyx |
Pixel array dimensions to estimate the intensity of the point pattern when simulating from inhomogeneous models. |
select.model2 is a wrap to fit and select different point processes using standard tools in of spatstat and of ecespa. ipc.estK2 fits (in)homogeneous models as the function ipc.estK of ecespa but, in addition, it allows seting the argument nlarge and passing options to optim. AIC calculation (actually AICc) is made by aic.function. See more details in Jara et al. (2015).
select.model2 returns an object of class "selectedmod
",i.e., a list with components:
vector with the minimized discrepancy values for each fitted model.
the minimum of the minimized discrepancy values.
The best of the fitted models.
vector with the names of the fitted models
A list with the intensity objects employed to fit each inhomogeneous model
vector with the sigma values (standard deviations of the Gaussian kernel) employed to estimate the different intensity surfaces.
vector with the AIC values of each model.
data.frame with the empirical K functios employed to fit each model.
analyzed point pattern, with the ppp format of spatstat.
ipc.estK
gives an object of class 'ecespa.minconfit
', basically a list with the following components:
Parameter .
Parameter .
Minimized value of the contrast criterion .
Values of the observed K-function.
Values of the fitted K-function.
Sequence of distances at which Kobs
and Kfit
have been estimated.
Original point pattern.
Original intensity object.
Name of the original point pattern.
Name of the original intensity object.
exponent of the contrast criterion.
exponent of the contrast criterion.
aic.function
returns a one-row data.frame, with the following items:
number of observations employed to compute AIC; i.e., number of r values where K(r) was estimated.
number of parameters of the model.
"Resiidual Sum of Squares". It is the value of the discrepacy function Dtheta.
Loglikelihood.
AIC value.
Small sample AIC value.
envelope.selectedmod
returns a an object of class "fv", see fv.object, which can be printed and plotted directly. Essentially a data frame containing columns
the vector of values of the argument r at which the summary function fun has been estimated.
values of the summary function for the data point pattern.
lower envelope of simulations.
upper envelope of simulations.
estimated theoretical value of the summary function, computed by averaging simulated values.
simulate.selectedmod
returns a list of nsim point patterns (with the ppp format of spatstat, simulated according to the best fitted model.
Marcelino de la Cruz
Jara, A., De la Cruz, M., Espinosa, C.I., Mendez, M. & Escudero, A. (2015). Does spatial heterogeneity blur the signature of dispersal syndromes on spatial patterns of woody species? A test in a tropical dry forest. Oikos. http://dx.doi.org/10.1111/oik.02098.
## Not run: # Get the data data(lansing) # Split the multivariate pp in their individual components lansing.sp<-split(lansing) # Define the sequence of r's at which estimate K(r) r<- seq(0,0.25,le=101) # Define different standard deviations for the Gaussian kernel # to estimate different intensity surfaces sigmas<- seq(0.1,1,by=0.05) # Note that lansing is defined in a (0,1) x (0,1) window and this affects # the election of r and sigma values # Fit 40 models (1 Poisson, 1 Poisson cluster, 19 inhomogeneous Poisson # and 19 inhomogeneous Poisson cluster) to maple and select the better ones maple.model <- select.model2(lansing.sp$maple, sigmas=sigmas, r=r) # show the AICc value and the fitted parameters for the best model in each class maple.model # Draw the empirical and theoretical models to visually asses the fitting. # P = Poisson; HPP= heterogeneous (i.e. inhomogeneous) Poisson; # PC = Poisson cluster; HPC=heterogeneous (i.e. inhomogeneous) Poisson cluster plot(maple.model) # Compute and plot envelopes for the K function according to the best fitted model. maple.env <- envelope(maple.model, nsim=19) plot(maple.env, sqrt(./pi)-r~r, legend=F) # simulate 10 point patterns according to the best fitted model maple.simu <- simulate(maple.model, nsim=10) maple.simu # FIt and select models to all species lansing.models<-lapply(lansing.sp, function(x) select.model2(x, sigmas=sigmas, r=r)) lapply(lansing.models, function(x) x) ## End(Not run)
## Not run: # Get the data data(lansing) # Split the multivariate pp in their individual components lansing.sp<-split(lansing) # Define the sequence of r's at which estimate K(r) r<- seq(0,0.25,le=101) # Define different standard deviations for the Gaussian kernel # to estimate different intensity surfaces sigmas<- seq(0.1,1,by=0.05) # Note that lansing is defined in a (0,1) x (0,1) window and this affects # the election of r and sigma values # Fit 40 models (1 Poisson, 1 Poisson cluster, 19 inhomogeneous Poisson # and 19 inhomogeneous Poisson cluster) to maple and select the better ones maple.model <- select.model2(lansing.sp$maple, sigmas=sigmas, r=r) # show the AICc value and the fitted parameters for the best model in each class maple.model # Draw the empirical and theoretical models to visually asses the fitting. # P = Poisson; HPP= heterogeneous (i.e. inhomogeneous) Poisson; # PC = Poisson cluster; HPC=heterogeneous (i.e. inhomogeneous) Poisson cluster plot(maple.model) # Compute and plot envelopes for the K function according to the best fitted model. maple.env <- envelope(maple.model, nsim=19) plot(maple.env, sqrt(./pi)-r~r, legend=F) # simulate 10 point patterns according to the best fitted model maple.simu <- simulate(maple.model, nsim=10) maple.simu # FIt and select models to all species lansing.models<-lapply(lansing.sp, function(x) select.model2(x, sigmas=sigmas, r=r)) lapply(lansing.models, function(x) x) ## End(Not run)
Locations of Teucrium capitatum plants in a dwarf-shrub community in Central Spain. They are part of a more extensive dataset collected and analysed by Chacon-Labella et al. (2015). The coordinates of the plants are given in meters.
data(teucrium)
data(teucrium)
An object of class "ppp" representing the point pattern of tree locations. See ppp.object for details of the format of a ppp object.
Chacon-Labella, J., De la Cruz, M. and Escudero, A. Beyond the classical nurse species effect: diversity assembly in a Mediterranean semiarid dwarf shrubland. Journal of Vegetaation Science. Accepted for publication, July 2015.
data(teucrium) plot(teucrium)
data(teucrium) plot(teucrium)