Title: | Interactive Add-on Functionality for 'intamap' |
---|---|
Description: | The methods in this package adds to the functionality of the 'intamap' package, such as bias correction and network optimization. Pebesma et al (2010) gives an overview of the methods behind and possible usage <doi:10.1016/j.cageo.2010.03.019>. |
Authors: | Edzer Pebesma [aut], Jon Skoien [aut, cre], Olivier Baume [ctb], A Chorti [ctb], Dionisis Hristopulos [ctb], Stepahnie Melles [ctb], Giannis Spiliopoulos [ctb] |
Maintainer: | Jon Skoien <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2-6 |
Built: | 2024-12-23 06:45:36 UTC |
Source: | CRAN |
This package provides some added functionality to the
link[intamap]{intamap-package}
for automatic interpolation of environmental
variables. Whereas link[intamap]{intamap-package}
was specifically developed
as a statistical back-end for a Web Processing Service (WPS), this package
offers some functionality that is not possible to access through such a WPS.
The methods in this package can mainly be put into three groups:
methods for estimating and possible correct for biases between measurement networks, due to differences in measurement strategies, measurement devices, or (unknown) post-processing of data
method for segmentation of data, based on their measurement density
methods for optimizing a measurement network (adding or removing observation points), based on different criteria
This function combines segmentation of scattered 2D data and estimation of anisotropy parameters using the CTI method.
anisotropyChoice(object)
anisotropyChoice(object)
object |
An Intamap type object containing one
|
The function AnisotropyChoice
function employs the
doSegmentation
function to
automatically separate the original dataset into clusters based on the sampling density and the spatial
locations of the data (see doSegmentation
for
details). The results of the segmentation procedure and the
anisotropy analysis per cluster are returned in a matrix of
dimension [cl]x5, where [cl] is the number of clusters . Each row of
the matrix contains the cluster index, the anisotropy ratio, the
anisotropy direction, the number of cluster points and the area
inside the convex hull of the cluster. In addition, a single set of
anisotropy parameters is returned in the element anisPar
.
These parameters are calculated using weighted averages of the
covariance Hessian matrix estimates in each cluster. The weights are
based on the area enclosed by the convex hull of each cluster.
object
: A modified Intamap type object is returned,
which contains the results of the anisotropy parameter estimation.
The anisotropy parameters are returned in the element anisPar
as described below.
anisPar |
List element in
|
clusters |
list element added to the original object containing the segmentation results.
|
This function uses the akima
package to perform
"bilinear" and "bicubic" interpolation for the estimation of spatial
derivatives
D.T. Hristopulos, G.Spiliopoulos, A.Chorti
[1] http://www.intamap.org
[2] A. Chorti and D. T. Hristopulos (2008). Non-parametric Identification of Anisotropic (Elliptic) Correlations in Spatially Distributed Data Sets, IEEE Transactions on Signal Processing, 56(10), 4738-4751 (2008).
[3] D. T. Hristopulos, M. P. Petrakis, G. Spiliopoulos, A. Chorti (2009). Non-parametric estimation of geometric anisotropy from environmental sensor network measurements, StatGIS 2009: Geoinformatics for Environmental Surveillance Proceedings (ed. G. Dubois).
library(gstat) data(walker) object=createIntamapObject(observations=walker) object=anisotropyChoice(object) print(summary(object$clusters$index)) print(object$anisPar)
library(gstat) data(walker) object=createIntamapObject(observations=walker) object=anisotropyChoice(object) print(summary(object$clusters$index)) print(object$anisPar)
Identifies and removes biases from measurement networks
biasCorr(object,regCode = "regCode",...)
biasCorr(object,regCode = "regCode",...)
object |
Data frame with observations with same format as |
regCode |
the column name of regions in the data polygons, if existing |
... |
further arguments to the bias correction methods called, see details below |
Many data sets can consist of data coming from a large number of different measurement networks, using different measurement devices or applying different methods for post-processing the observations. Some of these networks can exist in the same area, e.g. when different authorities are measuring the same, but at different locations (one of them in cities, the other one close to lakes), some networks will only exist as neighbouring networks (networks operated by a municipality or a country). Local networks can also be grouped together as one national data-base, which can again be merged into an international data-base.
One challenge with the merging into data-bases is that there will be inconsistencies between measurements in the different networks, which will again cause difficulties when attempting to map the observations, as done in the intamap-package. The intention of this function is therefore to call other functions that are able to identify and remove such differences, which can be referred to as biases between the networks.
There are at the moment two methods available for bias correction, "UK" and "LM".
"UK" is a universal kriging based approach implemented in
findBiasUK
. This method can only deal with biases between
neigbouring networks, but is well capable of taking covariates into account.
"LM" is based on local methods for estimating differences between networks, and
is implemented in findLocalBias
and findRegionalBias
.
The choice between the methods is given by the parameter biasRemovalMethod
in the parameter element of the object, set in getIntamapParams
,
called from createIntamapObject.
The function will remove biases according to the settings of the parameters
removeBias
.
Below is a list of the functions available for bias corrections. See each individual
function for more information about usage.
The universal kriging based function for finding biases between neighbouring networks
Find biases for ovelapping networks
Removes biases between ovelapping networks
Find points that define adjacent boundaries between regions
Find biases for neighbouring networks
Remove biases between neighbouring networks
Data frame with observations, with the identified biases removed.
Jon Olav Skoien
Skoien, J. O., O. P. Baume, E. J. Pebesma, and G. B. M. Heuvelink. 2010. Identifying and removing heterogeneities between monitoring networks. Environmetrics 21(1), 66-84.
data(meuse) data(meuse.grid) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y gridded(meuse.grid) = ~x+y pBoundaries = spsample(observations, 8, "regular",bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 dx = cs/5 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(seq(x1,x2,dx),rep(x2,11),seq(x2,x1,-dx),rep(x1,11)), y=c(rep(y1,11),seq(y1,y2,dx),rep(y2,11),seq(y2,y1,-dx))) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries,regCode = "ID") object = createIntamapObject(observations,meuse.grid,boundaryLines = blines, params = list(removeBias = "regionalBias")) object = biasCorr(object,regCode= "ID") object$regionalBias$regionalBias pBoundaries$bias = NA pBoundaries$bias[object$regionalBias$regionalBias$ID] = object$regionalBias$regionalBias$ols spplot(pBoundaries,"bias",sp.layout = list(list("sp.points",observations)))
data(meuse) data(meuse.grid) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y gridded(meuse.grid) = ~x+y pBoundaries = spsample(observations, 8, "regular",bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 dx = cs/5 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(seq(x1,x2,dx),rep(x2,11),seq(x2,x1,-dx),rep(x1,11)), y=c(rep(y1,11),seq(y1,y2,dx),rep(y2,11),seq(y2,y1,-dx))) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries,regCode = "ID") object = createIntamapObject(observations,meuse.grid,boundaryLines = blines, params = list(removeBias = "regionalBias")) object = biasCorr(object,regCode= "ID") object$regionalBias$regionalBias pBoundaries$bias = NA pBoundaries$bias[object$regionalBias$regionalBias$ID] = object$regionalBias$regionalBias$ols spplot(pBoundaries,"bias",sp.layout = list(list("sp.points",observations)))
Computes mean universal kriging variance (MUKV) for given geostatistical parameters
calculateMukv(observations, predGrid, model, formulaString, fun, ...)
calculateMukv(observations, predGrid, model, formulaString, fun, ...)
observations |
|
predGrid |
|
model |
Variogram model:object of class |
formulaString |
formula that defines the dependent variable as a linear model
of independent variables; suppose the dependent variable has name |
fun |
alternative penalty function, needs to be a function which can take the
same arguments as |
... |
other arguments to be passed on at lower level functions |
This function computes kriging on the predGrid
with
krige
function, and averages the kriging variance (MUKV). With covariates,
the function takes a universal kriging model into account.
MUKV value
S.J. Melles, O. Baume, J. Skoien
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE meuse.grid$soil = factor(meuse.grid$soil) # estimate variogram: smplvarUK = variogram(zinc~dist+ffreq+soil, meuse) plot(smplvarUK) vfitUK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) plot(smplvarUK, vfitUK) calculateMukv(meuse, meuse.grid, vfitUK, zinc~dist+ffreq+soil)
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE meuse.grid$soil = factor(meuse.grid$soil) # estimate variogram: smplvarUK = variogram(zinc~dist+ffreq+soil, meuse) plot(smplvarUK) vfitUK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) plot(smplvarUK, vfitUK) calculateMukv(meuse, meuse.grid, vfitUK, zinc~dist+ffreq+soil)
This function performs segmentation of scattered 2D data based on sampling density and location.
doSegmentation(object)
doSegmentation(object)
object |
An Intamap type object containing the element (list)
|
This function performs segmentation of scattered 2D data
based on sampling density and location. Let us assume that
No
is the number of observation locations. If No
< 200,
then a single cluster is returned.
(1) The segmentation algorithm first removes isolated distant points, if there are any, from the observation locations. Points $(xi,yi)$ are characterized as 'isolated' and 'distant' if they satisfy the following conditions : $abs(xi-mean(x)) > 4 *std(x) or abs(yi-mean(y)) > 4 *std(y)$ and distance from closest neighbor $> sqrt((std(x)/2)^2+(std(y)/2)^2)$. After the first step the size of the original dataset is reduced to N (N= No - isolated points) points.
(2) A sampling density matrix (lattice) consisting of N cells that cover the study area is constructed. Each cell is assigned a density value equal to the number of observation points inside the cell. In addition, each observation point is assigned the sampling density value of the containing cell.
(3) Unsupervised clustering edge detection is used to determine potential cluster perimeters.
(4) Each closed region's perimeter is labeled with a different cluster (segment) number.
(5) All observation points internal to a cluster perimeter are assigned to the specific cluster.
(6) Each cluster that contains fewer than 50 observation points is rejected.
(7) The observation points that have not initially been assigned to a cluster and those belonging to rejected (small) clusters are assigned at this stage. The assignment takes into account both the distance of the points from the centroids of the accepted clusters as well as the mean sampling density of the clusters.
Note: The No
< 200 empirical constraint is used to avoid
extreme situations in which the sampling density is concentrated
inside a few cells of the background lattice, thereby inhibiting the
edge detection algorithm.
A modified Intamap object which additionally includes the
list element clusters
. This element is a list that contains
(i) the indices of removed points from observations
; (ii)
the indices of the clusters to which the remaining observation
points are assigned
and (iii) the number of clusters detected.
clusters |
list element added to the original object containing the segmentation results.
|
A. Chorti, Spiliopoulos Giannis, Hristopulos Dionisis
[1] D. T. Hristopulos, M. P. Petrakis, G. Spiliopoulos, A. Chorti (2009). Non-parametric estimation of geometric anisotropy from environmental sensor network measurements, StatGIS 2009: Geoinformatics for Environmental Surveillance Proceedings (ed. G. Dubois).
library(gstat) data(walker) # coordinates(walker)=~X+Y object=createIntamapObject(observations=walker) object=doSegmentation(object) print(summary(object$clusters$index))
library(gstat) data(walker) # coordinates(walker)=~X+Y object=createIntamapObject(observations=walker) object=doSegmentation(object) print(summary(object$clusters$index))
Method for identifying regional biases (in most cases biases between countries)
findBiasUK(object)
findBiasUK(object)
object |
an object of class |
A data.frame
with the biases for each country with uncertainty.
Olivier Baume
Method for identifying points on the boundaries between regions (in most cases biases between countries)
findBoundaryLines(polygons, projOrig, projNew, regCode = "regCode")
findBoundaryLines(polygons, projOrig, projNew, regCode = "regCode")
polygons |
A |
projOrig |
The original projection of the boundaries |
projNew |
If a different projection is wanted for the output |
regCode |
the column name of regions in the data polygons |
This function finds the points defining the boundary between two polygons and
passes a SpatialPointsDataFrame
with these points back.
The result in mainly used by findRegionalBias
for estimation
of regional biases. The function is based on the boundary between the
polygons being defined by the same points.
A SpatialPointsDataFrame
with points defining the
boundaries between regions.
Jon Olav Skoien
Skoien, J. O., O. P. Baume, E. J. Pebesma, and G. B. M. Heuvelink. 2010. Identifying and removing heterogeneities between monitoring networks. Environmetrics 21(1), 66-84.
data(meuse) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y pBoundaries = spsample(observations, 10, "regular", bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(x1,x2,x2,x1,x1),y=c(y1,y1,y2,y2,y1)) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries, regCode = "ID")
data(meuse) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y pBoundaries = spsample(observations, 10, "regular", bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(x1,x2,x2,x1,x1),y=c(y1,y1,y2,y2,y1)) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries, regCode = "ID")
The function tries to identify differences between different networks of observation stations that share a region. From these differences, biases are estimated, and can be removed.
findLocalBias(object, gid = "group", formulaString = value ~ 1, regCode="regCode",...) removeLocalBias(object, localBias, gid = "group", formulaString = value ~ 1, regCode = "regCode")
findLocalBias(object, gid = "group", formulaString = value ~ 1, regCode="regCode",...) removeLocalBias(object, localBias, gid = "group", formulaString = value ~ 1, regCode = "regCode")
object |
data frame with observations |
gid |
name of column identifying groups of local networks |
formulaString |
formula that defines the dependent variable as a linear model
of independent variables; suppose the dependent variable has name |
regCode |
the column name of regions in the |
localBias |
List of data frames, for a single region, or for
each of the regions, each containing
biases for different networks in the region(s), result of
|
... |
arguments to be passed to sub-functions |
findLocalBias
tries to identify biases between overlapping networks, i.e. when
there is no boundary between different networks sampling the same type of data.
This can typically happen if different governmental bodies are responsible for
different types of measurement, e.g. one measuring the situation around populated
areas, the other one measuring close to water bodies.
The function will then try to find the difference between the different networks, and estimate the individual bias for each network, relative to a reference value, usually the average of all networks. The method is not recommended if there can be assumed to be a dependency beteween the process and the networks.
removeLocalBias
removes the bias estimated in findLocalBias
.
From findLocalBias
: A list consisting of one element for each regional
network, or an element single
if only one regional network is apparent. Each of these elements is again a list
consisting of several other elements, where bias
is the interesting one.
The remaining elements are only necessary for debugging purposes. The elements
D, V and Q refers to the matrices with same names in Skoien et al. (2009), i.e.
the relationship matrix, the variance matrix and the difference matrix.
From removeLocalBias
: A SpatialPointsDataFrame
with the biases subtracted.
Jon Olav Skoien
Skoien, J. O., O. P. Baume, E. J. Pebesma, and G. B. M. Heuvelink. 2010. Identifying and removing heterogeneities between monitoring networks. Environmetrics 21(1), 66-84.
# Assuming that the soil type is the source of biases data(meuse) coordinates(meuse) = ~x+y lb = findLocalBias(meuse,gid = "soil",formulaString=as.formula(zinc~1)) lb$single$bias meuseUnbias = removeLocalBias(meuse,localBias = lb, gid = "soil", formulaString = zinc~1)
# Assuming that the soil type is the source of biases data(meuse) coordinates(meuse) = ~x+y lb = findLocalBias(meuse,gid = "soil",formulaString=as.formula(zinc~1)) lb$single$bias meuseUnbias = removeLocalBias(meuse,localBias = lb, gid = "soil", formulaString = zinc~1)
Method for identifying regional biases (in most cases biases between countries)
findRegionalBias(object,boundaryLines, formulaString = value~1, minKrige = 5, regCode = "regCode", unbias = "default") removeRegionalBias(object, regionalBias, formulaString = value~1, regCode = "regCode")
findRegionalBias(object,boundaryLines, formulaString = value~1, minKrige = 5, regCode = "regCode", unbias = "default") removeRegionalBias(object, regionalBias, formulaString = value~1, regCode = "regCode")
object |
an object of class |
boundaryLines |
|
formulaString |
formula that defines the dependent variable as a linear model
of independent variables; suppose the dependent variable has name |
minKrige |
Setting a minimum number of observations necessary for kriging |
regCode |
the column name of regions in the data polygons, if existing |
unbias |
defines if a particular data dependent function should be used to set unbiasedness constraints for the biases. "default" gives one additional constraint, assuming that the average of the biases should be equal to zero. See also details below. |
regionalBias |
List of data frames, one for each region, each containing biases for different networks in the region. |
This methods attempts to find biases between regional networks that are
separated by a boundary, based on line kriging along these boundaries.
A typical example of such networks would be different national networks,
with the country borders as boundaryLines
, but also other
boundaries can be considered. Further details can be found in Skoien et al. (2009).
The parameter unbias can be used to name the unbiasedness function if the user needs a different unbiasedness constraint than the default one. Such a function (with unbias = "new" above) should be similar to the following:
unBias.new = function(cDiff,uRegCode) { D = cDiff$D Q = cDiff$Q V = cDiff$V # D = rbind(D,0) cd = dim(D)[1] ino = which(uRegCode == "NO") iis = which(uRegCode == "IS") iuk = which(uRegCode == "UK" | uRegCode == "GB") if (length(iis) > 0) { D[cd,ino] = .5 D[cd,iuk] = .5 D[cd,iis]= -1 Q[cd] = 0 V[cd] = max(V) cd = cd+1 D = rbind(D,0) } cd = cd + 1 D = rbind(D,0) D[cd,] = 1 Q[cd] = 0 V[cd] = min(V) cDiff$D = D cDiff$Q = Q cDiff$V = V return(cDiff) }
The last part is similar to unbias.default. In the other part is solving the problem where there are no boundaries between Iceland and any other countries. This would cause a missing constraint when searching for the biases, which will make it impossible to find a solution. The solution here sets the bias for Iceland equal to the average of the bias for Norway and United Kingdom. Note that the real bias for Iceland is not really estimated in this case, this construction is mainly to make sure that the system can be solved. If one were only interested in the bias, it would in this case be better to remove Iceland from the data set, as a real bias is not possible to find.
For findRegionalBias
; a data.frame
with the biases for each country with uncertainty.
For removeRegionalBias
; a data.frame
with observations, with biases removed
Jon Olav Skoien
Skoien, J. O., O. P. Baume, E. J. Pebesma, and G. B. M. Heuvelink. 2010. Identifying and removing heterogeneities between monitoring networks. Environmetrics 21(1), 66-84.
library(intamapInteractive) data(meuse) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y pBoundaries = spsample(observations, 10, "regular",bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(x1,x2,x2,x1,x1),y=c(y1,y1,y2,y2,y1)) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries, regCode = "ID") rb = findRegionalBias(observations, blines, value~1, regCode = "ID") rb$regionalBias obs2 = removeRegionalBias(observations, rb, value~1, regCode = "ID")
library(intamapInteractive) data(meuse) observations = data.frame(x = meuse$x,y = meuse$y,value = log(meuse$zinc)) coordinates(observations) = ~x+y pBoundaries = spsample(observations, 10, "regular",bb = bbox(observations) + matrix(c(-400,-400,400,400),ncol=2),offset=c(0,0)) gridded(pBoundaries) = TRUE cs = pBoundaries@grid@cellsize[1]/2 Srl = list() nb = dim(coordinates(pBoundaries))[1] for (i in 1:nb) { pt1 = coordinates(pBoundaries)[i,] x1 = pt1[1]-cs x2 = pt1[1]+cs y1 = pt1[2]-cs y2 = pt1[2]+cs boun = data.frame(x=c(x1,x2,x2,x1,x1),y=c(y1,y1,y2,y2,y1)) coordinates(boun) = ~x+y boun = Polygon(boun) Srl[[i]] = Polygons(list(boun),ID = as.character(i)) } pBoundaries = SpatialPolygonsDataFrame(SpatialPolygons(Srl), data = data.frame(ID=c(1:nb))) observations$ID = over(observations, geometry(pBoundaries)) blines = findBoundaryLines(pBoundaries, regCode = "ID") rb = findRegionalBias(observations, blines, value~1, regCode = "ID") rb$regionalBias obs2 = removeRegionalBias(observations, rb, value~1, regCode = "ID")
Optimizes the sampling design of observation point locations using a varity of methods including spatial coverage
by k
means (as described in spcosa
) or by maximizing nearest neighbour distances
and spatial simulated annealing (SSA, as described in ssaOptim
) using MUKV as a criterion (calculateMukv
) .
optimizeNetwork(observations, predGrid, candidates, method, action, nDiff, model, criterion = "MUKV", plotOptim = TRUE, nGridCells, nTry, nr_iterations = 10000, formulaString, fun, ...)
optimizeNetwork(observations, predGrid, candidates, method, action, nDiff, model, criterion = "MUKV", plotOptim = TRUE, nGridCells, nTry, nr_iterations = 10000, formulaString, fun, ...)
observations |
object of class |
predGrid |
object of class |
candidates |
when |
method |
|
action |
character string indicating which action to perform:
|
nDiff |
number of stations to add or delete |
model |
variogram model to consider when |
criterion |
Only in use for method |
plotOptim |
logical; if TRUE, creates a plot of the result as optimization progresses; TRUE by default |
nGridCells |
when method is |
nTry |
when method is |
nr_iterations |
number of iterations to process before stoping. The default coolingFactor in |
formulaString |
When |
fun |
Alternative objective function for optimization, the input and output should match
the ones of ( |
... |
other arguments to be passed on to lower level functions |
This function contains different methods to optimally add or remove point locations to or from a measurement network (Baume et al. 2011). Points can be added or deleted in the following ways:
manually
using a spatial coverage approach by k
means to add stations
(as described in spcosa
, Brus et al. 2006)
using a spatial coverage approach by maximizing mean nearest
neighbour distances to remove stations (as described in spCovDel
)
or using spatial simulated annealing
with mean universal kriging variance as a criterion (calculateMukv
,
Brus & Heuvelink 2007, Melles et al. 2011)
The results of different methods can be checked using the function calculateMukv
,
which returns mean universal kriging variance for an optimized network.
The user should be aware of the following limitations:
method = "ssa"
is only implemented for criterion = "mukv"
Input candidates
should preferably be a continuous domain such
as SpatialPolygons
method = "ssa"
with criterion = "mukv"
makes it possible to assume a linear relationship between
independent variables in predGrid and dependent variables at observation locations using
universal kriging (krige
). However, a correct estimate of
mean universal kriging variance requires that the independent
covariate variables be known
at candidate locations. Thus it is necessary to have complete spatial
coverage for all covariate predictors
in the linear model. Covariate information must be available at both
new candidate measurement locations and
prediction locations. This information is acquired (or sampled) from predGrid at
candidate locations during SSA using a call
to over
by default. But see ssaOptim
for more details and an option to interpolate
these values for candidate locations from predGrid.
Note that it is not recommended to use independent variables which differ strongly in magnitude (as for traditional universal kriging)
If no formulaString
is supplied, an ordinary kriging formula is assumed, and
optimization will proceed using mean ordinary kriging variance
Object of class SpatialPoints
* with spatial coordinates
of optimized locations (including observation locations when action = "add"
)
O. Baume, S.J. Melles, J. Skoien
O. P. Baume, A. Gebhardt, C. Gebhardt, G. B. M. Heuvelink, J. Pilz (2011). Network optimization algorithms and scenarios in the context of automatic mapping, Computers and Geosciences, 37: 289-294 (2011).
S. J. Melles, G. B. M. Heuvelink, C. J. W. Twenhofel, U. Stohlker (2011). Optimizing the spatial pattern of networks for monitoring radioactive releases, Computers and Geosciences, 37: 280-288 (2011).
D. J. Brus, G. B. M. Heuvelink (2007). Optimization of sample patterns for universal kriging of environmental variables, Geoderma, 138: 86-95 (2007).
D. J. Brus, J. de Gruijter, J. van Groenigen (2006). Designing spatial coverage samples using the k-means clustering algorithm. In A. McBratney M. Voltz and P. Lagacherie, editor, Digital Soil Mapping: An Introductory Perspective, Developments in Soil Science, vol. 3., Elsevier, Amsterdam.
ssaOptim
, spCovDel
, spCovAdd
, calculateMukv
,
stratify
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE predGrid = meuse.grid # estimate variograms (OK/UK): vfitOK = fit.variogram(variogram(zinc~1, meuse), vgm(1, "Exp", 300, 1)) vfitUK = fit.variogram(variogram(zinc~x+y, meuse), vgm(1, "Exp", 300, 1)) vfitRK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) # study area of interest: bb = bbox(predGrid) boun = SpatialPoints(data.frame(x=c(bb[1,1],bb[1,2],bb[1,2],bb[1,1],bb[1,1]), y=c(bb[2,1],bb[2,1],bb[2,2],bb[2,2],bb[2,1]))) Srl = Polygons(list(Polygon(boun)),ID = as.character(1)) candidates = SpatialPolygonsDataFrame(SpatialPolygons(list(Srl)), data = data.frame(ID=1)) # add 20 more points assuming OK model (SSA method): optimOK <- optimizeNetwork(meuse, meuse.grid, candidates = candidates, method= "ssa", action= "add", nDiff = 20, model = vfitOK, criterion="MUKV", nr_iterations=10000, nmax=40) # add 20 more points assuming UK model (SSA method): optimUK <- optimizeNetwork(meuse, meuse.grid, candidates = candidates, method = "ssa", action = "add", nDiff = 20, model=vfitUK, criterion="MUKV", nr_iterations = 10000, nmax = 40, formulaString = zinc~x+y) # add 20 more points with auxiliary variables (SSA method): optimRK <- optimizeNetwork(meuse, meuse.grid, candidates=candidates, method="ssa", action="add", nDiff=4, model=vfitRK, criterion="MUKV", nr_iterations=10000, formula=zinc~dist+ffreq+soil, nmax=200) # add optimally 20 stations from current network with method "spcov" # (spatial coverage method) optimSC = optimizeNetwork(meuse, meuse.grid, candidates, method = "spcov", action = "add", nDiff = 10, model = model, criterion = "MUKV", plotOptim = TRUE, nGridCells = 10000,nTry = 100 ) # delete optimally 10 stations from current network with method "manual" if (interactive()) optimMAN = optimizeNetwork(meuse, meuse.grid, candidates, method = "manual", action = "del", nDiff = 10, model = model, criterion = "MUKV", plotOptim = TRUE ) # comparison of results with ordinary kriging variogram, otherwise add formulaString # ssa method, assuming ordinary kriging calculateMukv(optimOK, predGrid, vfitOK) # ssa method, using spatial location as covariates calculateMukv(optimUK, predGrid, vfitUK, zinc~x+y) # ssa method, using other variables as covariates calculateMukv(optimRK, predGrid, vfitRK, zinc~dist+ffreq+soil) # spcov method calculateMukv(optimSC, predGrid, vfitOK) # 10 stations manually deleted if (interactive()) calculateMukv(optimMAN, predGrid, vfitOK, zinc~1)
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE predGrid = meuse.grid # estimate variograms (OK/UK): vfitOK = fit.variogram(variogram(zinc~1, meuse), vgm(1, "Exp", 300, 1)) vfitUK = fit.variogram(variogram(zinc~x+y, meuse), vgm(1, "Exp", 300, 1)) vfitRK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) # study area of interest: bb = bbox(predGrid) boun = SpatialPoints(data.frame(x=c(bb[1,1],bb[1,2],bb[1,2],bb[1,1],bb[1,1]), y=c(bb[2,1],bb[2,1],bb[2,2],bb[2,2],bb[2,1]))) Srl = Polygons(list(Polygon(boun)),ID = as.character(1)) candidates = SpatialPolygonsDataFrame(SpatialPolygons(list(Srl)), data = data.frame(ID=1)) # add 20 more points assuming OK model (SSA method): optimOK <- optimizeNetwork(meuse, meuse.grid, candidates = candidates, method= "ssa", action= "add", nDiff = 20, model = vfitOK, criterion="MUKV", nr_iterations=10000, nmax=40) # add 20 more points assuming UK model (SSA method): optimUK <- optimizeNetwork(meuse, meuse.grid, candidates = candidates, method = "ssa", action = "add", nDiff = 20, model=vfitUK, criterion="MUKV", nr_iterations = 10000, nmax = 40, formulaString = zinc~x+y) # add 20 more points with auxiliary variables (SSA method): optimRK <- optimizeNetwork(meuse, meuse.grid, candidates=candidates, method="ssa", action="add", nDiff=4, model=vfitRK, criterion="MUKV", nr_iterations=10000, formula=zinc~dist+ffreq+soil, nmax=200) # add optimally 20 stations from current network with method "spcov" # (spatial coverage method) optimSC = optimizeNetwork(meuse, meuse.grid, candidates, method = "spcov", action = "add", nDiff = 10, model = model, criterion = "MUKV", plotOptim = TRUE, nGridCells = 10000,nTry = 100 ) # delete optimally 10 stations from current network with method "manual" if (interactive()) optimMAN = optimizeNetwork(meuse, meuse.grid, candidates, method = "manual", action = "del", nDiff = 10, model = model, criterion = "MUKV", plotOptim = TRUE ) # comparison of results with ordinary kriging variogram, otherwise add formulaString # ssa method, assuming ordinary kriging calculateMukv(optimOK, predGrid, vfitOK) # ssa method, using spatial location as covariates calculateMukv(optimUK, predGrid, vfitUK, zinc~x+y) # ssa method, using other variables as covariates calculateMukv(optimRK, predGrid, vfitRK, zinc~dist+ffreq+soil) # spcov method calculateMukv(optimSC, predGrid, vfitOK) # 10 stations manually deleted if (interactive()) calculateMukv(optimMAN, predGrid, vfitOK, zinc~1)
This function spCovAdd allows to build optimization scenarios based on spatial coverage method.
spCovAdd( observations, candidates, nDiff, nGridCells, plotOptim = TRUE, nTry, ... )
spCovAdd( observations, candidates, nDiff, nGridCells, plotOptim = TRUE, nTry, ... )
observations |
object of class |
candidates |
a |
nDiff |
number of stations to add or delete |
nGridCells |
number of grid cells to work on spatial coverage strafication |
plotOptim |
logical; to plot the result or not |
nTry |
the method will try |
... |
other arguments to be passed on at lower level functions such as
|
This function allows to build optimization scenarios based on spatial coverage method.
The scenario action is "add". To add new measurement locations to the running network,
the function uses function stratify
from package spcosa
.
Function stratify adds new strata to the domain study.
data.frame
of optimized locations
Olivier Baume
D. J. Brus, J. de Gruijter, J. van Groenigen (2006). Designing spatial coverage samples using the k-means clustering algorithm. In A. McBratney M. Voltz and P. Lagacherie, editor, Digital Soil Mapping: An Introductory Perspective, Developments in Soil Science, vol. 3., Elsevier, Amsterdam.
The function spCovDel allows to build optimization scenarios based on spatial coverage method.
spCovDel(observations, candidates, nDiff, plotOptim = TRUE, ...)
spCovDel(observations, candidates, nDiff, plotOptim = TRUE, ...)
observations |
object of class |
candidates |
not compulsory used only for plotting purpose – a
|
nDiff |
number of stations to add or delete |
plotOptim |
logical; to plot the result or not |
... |
other arguments to be passed on at lower level functions such as
|
This function allows to build optimization scenarios based on spatial coverage method.
When action is "del", the function maximizes the mean distance of measurements with direct neighbours using function
nndist
. The heuristic search uses a
swapping algorithm to converge more rapidly to the best solution.
data.frame
of optimized locations
Olivier Baume
Spatial simulated annealing uses slight perturbations of previous sampling designs and a random search technique to solve spatial optimization problems.
Candidate measurement locations are iteratively moved around and optimized by minimizing the mean universal kriging variance (calculateMukv
). The approach relies on a known, pre-specified
model for underlying spatial variation (variogramModel
).
ssaOptim(observations, predGrid, candidates, action, nDiff, model, nr_iterations, plotOptim = TRUE, formulaString = NULL, coolingFactor = nr_iterations/10, covariates = "over", fun, ...)
ssaOptim(observations, predGrid, candidates, action, nDiff, model, nr_iterations, plotOptim = TRUE, formulaString = NULL, coolingFactor = nr_iterations/10, covariates = "over", fun, ...)
observations |
object of class |
predGrid |
object of class |
candidates |
candidates is the study area of class
|
action |
character string indicating which type of action to perform:
|
nDiff |
number of stations to add or delete |
model |
variogram model:object of class |
nr_iterations |
number of iterations to process before stopping. The default
|
plotOptim |
logical; if TRUE, creates a plot of the result as optimization progresses; TRUE by default |
formulaString |
formula that defines the dependent variable as a linear model
of independent variables; suppose the dependent variable has name |
coolingFactor |
variable defining how fast the algorithm will cool down. With SSA,
worsening designs are accepted with a decreasing probability (generally set to
|
covariates |
character string defining whether possible covariates should be found by "over" or "krige", see also details below |
fun |
Alternative objective function for optimization, the input and output should match
the ones of ( |
... |
other arguments to be passed on to lower level functions |
The default version of this function applies spatial simulated annealing for
optimization with the MUKV criterion (calculateMukv
).
With covariates, the function takes a universal kriging model into account.
When optimizing a sampling design using SSA and criterion = "mukv"
,
measurement values at new sampling locations are not required in order to
calculate prediction error variance (criterion = "mukv"
).
This attractive property allows one to estimate the kriging prediction error
variance prior to collecting the data (i.e., the dependent variable
is unknown at new candidate locations), and it is this property that is used in
the SSA optimization procedure after (Brus & Heuvelink 2007, Melles et al. 2011).
A stopping criterion countMax
is implemented in lower level functions to
end the optimization procedure after 200 search steps
have occurred without an improvement in the design. If this stopping criterion
is reached before nr_iterations
, SSA will terminate.
method = "ssa"
with criterion = "mukv"
makes it possible to assume
a linear relationship between independent variables in predGrid
and dependent variables at observation locations using universal kriging
(krige
). However, a correct estimate of mean universal
kriging variance requires that the independent
covariate variables be
known at candidate locations. Thus it is necessary to have complete spatial
coverage for all covariate predictors in the linear model. Covariate information
must be available at both new candidate measurement locations and prediction locations.
This is not possible (except for the measurement locations themselves).
Instead, these are estimated from the prediction locations.
There are two possible methods to attain information on covariates at the
candidate locations, and the method can be set using the argument covariates
:
over
and krige
. over
finds the value of
covariates at new locations by overlaying candidate locations on the prediction grid
and taking the value of the nearest neighbour. The second method uses kriging to
estimate covariate values at new locations from predGrid. The first method is
generally faster, the second method is most likely more exact, particularly if
the resolution of predGrid is low in relation to the spatial correlation lengths of the covariates.
Both methods are approximations that may influence the criterion used for
optimization with increasing numbers of points added.
It is possible to submit an alternative function fun
as objective function.
This function should take at least the observation locations and the predGrid as input, and
return a value which should be minimized. See also calculateMukv
for more information
about arguments to this function.
SpatialPointsDataFrame with optimized locations
O. Baume, S.J. Melles, J. Skoien
D. J. Brus, G. B. M. Heuvelink (2007). Optimization of sample patterns for universal kriging of environmental variables, Geoderma, 138: 86-95 (2007).
S. J. Melles, G. B. M. Heuvelink, C. J. W. Twenhofel, U. Stohlker (2011). Optimizing the spatial pattern of networks for monitoring radioactive releases, Computers and Geosciences, 37: 280-288 (2011).
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE predGrid = meuse.grid # estimate variograms (OK/UK): vfitOK = fit.variogram(variogram(zinc~1, meuse), vgm(1, "Exp", 300, 1)) vfitUK = fit.variogram(variogram(zinc~x+y, meuse), vgm(1, "Exp", 300, 1)) vfitRK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) # study area of interest: bb = bbox(predGrid) boun = SpatialPoints(data.frame(x=c(bb[1,1],bb[1,2],bb[1,2],bb[1,1],bb[1,1]), y=c(bb[2,1],bb[2,1],bb[2,2],bb[2,2],bb[2,1]))) Srl = Polygons(list(Polygon(boun)),ID = as.character(1)) candidates = SpatialPolygonsDataFrame(SpatialPolygons(list(Srl)), data = data.frame(ID=1)) # add 20 more points assuming OK model (SSA method): optimOK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitOK, nr_iterations = 10000, formulaString = zinc~1, nmax = 40, countMax = 200) # add 20 more points assuming UK model (SSA method): optimUK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitUK, nr_iterations = 10000, formulaString = zinc~x+y, nmax = 40, countMax = 200) # add 20 more points with auxiliary variables (SSA method): optimRK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitRK, nr_iterations = 10000, formulaString = zinc~dist+ffreq+soil, nmax = 40, countMax = 200)
# load data: library(gstat) data(meuse) coordinates(meuse) = ~x+y data(meuse.grid) coordinates(meuse.grid) = ~x+y gridded(meuse.grid) = TRUE predGrid = meuse.grid # estimate variograms (OK/UK): vfitOK = fit.variogram(variogram(zinc~1, meuse), vgm(1, "Exp", 300, 1)) vfitUK = fit.variogram(variogram(zinc~x+y, meuse), vgm(1, "Exp", 300, 1)) vfitRK = fit.variogram(variogram(zinc~dist+ffreq+soil, meuse), vgm(1, "Exp", 300, 1)) # study area of interest: bb = bbox(predGrid) boun = SpatialPoints(data.frame(x=c(bb[1,1],bb[1,2],bb[1,2],bb[1,1],bb[1,1]), y=c(bb[2,1],bb[2,1],bb[2,2],bb[2,2],bb[2,1]))) Srl = Polygons(list(Polygon(boun)),ID = as.character(1)) candidates = SpatialPolygonsDataFrame(SpatialPolygons(list(Srl)), data = data.frame(ID=1)) # add 20 more points assuming OK model (SSA method): optimOK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitOK, nr_iterations = 10000, formulaString = zinc~1, nmax = 40, countMax = 200) # add 20 more points assuming UK model (SSA method): optimUK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitUK, nr_iterations = 10000, formulaString = zinc~x+y, nmax = 40, countMax = 200) # add 20 more points with auxiliary variables (SSA method): optimRK <- ssaOptim(meuse, meuse.grid, candidates = candidates, covariates = "over", nDiff = 20, action = "add", model = vfitRK, nr_iterations = 10000, formulaString = zinc~dist+ffreq+soil, nmax = 40, countMax = 200)