| Title: | Sampling Design and Estimation Methods for Natural Resource Management |
|---|---|
| Description: | Provides functions for probability and non-probability sampling design, sample selection, and population estimation tailored to natural resource management. Probability methods include simple random sampling, stratified sampling, systematic sampling, cluster sampling, and probability-proportional-to-size sampling. Non-probability methods include convenience, judgement-based, and quota sampling. Estimation functions cover means, totals, ratio estimators, regression estimators, and the Horvitz-Thompson estimator (Horvitz and Thompson 1952, <doi:10.2307/2280784>) for unequal-probability designs. Utilities support biomass, soil-loss, and carbon-stock estimation from field plots. Spatial extensions provide random, systematic, stratified, and raster-weighted sampling within geographic polygons using the 'sf' and 'terra' packages, with extraction of remote-sensing covariates at sample locations. Applications include forest inventory, soil erosion monitoring, watershed studies, and ecological field surveys. |
| Authors: | Sadikul Islam [aut, cre] (ORCID: <https://orcid.org/0000-0003-2924-7122>) |
| Maintainer: | Sadikul Islam <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.2 |
| Built: | 2026-05-22 09:04:05 UTC |
| Source: | https://github.com/cran/NRMSampling |
Provides functions for probability and non-probability sampling design, sample selection, and estimation tailored to natural resource management.
Probability sampling: srs_sample, stratified_sample, systematic_sample, cluster_sample, pps_sample
Non-probability sampling: convenience_sample, purposive_sample, quota_sample
Estimation: estimate_mean, estimate_total, estimate_variance, estimate_se, estimate_ci, ratio_estimator, regression_estimator, ht_estimator, ht_variance, stratified_estimator
Natural resource utilities: biomass_estimate, soil_loss_estimate, carbon_stock_estimate, plot_summary, sampling_efficiency
Spatial: to_sf_points, spatial_random_sample, spatial_systematic_sample, spatial_stratified_sample, spatial_cluster_sample, raster_stratified_sample, raster_pps_sample, extract_raster_values, spatial_biomass_estimate, plot_sampling, plot_sampling_gg
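sample() and mean() are base:: functions and must NOT be listed in
importFrom(stats, ...); they are always available without any import
declaration. runif() and setNames() belong to stats:: and therefore do need
importFrom(stats, runif, setNames) or an explicit stats:: prefix.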
Cochran, W.G. (1977). Sampling Techniques, 3rd ed. Wiley.
Lohr, S.L. (2022). Sampling: Design and Analysis, 3rd ed. CRC Press.
Estimates total standing biomass over a landscape or management unit from plot-level measurements.
biomass_estimate(df, biomass_var, area)
df |
A data frame of sampled plots. |
biomass_var |
Character. Name of the column containing plot-level biomass density (e.g., Mg/ha or kg/plot). |
area |
Numeric. Total area of the management unit (same units as
the denominator of |
A named list with elements:
mean_biomass: Mean biomass density across sampled plots.
total_biomass: Estimated total biomass over the study area.
se: Standard error of the mean biomass density.
n: Number of plots used.
Avery, T.E. and Burkhart, H.E. (2002). Forest Measurements, 5th ed. McGraw-Hill, New York.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
biomass_estimate(srs, biomass_var = "biomass", area = 1000)
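As a rough illustration of the arithmetic described above, the base-R sketch below expands a mean plot density to a landscape total. The name `biomass_sketch` is hypothetical, and the standard error is assumed to be the plain `sd / sqrt(n)` with no finite-population correction; the package's internals may differ.

```r
# Hypothetical base-R sketch; not the package's implementation.
biomass_sketch <- function(df, biomass_var, area) {
  y <- df[[biomass_var]]
  y <- y[!is.na(y)]                    # drop plots with missing biomass
  n <- length(y)
  mean_biomass <- mean(y)              # mean density, e.g. Mg/ha
  list(
    mean_biomass  = mean_biomass,
    total_biomass = mean_biomass * area,  # density x area (area in ha here)
    se            = sd(y) / sqrt(n),      # assumption: no fpc applied
    n             = n
  )
}

plots <- data.frame(biomass = c(10, 20, 30, 40))
res <- biomass_sketch(plots, "biomass", area = 1000)
res$total_biomass
```

The total is only meaningful when `area` uses the same unit as the denominator of the density column.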
Converts a biomass estimate to carbon stock using a biomass-to-carbon conversion factor.
carbon_stock_estimate(df, biomass_var, area, carbon_fraction = 0.47)
df |
A data frame of sampled plots. |
biomass_var |
Character. Name of the biomass density column. |
area |
Numeric. Total study area. |
carbon_fraction |
Numeric. Fraction of biomass that is carbon. Default 0.47 (IPCC default for tropical forests). |
A named list with elements total_biomass,
total_carbon, carbon_fraction, and n.
IPCC (2006). IPCC Guidelines for National Greenhouse Gas Inventories, Volume 4: Agriculture, Forestry and Other Land Use. IGES, Hayama, Japan.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
carbon_stock_estimate(srs, biomass_var = "biomass", area = 1000)
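Read together with the arguments above, the conversion appears to be a single multiplication. Assuming total biomass is the mean plot density times `area` (the symbols below are introduced here for illustration and are not taken from the package):

```latex
\hat{B} = \bar{b}\,A, \qquad \hat{C} = f_c\,\hat{B}
```

Here $\bar{b}$ is the mean of `biomass_var`, $A$ is `area`, and $f_c$ is `carbon_fraction`. With the default $f_c = 0.47$, an estimated 25 000 Mg of biomass corresponds to 11 750 Mg C.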
Performs single-stage cluster sampling.
cluster_sample(data, cluster_var, n_clusters)
data |
A data frame. |
cluster_var |
Character. Cluster column. |
n_clusters |
Integer. Number of clusters. |
A data frame of sampled clusters.
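A minimal sketch of single-stage cluster sampling in base R may help make the behaviour concrete: clusters are drawn at random and every row in a drawn cluster is kept. The helper `cluster_sketch` is hypothetical, and equal-probability selection without replacement is an assumption rather than something the documentation states.

```r
# Hypothetical sketch; assumes equal-probability selection of clusters.
cluster_sketch <- function(data, cluster_var, n_clusters) {
  ids <- unique(data[[cluster_var]])
  # Index into ids so that a single numeric id is never expanded to 1:id
  chosen <- ids[sample.int(length(ids), n_clusters)]
  data[data[[cluster_var]] %in% chosen, , drop = FALSE]
}

set.seed(42)
plots <- data.frame(plot_id = 1:15, cluster = rep(1:5, each = 3))
cl <- cluster_sketch(plots, "cluster", n_clusters = 2)
table(cl$cluster)
```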
Returns the first n rows of a data frame as a convenience sample.
This is the simplest non-probability method; results are generally not
representative of the population.
convenience_sample(data, n)
data |
A data frame. |
n |
Integer. Number of units to select. |
Convenience sampling is fast but prone to selection bias. It may be appropriate for pilot studies or logistical constraints, but population inference requires strong assumptions. See Lohr (2022) for discussion.
A data frame containing the first n rows.
Lohr, S.L. (2022). Sampling: Design and Analysis, 3rd ed. CRC Press, Boca Raton, FL.
data(sample_nrm)
cs <- convenience_sample(sample_nrm, n = 10)
nrow(cs)
Computes a confidence interval for the population mean based on a simple random sample, using the t-distribution.
estimate_ci(y, N = NULL, conf_level = 0.95)
y |
Numeric vector. Sample observations. |
N |
Integer. Population size. Used for the finite-population
correction. Set to |
conf_level |
Numeric. Confidence level in (0, 1). Default 0.95. |
A named numeric vector with elements mean,
lower, and upper.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
estimate_ci(srs$biomass, N = nrow(sample_nrm))
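For readers who want to see the interval spelled out, here is a hedged base-R sketch of a t-based confidence interval with an optional finite-population correction. The function `ci_sketch` is hypothetical, and using `n - 1` degrees of freedom and the factor `1 - n/N` are standard textbook choices assumed here, not confirmed details of the package.

```r
# Hypothetical sketch of a t-interval for the mean under SRS.
ci_sketch <- function(y, N = NULL, conf_level = 0.95) {
  y <- y[!is.na(y)]
  n <- length(y)
  fpc <- if (is.null(N)) 1 else 1 - n / N      # finite-population correction
  se <- sqrt(fpc * var(y) / n)                 # standard error of the mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(mean  = mean(y),
    lower = mean(y) - t_crit * se,
    upper = mean(y) + t_crit * se)
}

ci <- ci_sketch(c(12, 15, 11, 18, 14), N = 50)
ci
```

Supplying `N` shrinks the interval slightly, because sampling a noticeable share of a finite population leaves less uncertainty about its mean.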
Computes the sample mean of a numeric vector, ignoring missing values.
estimate_mean(y)
y |
Numeric vector. Sample observations. |
Numeric scalar. The sample mean.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
estimate_mean(srs$biomass)
Computes the estimated standard error of the sample mean under simple random sampling without replacement (SRSWOR).
estimate_se(y, N = NULL)
y |
Numeric vector. Sample observations. |
N |
Integer. Population size. If |
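With fpc: $\widehat{SE}(\bar{y}) = \sqrt{\left(1 - \frac{n}{N}\right)\frac{s^2}{n}}$
Without fpc: $\widehat{SE}(\bar{y}) = \sqrt{\frac{s^2}{n}}$
Here $n$ is the sample size and $s^2$ is the unbiased sample variance.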
Numeric scalar. Estimated standard error.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
estimate_se(srs$biomass, N = nrow(sample_nrm))
Estimates the population total by expanding the sample mean to the full population size.
estimate_total(y, N)
y |
Numeric vector. Sample observations. |
N |
Integer. Known population size (number of units). |
Numeric scalar. Estimated population total.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed. John Wiley & Sons, New York.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
estimate_total(srs$biomass, N = nrow(sample_nrm))
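The expansion described above is commonly written as follows, with $n$ the sample size (the notation is introduced here for illustration and is not taken from the package):

```latex
\hat{Y} = N\,\bar{y} = \frac{N}{n}\sum_{i=1}^{n} y_i
```

For instance, a sample mean of 27.5 with $N = 100$ gives an estimated total of 2750.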
Computes the unbiased sample variance of a numeric vector.
estimate_variance(y)
y |
Numeric vector. Sample observations. |
Numeric scalar. The unbiased sample variance $s^2$.
data(sample_nrm)
estimate_variance(sample_nrm$biomass)
Extracts cell values from a SpatRaster at the locations of
sf point features, returning the results as a data frame.
extract_raster_values(raster, points_sf)
raster |
A |
points_sf |
An |
Requires both the terra and sf packages. The CRS of
points_sf is reprojected to match raster if needed.
A data frame with one row per point and one column per raster layer.
if (requireNamespace("terra", quietly = TRUE) &&
    requireNamespace("sf", quietly = TRUE)) {
  r <- terra::rast(nrows = 20, ncols = 20, vals = runif(400))
  bbox <- sf::st_as_sfc(sf::st_bbox(terra::ext(r)))
  pts <- spatial_random_sample(bbox, n = 10)
  sf::st_crs(pts) <- sf::st_crs(terra::crs(r))
  extract_raster_values(r, pts)
}
Provides a design-unbiased estimate of the population total for unequal-probability sampling designs.
ht_estimator(y, pi)
y |
Numeric vector. Sample values of the study variable. |
pi |
Numeric vector. First-order inclusion probabilities
(same length as |
Numeric scalar. HT estimate of the population total.
Horvitz, D.G. and Thompson, D.J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47(260), 663–685.
data(sample_nrm)
pps <- pps_sample(sample_nrm, size_var = "size", n = 20)
pi_i <- pps$.inclusion_prob
ht_estimator(pps$biomass, pi = pi_i)
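The Horvitz-Thompson total weights each sampled value by the inverse of its inclusion probability. The short base-R sketch below shows that calculation on made-up numbers; `ht_total_sketch` and the input values are hypothetical and only illustrate the formula.

```r
# Hypothetical sketch of the Horvitz-Thompson total: sum of y_i / pi_i.
ht_total_sketch <- function(y, pi_i) {
  stopifnot(length(y) == length(pi_i), all(pi_i > 0 & pi_i <= 1))
  sum(y / pi_i)        # each unit represents 1 / pi_i population units
}

y_obs     <- c(10, 20, 30)
incl_prob <- c(0.1, 0.2, 0.5)
ht_total_sketch(y_obs, incl_prob)
```

A unit with a small inclusion probability stands in for many unsampled units, which is why it receives a large weight.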
Estimates the variance of the Horvitz-Thompson total estimator using the Sen-Yates-Grundy approximation for with-replacement PPS designs.
ht_variance(y, pi)
y |
Numeric vector. Sample values of the study variable. |
pi |
Numeric vector. First-order inclusion probabilities. |
Numeric scalar. Estimated variance of the HT total.
Yates, F. and Grundy, P.M. (1953). Selection without replacement from within strata with probability proportional to size. Journal of the Royal Statistical Society B, 15, 253–261.
data(sample_nrm)
pps <- pps_sample(sample_nrm, size_var = "size", n = 20)
ht_variance(pps$biomass, pps$.inclusion_prob)
Produces a simple plot of sample point locations using base graphics.
plot_sampling(sf_points, col = "steelblue", pch = 16, main = "Sample Locations")
sf_points |
An |
col |
Character. Point colour. Default "steelblue". |
pch |
Integer. Point character. Default 16. |
main |
Character. Plot title. Default "Sample Locations". |
Requires the sf package.
NULL invisibly. Called for its side effect (a plot).
if (requireNamespace("sf", quietly = TRUE)) {
  bbox <- sf::st_as_sfc(sf::st_bbox(c(xmin = 77, xmax = 78, ymin = 30, ymax = 31),
                                    crs = sf::st_crs(4326)))
  pts <- spatial_random_sample(bbox, n = 20)
  plot_sampling(pts)
}
Produces a publication-quality map of sample point locations using ggplot2.
plot_sampling_gg(sf_points, colour = "tomato", size = 2, title = "Spatial Sample Locations")
sf_points |
An |
colour |
Character. Point colour. Default "tomato". |
size |
Numeric. Point size. Default 2. |
title |
Character. Plot title. |
Requires the sf and ggplot2 packages.
A ggplot object.
if (requireNamespace("sf", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  bbox <- sf::st_as_sfc(sf::st_bbox(c(xmin = 77, xmax = 78, ymin = 30, ymax = 31),
                                    crs = sf::st_crs(4326)))
  pts <- spatial_random_sample(bbox, n = 25)
  plot_sampling_gg(pts)
}
Returns a concise summary table of key variables from a sample, including means, standard deviations, and sample sizes.
plot_summary(df, vars = NULL)
df |
A data frame (sample or population). |
vars |
Character vector. Names of numeric columns to summarise.
If |
A data frame with columns variable, n,
mean, sd, min, and max.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
plot_summary(srs, vars = c("biomass", "soil_loss"))
Performs probability proportional to size sampling.
pps_sample(data, size_var, n)
data |
A data frame. |
size_var |
Character. Size variable. |
n |
Integer. Sample size. |
A data frame with inclusion probabilities.
Selects units from a data frame that satisfy a logical condition supplied as a character string. This is a non-probability method in which units are selected based on expert judgement or predetermined criteria.
purposive_sample(data, condition)
data |
A data frame. |
condition |
Character string. A valid R logical expression
referring to column names in |
The condition is parsed and evaluated in the context of
data using with(). Only columns present in data
may be referenced.
A data frame of rows satisfying condition.
data(sample_nrm)
# Select high-biomass forest plots
ps <- purposive_sample(sample_nrm,
                       condition = "biomass > 30 & strata == 'forest'")
nrow(ps)
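To make the string-evaluation step concrete, the sketch below parses a condition and evaluates it with the data frame as the evaluation environment, which is essentially what `with()` does. The helper `purposive_sketch` is hypothetical and is not the package's code.

```r
# Hypothetical sketch: evaluate a condition string against a data frame.
purposive_sketch <- function(data, condition) {
  expr <- parse(text = condition)
  # Columns of `data` are looked up first; other names fall back to the caller
  keep <- eval(expr, envir = data, enclos = parent.frame())
  data[which(keep), , drop = FALSE]   # which() drops rows where keep is NA
}

plots <- data.frame(
  biomass = c(12, 35, 41, 8),
  strata  = c("forest", "forest", "grassland", "forest")
)
sel <- purposive_sketch(plots, "biomass > 30 & strata == 'forest'")
sel
```

Because the string is evaluated as R code, conditions should come from trusted input only.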
Selects a fixed number of units from the top of each stratum. This non-probability method resembles stratified sampling but does not use random selection within strata.
quota_sample(data, strata_var, quota)
data |
A data frame. |
strata_var |
Character. Name of the column defining quota groups. |
quota |
Integer or named integer vector.
|
A data frame containing the selected rows from each stratum.
Lohr, S.L. (2022). Sampling: Design and Analysis, 3rd ed. CRC Press, Boca Raton, FL.
data(sample_nrm)
qs <- quota_sample(sample_nrm, strata_var = "strata", quota = 5)
table(qs$strata)
# Variable quotas
q <- c(forest = 6, agriculture = 3, grassland = 4)
qs2 <- quota_sample(sample_nrm, "strata", quota = q)
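The "top of each stratum" rule can be sketched in a few lines of base R: split by stratum, then keep the first rows of each group. The helper `quota_sketch` is hypothetical, and treating a length-one `quota` as a common quota for all strata is an assumption based on the examples above.

```r
# Hypothetical sketch: take the first `quota` rows within each stratum.
quota_sketch <- function(data, strata_var, quota) {
  groups <- split(data, data[[strata_var]])
  picked <- lapply(names(groups), function(g) {
    q <- if (length(quota) == 1) quota else quota[[g]]  # common or per-stratum
    head(groups[[g]], q)                                # no random selection
  })
  do.call(rbind, picked)
}

plots <- data.frame(id = 1:5, strata = c("a", "a", "a", "b", "b"))
qs_common <- quota_sketch(plots, "strata", quota = 2)
qs_named  <- quota_sketch(plots, "strata", quota = c(a = 1, b = 2))
```

Since the selection depends only on row order, sorting the data differently changes the sample, which is one reason quota samples do not support design-based inference.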
Selects sample points from a SpatRaster with probability
proportional to cell values (e.g., vegetation density, erosion risk).
raster_pps_sample(raster, n)
raster |
A |
n |
Integer. Number of sample points. |
Requires both the terra and sf packages.
Identical in behaviour to raster_stratified_sample;
exposed as a separate function to match PPS nomenclature.
An sf POINT object.
if (requireNamespace("terra", quietly = TRUE) &&
    requireNamespace("sf", quietly = TRUE)) {
  r <- terra::rast(nrows = 20, ncols = 20, vals = runif(400, 0, 1))
  pts <- raster_pps_sample(r, n = 10)
}
Samples a specified number of cells from a SpatRaster with
probability proportional to cell values, and returns their coordinates
as an sf point object.
raster_stratified_sample(raster, n)
raster |
A |
n |
Integer. Number of sample points. |
Requires both the terra and sf packages. Cells with
NA values are excluded from sampling.
An sf POINT object with CRS taken from raster.
if (requireNamespace("terra", quietly = TRUE) &&
    requireNamespace("sf", quietly = TRUE)) {
  r <- terra::rast(nrows = 20, ncols = 20, vals = runif(400, 1, 100))
  pts <- raster_stratified_sample(r, n = 15)
  plot(sf::st_geometry(pts))
}
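The cell-selection idea (probability proportional to cell value, with NA cells excluded) can be imitated on a plain numeric vector without any spatial packages. This is only a sketch: the vector `vals` stands in for raster cell values, and sampling without replacement is an assumption, since the documentation does not say which mode the package uses.

```r
# Hypothetical sketch: weighted selection of cell indices from cell values.
set.seed(1)
vals <- c(5, NA, 1, 10, 4)          # stand-in for raster cell values
ok   <- which(!is.na(vals))         # NA cells cannot be selected
# Higher-valued cells are more likely to be drawn
cells <- sample(ok, size = 2, prob = vals[ok])
cells
```

In the real functions the selected cells would then be converted to point coordinates carrying the raster's CRS.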
Estimates the population total or mean of a study variable $y$
using a correlated auxiliary variable $x$ with a known population
total $X$.
ratio_estimator(y, x, X_total)
y |
Numeric vector. Sample values of the study variable. |
x |
Numeric vector. Sample values of the auxiliary variable
(same length as |
X_total |
Numeric. Known population total of the auxiliary variable. |
Numeric scalar. Ratio estimate of the population total of $y$.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed. John Wiley & Sons, New York.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
X_total <- sum(sample_nrm$size)
ratio_estimator(y = srs$biomass, x = srs$size, X_total = X_total)
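A common textbook form of the ratio estimator of a total (Cochran 1977) scales the known auxiliary total by the sample ratio of sums. The sketch below assumes that form; `ratio_total_sketch` is a hypothetical name and the package may handle edge cases differently.

```r
# Hypothetical sketch of the classical ratio estimator of a total.
ratio_total_sketch <- function(y, x, X_total) {
  stopifnot(length(y) == length(x), sum(x) != 0)
  r_hat <- sum(y) / sum(x)     # sample ratio of study to auxiliary variable
  r_hat * X_total              # scale up by the known auxiliary total
}

est <- ratio_total_sketch(y = c(2, 4, 6), x = c(1, 2, 3), X_total = 100)
est
```

The estimator pays off when the study variable is roughly proportional to the auxiliary variable, as in the toy data where every value of `y` is exactly twice `x`.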
Provides a model-assisted estimate of the population mean of $y$
using a known population mean of the auxiliary variable $x$.
regression_estimator(y, x, X_mean)
y |
Numeric vector. Sample values of the study variable. |
x |
Numeric vector. Sample values of the auxiliary variable. |
X_mean |
Numeric. Known population mean of |
$\hat{\bar{Y}}_{reg} = \bar{y} + b\,(\bar{X} - \bar{x})$,
where $\bar{X}$ is X_mean and $b$ is the ordinary least-squares slope from
regressing $y$ on $x$.
Numeric scalar. Regression estimate of the population mean.
Särndal, C.-E., Swensson, B., and Wretman, J. (2003). Model Assisted Survey Sampling. Springer, New York.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 30)
X_mean <- mean(sample_nrm$size)
regression_estimator(y = srs$biomass, x = srs$size, X_mean = X_mean)
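The linear regression estimator adjusts the sample mean of the study variable by the OLS slope times the gap between the known and the sampled auxiliary means. A minimal base-R sketch, with the hypothetical name `reg_mean_sketch`, might look like this:

```r
# Hypothetical sketch of the linear regression estimator of a mean.
reg_mean_sketch <- function(y, x, X_mean) {
  b <- cov(x, y) / var(x)              # OLS slope of y on x
  mean(y) + b * (X_mean - mean(x))     # shift the sample mean along the fit
}

est_reg <- reg_mean_sketch(y = c(3, 5, 7), x = c(1, 2, 3), X_mean = 4)
est_reg
```

In the toy data the slope is 2, the sample means are 5 and 2, and the known auxiliary mean is 4, so the adjusted estimate is 5 + 2 * (4 - 2) = 9.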
A synthetic dataset of 100 field plots representing a heterogeneous natural resource management landscape with three strata and ten spatial clusters.
A data frame with 100 rows and 7 variables:
plot_id: Integer. Unique plot identifier (1–100).
biomass: Numeric. Aboveground biomass density (Mg/ha), drawn from Uniform(5, 50).
soil_loss: Numeric. Annual soil loss (Mg/ha/yr), drawn from Uniform(0.1, 10).
strata: Character. Land-use stratum: one of "forest", "agriculture", or "grassland".
cluster: Integer. Spatial cluster identifier (1–10).
size: Numeric. Plot size measure used for PPS sampling (e.g., stand basal area), drawn from Uniform(1, 100).
carbon: Numeric. Estimated carbon stock (Mg C/ha), derived as 0.47 * biomass.
Synthetic data generated in data-raw/generate_datasets.R.
data(sample_nrm)
head(sample_nrm)
table(sample_nrm$strata)
summary(sample_nrm[, c("biomass", "soil_loss")])
A synthetic dataset of 100 geo-referenced field observations within a one-degree tile (77–78 degrees E, 30–31 degrees N) representing a Himalayan watershed zone.
A data frame with 100 rows and 8 variables:
id: Integer. Observation identifier.
lon: Numeric. Longitude (decimal degrees, WGS 84).
lat: Numeric. Latitude (decimal degrees, WGS 84).
biomass: Numeric. Aboveground biomass (Mg/ha).
soil_loss: Numeric. Annual soil loss (Mg/ha/yr).
strata: Character. Land-use class: "forest" or "agriculture".
cluster: Integer. Spatial cluster (1–5).
ndvi: Numeric. Synthetic NDVI value (0–1), correlated with biomass.
Synthetic data generated in data-raw/generate_datasets.R.
data(sample_spatial)
head(sample_spatial)
if (requireNamespace("sf", quietly = TRUE)) {
  pts <- to_sf_points(sample_spatial, lon = "lon", lat = "lat")
  plot(sf::st_geometry(pts))
}
Computes the relative efficiency (RE) of two sampling designs by comparing their estimated variances of the mean. RE > 1 indicates Design 2 is more efficient than Design 1.
sampling_efficiency(y1, y2, N = NULL)
y1 |
Numeric vector. Sample from Design 1. |
y2 |
Numeric vector. Sample from Design 2. |
N |
Integer. Population size (for fpc). Set |
A named numeric vector with var_design1,
var_design2, and relative_efficiency
(var1 / var2).
data(sample_nrm)
srs1 <- srs_sample(sample_nrm, n = 20)
srs2 <- srs_sample(sample_nrm, n = 30)
sampling_efficiency(srs1$biomass, srs2$biomass, N = 100)
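The ratio `var1 / var2` can be reproduced by hand to check how the efficiency figure arises. The sketch below assumes each variance of the mean is `fpc * s^2 / n`; the function `re_sketch` is hypothetical and only mirrors the documented return names.

```r
# Hypothetical sketch: relative efficiency as a ratio of variances of the mean.
re_sketch <- function(y1, y2, N = NULL) {
  var_mean <- function(y) {
    n <- length(y)
    fpc <- if (is.null(N)) 1 else 1 - n / N   # optional fpc
    fpc * var(y) / n
  }
  v1 <- var_mean(y1)
  v2 <- var_mean(y2)
  c(var_design1 = v1, var_design2 = v2, relative_efficiency = v1 / v2)
}

re <- re_sketch(y1 = c(1, 3), y2 = c(1, 3, 1, 3))
re
```

Here the second sample has a smaller variance of the mean, so the ratio is above 1 and the second design counts as the more efficient one.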
Estimates mean and total soil loss from a set of erosion measurement plots, with a standard error.
soil_loss_estimate(df, loss_var, area)
df |
A data frame of sampled erosion plots. |
loss_var |
Character. Name of the column containing plot-level soil loss measurements (e.g., Mg/ha/year). |
area |
Numeric. Total catchment or management area. |
A named list with elements mean_loss,
total_loss, se, and n.
Wischmeier, W.H. and Smith, D.D. (1978). Predicting Rainfall Erosion Losses. USDA Agriculture Handbook 537.
data(sample_nrm)
srs <- srs_sample(sample_nrm, n = 25)
soil_loss_estimate(srs, loss_var = "soil_loss", area = 500)
Estimates total biomass over a study area from field measurements
stored in an sf point object.
spatial_biomass_estimate(sf_data, biomass_var, area)
sf_data |
An |
biomass_var |
Character. Name of the biomass density column. |
area |
Numeric. Total study area (in consistent units). |
Requires the sf package.
A named list with mean_biomass, total_biomass,
se, and n.
if (requireNamespace("sf", quietly = TRUE)) {
  data(sample_spatial)
  pts_sf <- to_sf_points(sample_spatial, lon = "lon", lat = "lat")
  spatial_biomass_estimate(pts_sf, biomass_var = "biomass", area = 1000)
}
Randomly selects a number of spatial clusters (e.g., sub-watersheds, administrative units) and returns all features within selected clusters.
spatial_cluster_sample(sf_data, cluster_var, n_clusters)
sf_data |
An |
cluster_var |
Character. Name of the column identifying clusters. |
n_clusters |
Integer. Number of clusters to select. |
Requires the sf package.
An sf object containing features from the selected clusters.
if (requireNamespace("sf", quietly = TRUE)) {
  data(sample_spatial)
  pts_sf <- to_sf_points(sample_spatial, lon = "lon", lat = "lat")
  cl_sp <- spatial_cluster_sample(pts_sf, "cluster", n_clusters = 3)
  length(unique(cl_sp$cluster))
}
Draws a random sample of points uniformly distributed within an
sf polygon or multipolygon geometry.
spatial_random_sample(polygon, n)
polygon |
An |
n |
Integer. Number of random points to generate. |
Requires the sf package. Points are drawn using
sf::st_sample(..., type = "random").
An sf object of POINT geometries within polygon.
if (requireNamespace("sf", quietly = TRUE)) {
  # Create a simple rectangular polygon for illustration
  bbox <- sf::st_as_sfc(sf::st_bbox(c(xmin = 77, xmax = 78, ymin = 30, ymax = 31),
                                    crs = sf::st_crs(4326)))
  pts <- spatial_random_sample(bbox, n = 20)
  plot(sf::st_geometry(pts))
}
Selects a random sample of n_per_stratum features from each
stratum of an sf object.
spatial_stratified_sample(sf_data, strata_var, n_per_stratum)
sf_data |
An |
strata_var |
Character. Name of the column defining strata. |
n_per_stratum |
Integer or named integer vector. Sample size per
stratum (see |
Requires the sf package.
An sf object with the selected features.
if (requireNamespace("sf", quietly = TRUE)) {
  data(sample_spatial)
  pts_sf <- to_sf_points(sample_spatial, lon = "lon", lat = "lat")
  st_sp <- spatial_stratified_sample(pts_sf, "strata", n_per_stratum = 5)
  table(st_sp$strata)
}
Generates a systematic grid of points at a specified spacing within an
sf polygon, retaining only points that fall inside the boundary.
spatial_systematic_sample(polygon, spacing)
polygon |
An |
spacing |
Numeric. Grid cell size in the units of the CRS (degrees for EPSG:4326, metres for projected CRS). |
Requires the sf package.
An sf object of POINT geometries inside polygon.
if (requireNamespace("sf", quietly = TRUE)) {
  bbox <- sf::st_as_sfc(sf::st_bbox(c(xmin = 77, xmax = 78, ymin = 30, ymax = 31),
                                    crs = sf::st_crs(4326)))
  grid_pts <- spatial_systematic_sample(bbox, spacing = 0.1)
  plot(sf::st_geometry(grid_pts))
}
Draws a simple random sample from a data frame.
srs_sample(data, n, replace = FALSE)
data |
A data frame representing the population. |
n |
Integer. Number of units to sample. |
replace |
Logical. Sample with replacement? Default FALSE. |
A data frame with sampled rows and a ".sample_id" column.
Estimates the population mean from a stratified sample using stratum weights (proportional to stratum size).
stratified_estimator(y, strata, N_h)
y |
Numeric vector. Sample values of the study variable. |
strata |
Character or factor vector. Stratum labels (same length
as |
N_h |
Named numeric vector. Population stratum sizes; names must
match unique values of |
Numeric scalar. Estimated population mean.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed. John Wiley & Sons, New York.
data(sample_nrm)
st <- stratified_sample(sample_nrm, "strata", n_per_stratum = 5)
N_h <- table(sample_nrm$strata)
stratified_estimator(st$biomass, st$strata, N_h)
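The stratified mean is a weighted average of stratum sample means, with weights given by each stratum's share of the population. The base-R sketch below spells this out; `strat_mean_sketch` is a hypothetical helper and the toy numbers are invented.

```r
# Hypothetical sketch of the stratified estimator of a population mean.
strat_mean_sketch <- function(y, strata, N_h) {
  ybar_h <- tapply(y, strata, mean)        # sample mean within each stratum
  W_h <- N_h / sum(N_h)                    # stratum weights N_h / N
  stopifnot(all(names(ybar_h) %in% names(W_h)))
  sum(W_h[names(ybar_h)] * ybar_h)         # align weights to means by name
}

y_s    <- c(10, 20, 100, 200)
lab    <- c("a", "a", "b", "b")
N_pop  <- c(a = 30, b = 70)
est_st <- strat_mean_sketch(y_s, lab, N_pop)
est_st
```

With stratum means of 15 and 150 and weights of 0.3 and 0.7, the estimate is 0.3 * 15 + 0.7 * 150 = 109.5, whereas the unweighted sample mean would be 82.5.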
Performs stratified random sampling.
stratified_sample(data, strata_var, n_per_stratum, replace = FALSE)
data |
A data frame. |
strata_var |
Character. Column defining strata. |
n_per_stratum |
Integer or named vector. |
replace |
Logical. Sample with replacement? |
A data frame with sampled rows.
Performs systematic sampling using interval k.
systematic_sample(data, k)
data |
A data frame. |
k |
Integer. Sampling interval. |
A data frame of sampled rows.
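One common way to implement interval-based selection is to pick a random start between 1 and k and then take every k-th row. The sketch below follows that convention; whether the package uses a random start is not stated in this documentation, so treat that detail and the name `systematic_sketch` as assumptions.

```r
# Hypothetical sketch: 1-in-k systematic sample with a random start.
systematic_sketch <- function(data, k) {
  stopifnot(k >= 1, k <= nrow(data))
  start <- sample.int(k, 1)                       # random start in 1..k
  data[seq(start, nrow(data), by = k), , drop = FALSE]
}

set.seed(7)
plots <- data.frame(plot_id = 1:20)
sys <- systematic_sketch(plots, k = 5)
sys$plot_id
```

With 20 rows and k = 5 the sample always holds 4 rows, and consecutive selected rows are exactly 5 apart.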
Creates an sf simple-features point object from longitude and
latitude columns in a data frame.
to_sf_points(data, lon, lat, crs = 4326)
data |
A data frame containing coordinate columns. |
lon |
Character. Name of the longitude column. |
lat |
Character. Name of the latitude column. |
crs |
Integer or character. Coordinate reference system as an EPSG
code or PROJ string. Default 4326. |
Requires the sf package.
An sf object with a POINT geometry column.
if (requireNamespace("sf", quietly = TRUE)) {
  data(sample_spatial)
  pts <- to_sf_points(sample_spatial, lon = "lon", lat = "lat")
  print(pts)
}