| Title: | Convert Camera Trap Dataset to 'Camtrap DP' |
|---|---|
| Description: | Builds Camera Trap Data Packages ('Camtrap DP') from arbitrary spreadsheets in a schema-driven way: table structure, types, constraints and relations are read from the 'Frictionless' table schemas of the requested 'Camtrap DP' version, so any version and custom columns are handled automatically. Provides validation against the schemas and an optional bridge to the 'frictionless' 'Python' framework. The 'Camtrap DP' standard is described in Bubnicki et al. (2023) <doi:10.1002/rse2.374>. |
| Authors: | Kana Terayama [aut, cre] (@KanaTerayama), Keita Fukasawa [aut] (@kfukasawa37) |
| Maintainer: | Kana Terayama <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 2.0.0 |
| Built: | 2026-06-29 14:55:48 UTC |
| Source: | https://github.com/cran/R2camtrapdp |
Example deployment notebook for an acoustic (audio) survey, used in the acoustic vignette. One row per device deployment. Coordinates are random points within the inland Izu Peninsula (Shizuoka, Japan).
Adep
A data frame with 2 rows and 14 variables:
Unique identifier of the deployment.
Longitude in decimal degrees (WGS84).
Latitude in decimal degrees (WGS84).
Identifier of the deployment location.
Deployment start date.
Deployment start time.
Deployment end date.
Deployment end time.
Identifier of the recording device.
Manufacturer and model of the recording device.
Sampling frequency of the recordings (Hz).
Bit depth of the recordings.
Number of audio channels.
Name or identifier of the person/organization that deployed the device.
Example observation notebook for an acoustic (audio) survey, used in the
acoustic vignette. One row per observation; the filename column is the
audio file from which the media table is derived.
Aobs
A data frame with 6 rows and 19 variables:
Institution code.
Collection code.
Observation identifier (within an event).
Identifier of the event the observation belongs to.
Identifier of the deployment.
Identifier of the deployment location.
Date the recording was made.
Time the recording started.
Name of the audio file (used to build media).
Duration of the recording file (seconds).
Recorded object category (animal, none, ...).
Taxonomic class of the observed organism.
Genus of the observed organism.
Species epithet of the observed organism.
Number of observed individuals; NA here (not
counted from audio).
Lower bound of the call frequency (Hz); approximate values from the bioacoustics literature for each species.
Upper bound of the call frequency (Hz); approximate values from the bioacoustics literature for each species.
Date and time at which the event started.
Date and time at which the event ended.
create_deployments creates a table of deployments.
create_deployments( deploymentID, latitude, longitude, deploymentStart = NULL, deploymentStart_date = NULL, deploymentStart_time = NULL, deploymentEnd = NULL, deploymentEnd_date = NULL, deploymentEnd_time = NULL, locationID = NULL, locationName = NULL, coordinateUncertainty = NULL, setupBy = NULL, cameraID = NULL, cameraModel = NULL, cameraDelay = NULL, cameraHeight = NULL, cameraDepth = NULL, cameraTilt = NULL, cameraHeading = NULL, detectionDistance = NULL, timestampIssues = NULL, baitUse = NULL, featureType = NULL, habitat = NULL, deploymentGroups = NULL, deploymentTags = NULL, deploymentComments = NULL, tz = "Asia/Tokyo" )
deploymentID |
Unique identifier of the deployment |
latitude |
Latitude of the deployment location in decimal degrees, using the WGS84 datum |
longitude |
Longitude of the deployment location in decimal degrees, using the WGS84 datum |
deploymentStart |
Date and time at which the deployment was started |
deploymentStart_date |
Date at which the deployment was started |
deploymentStart_time |
Time at which the deployment was started |
deploymentEnd |
Date and time at which the deployment was ended |
deploymentEnd_date |
Date at which the deployment was ended |
deploymentEnd_time |
Time at which the deployment was ended |
locationID |
Identifier of the deployment location |
locationName |
Name given to the deployment location |
coordinateUncertainty |
Horizontal distance from the given latitude and longitude describing the smallest circle containing the deployment location |
setupBy |
Name or identifier of the person or organization that deployed the camera |
cameraID |
Identifier of the camera used for the deployment |
cameraModel |
Manufacturer and model of the camera |
cameraDelay |
Predefined duration after detection when further activity is ignored |
cameraHeight |
Height at which the camera was deployed |
cameraDepth |
Depth at which the camera was deployed |
cameraTilt |
Angle at which the camera was deployed in the vertical plane |
cameraHeading |
Angle at which the camera was deployed in the horizontal plane |
detectionDistance |
Maximum distance at which the camera can reliably detect activity |
timestampIssues |
true if timestamps in the media resource for the deployment are known to have unsolvable issues |
baitUse |
true if bait was used for the deployment |
featureType |
Type of the feature associated with the deployment |
habitat |
Short characterization of the habitat at the deployment location |
deploymentGroups |
Deployment groups associated with the deployment |
deploymentTags |
Tags associated with the deployment |
deploymentComments |
Comments or notes about the deployment |
tz |
Deployment time zone |
A tibble of deployments.
create_deployments( deploymentID = c("A01", "A02"), latitude = c(35.1, 36.2), longitude = c(139.5, 140.1), deploymentStart_date = c("2023-04-01", "2023-04-02"), deploymentStart_time = c("09:00:00", "10:30:00"), deploymentEnd_date = c("2023-05-01", "2023-05-02"), deploymentEnd_time = c("09:00:00", "10:30:00"))
create_media creates a table of media.
create_media( mediaID, deploymentID, timestamp = NULL, timestamp_date = NULL, timestamp_time = NULL, filePath, filePublic, fileMediatype, fileName = NULL, captureMethod = NULL, exifData = NULL, favorite = NULL, mediaComments = NULL, tz = "Asia/Tokyo", omitduplicate = TRUE )
mediaID |
Unique identifier of the media file |
deploymentID |
Identifier of the deployment the media file belongs to |
timestamp |
Date and time at which the media file was recorded |
timestamp_date |
Date at which the media file was recorded |
timestamp_time |
Time at which the media file was recorded |
filePath |
URL or relative path to the media file, respectively for externally hosted files or files that are part of the package |
filePublic |
false if the media file is not publicly accessible |
fileMediatype |
Mediatype of the media file. Expressed as an IANA Media Type |
fileName |
Name of the media file |
captureMethod |
Method used to capture the media file |
exifData |
EXIF data of the media file |
favorite |
true if the media file is deemed of interest |
mediaComments |
Comments or notes about the media file |
tz |
Time zone in which the media file was recorded |
omitduplicate |
true if duplicate records should be excluded |
A tibble of media.
create_media( mediaID = "m1", deploymentID = "A01", timestamp_date = "2023-04-01", timestamp_time = "09:05:00", filePath = "img/m1.jpg", filePublic = TRUE, fileMediatype = "image/jpeg")
create_observations creates a table of observations.
create_observations( observationID, deploymentID, mediaID = NULL, eventID = NULL, eventStart = NULL, eventStart_date = NULL, eventStart_time = NULL, eventEnd = NULL, eventEnd_date = NULL, eventEnd_time = NULL, observationLevel, observationType, cameraSetupType = NULL, scientificName = NULL, count = NULL, lifeStage = NULL, sex = NULL, behavior = NULL, individualID = NULL, individualPositionRadius = NULL, individualPositionAngle = NULL, individualSpeed = NULL, bboxX = NULL, bboxY = NULL, bboxWidth = NULL, bboxHeight = NULL, classificationMethod = NULL, classifiedBy = NULL, classificationTimestamp = NULL, classificationProbability = NULL, observationTags = NULL, observationComments = NULL, tz = "Asia/Tokyo", omitduplicate = TRUE )
observationID |
Unique identifier of the observation |
deploymentID |
Identifier of the deployment the observation belongs to |
mediaID |
Identifier of the media file that was classified |
eventID |
Identifier of the event the observation belongs to |
eventStart |
Date and time at which the event started |
eventStart_date |
Date at which the event started |
eventStart_time |
Time at which the event started |
eventEnd |
Date and time at which the event ended |
eventEnd_date |
Date at which the event ended |
eventEnd_time |
Time at which the event ended |
observationLevel |
Level at which the observation was classified |
observationType |
Type of the observation |
cameraSetupType |
Type of the camera setup action associated with the observation |
scientificName |
Scientific name of the observed individual |
count |
Number of observed individuals |
lifeStage |
Age class or life stage of the observed individual |
sex |
Sex of the observed individual |
behavior |
Dominant behavior of the observed individual |
individualID |
Identifier of the observed individual |
individualPositionRadius |
Distance from the camera to the observed individual identified by individualID |
individualPositionAngle |
Angular distance from the camera view centerline to the observed individual identified by individualID |
individualSpeed |
Average movement speed of the observed individual identified by individualID |
bboxX |
Horizontal position of the top-left corner of a bounding box |
bboxY |
Vertical position of the top-left corner of a bounding box |
bboxWidth |
Width of a bounding box |
bboxHeight |
Height of the bounding box |
classificationMethod |
Method used to classify the observation |
classifiedBy |
Name or identifier of the person or AI algorithm that classified the observation |
classificationTimestamp |
Date and time of the classification |
classificationProbability |
Degree of certainty of the classification |
observationTags |
Tags associated with the observation |
observationComments |
Comments or notes about the observation |
tz |
Time zone of observation |
omitduplicate |
true if duplicate records should be excluded |
A tibble of observations.
create_observations( observationID = "o1", deploymentID = "A01", mediaID = "m1", eventStart = "2023-04-01 09:05:00", eventEnd = "2023-04-01 09:05:00", observationLevel = "media", observationType = "animal", scientificName = "Vulpes vulpes", count = 1L)
Renames the columns of df from the source (spreadsheet) names to Camtrap DP
field names. Columns of df that are not mentioned in mapping are kept
unchanged, so columns that already use Camtrap DP field names pass through.
ctdp_apply_mapping(df, mapping, drop_unmapped = FALSE)
df |
A data.frame / tibble of input data. |
mapping |
The mapping from source column names to target field names. Accepted forms:
|
drop_unmapped |
If |
A tibble with renamed columns.
raw <- data.frame(station = c("A01", "A02"), lat = c(35.1, 36.2))
ctdp_apply_mapping(raw, c(station = "deploymentID", lat = "latitude"))
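The renaming idea can be pictured in base R as follows. This is only a minimal sketch, not the package's implementation: `rename_by_mapping()` is a hypothetical helper, the mapping is assumed to be a named character vector of the form `c(source = "target")`, and the effect shown for `drop_unmapped` is an assumption about what that argument means.

```r
# Hypothetical helper (not part of the package): rename columns using a
# named character vector c(source_name = "targetName").
rename_by_mapping <- function(df, mapping, drop_unmapped = FALSE) {
  hit <- names(df) %in% names(mapping)
  # Look up the target name for every mapped source column.
  names(df)[hit] <- unname(mapping[names(df)[hit]])
  # Assumed behaviour: optionally keep only the columns named in the mapping.
  if (drop_unmapped) df <- df[, hit, drop = FALSE]
  df
}

raw <- data.frame(station = c("A01", "A02"),
                  lat = c(35.1, 36.2),
                  note = c("x", "y"))
renamed <- rename_by_mapping(raw, c(station = "deploymentID", lat = "latitude"))
names(renamed)
```

Columns that are not named in the mapping (here `note`) pass through unchanged, which mirrors the pass-through behaviour described for `ctdp_apply_mapping()`.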
Row-bind several issue tables
ctdp_bind_issues(...)
... |
Issue tables (tibbles), or |
A single combined issue tibble.
a <- ctdp_issues(source = "media", constraint = "required", severity = "error", message = "missing mediaID")
ctdp_bind_issues(a, ctdp_no_issues())
Build and validate a table against a Table Schema
ctdp_build_table( schema, data, mapping = NULL, datetime_merges = NULL, tz = "Asia/Tokyo", source = NULL, coerce = TRUE, stop_on_error = FALSE )
schema |
A TableSchema object. |
data |
A data.frame / tibble of input data. |
mapping |
Optional column mapping; see |
datetime_merges |
Optional list of date/time merge specs. Each element
is a list with |
tz |
Time zone used for temporal coercion / merging. |
source |
Issue |
coerce |
If |
stop_on_error |
If |
A list with elements data (the coerced tibble) and issues (an
issue table from ctdp_issues()).
sch <- list(name = "deployments", fields = list( list(name = "deploymentID", type = "string", constraints = list(required = TRUE)), list(name = "latitude", type = "number")), primaryKey = "deploymentID")
schema <- TableSchema$new("deployments", json = sch)
built <- ctdp_build_table(schema, data.frame(deploymentID = "A01", latitude = 35.1))
built$data
Verifies the schema structure that Frictionless requires of any Table Schema,
independently of the data: non-empty fields, each field with a name and a
supported type, constraints that are valid for the field's type, unique
field names, and primary/foreign keys that reference defined fields.
ctdp_check_schema(x)
x |
A TableSchema or a parsed table-schema list. |
An issue table (see ctdp_issues()). Constraint codes:
schema-structure, schema-type, schema-constraint, schema-key.
sch <- list(name = "deployments", fields = list( list(name = "deploymentID", type = "string", constraints = list(required = TRUE)), list(name = "latitude", type = "number", constraints = list(minimum = -90, maximum = 90))), primaryKey = "deploymentID")
ctdp_check_schema(sch)
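To make the kind of structural check concrete, here is a simplified base-R sketch covering three of the rules listed in the description (non-empty fields, unique field names, primary keys that reference defined fields). `check_schema_sketch()` is a hypothetical helper that returns plain messages; it is not how `ctdp_check_schema()` is implemented and it does not produce an issue table.

```r
# Hypothetical, simplified structural checks on a parsed table-schema list.
check_schema_sketch <- function(sch) {
  problems <- character(0)
  field_names <- vapply(sch$fields, function(f) f$name, character(1))

  # Rule: the schema must define at least one field.
  if (length(field_names) == 0) {
    problems <- c(problems, "schema has no fields")
  }
  # Rule: field names must be unique.
  if (anyDuplicated(field_names) > 0) {
    problems <- c(problems, "duplicate field names")
  }
  # Rule: every primary-key component must be a defined field.
  missing_keys <- setdiff(unlist(sch$primaryKey), field_names)
  if (length(missing_keys) > 0) {
    problems <- c(problems,
                  paste0("primaryKey references undefined field: ", missing_keys))
  }
  problems
}

good <- list(name = "deployments",
             fields = list(list(name = "deploymentID", type = "string"),
                           list(name = "latitude", type = "number")),
             primaryKey = "deploymentID")
bad <- good
bad$primaryKey <- "stationID"   # not among the defined fields

check_schema_sketch(good)
check_schema_sketch(bad)
```

The point of checking the schema independently of the data is that these problems can be reported before any table is built.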
Did a validation pass (no errors)?
ctdp_is_valid(issues)
issues |
An issue table. |
TRUE if there are no "error" severity rows.
ctdp_is_valid(ctdp_no_issues()) # TRUE
bad <- ctdp_issues(source = "deployments", constraint = "required", severity = "error", message = "missing")
ctdp_is_valid(bad) # FALSE
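The rule stated above (valid means no rows with "error" severity) can be sketched in base R on a plain data frame. `has_no_errors()` is a hypothetical stand-in, and the only assumption about the issue table is that it has a `severity` column; the "warning" level is taken from the errors / warnings summary described for `ctdp_summarize_validation()`.

```r
# Assumption: an issue table is a data frame with (at least) a `severity` column.
has_no_errors <- function(issues) {
  !any(issues$severity == "error", na.rm = TRUE)
}

issues <- data.frame(
  source   = c("deployments", "media"),
  severity = c("warning", "error"),
  message  = c("unusual value", "missing mediaID")
)

with_error    <- has_no_errors(issues)                               # one error row present
warnings_only <- has_no_errors(issues[issues$severity != "error", ]) # only a warning remains
empty_table   <- has_no_errors(issues[0, ])                          # no rows at all
```

Under this rule, warnings alone do not make a validation fail, and an empty table counts as valid.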
Constructs a tibble of validation issues. All arguments are recycled to a
common length. Use ctdp_no_issues() for an empty table with the right
columns.
ctdp_issues( source, location_type = NA_character_, field = NA_character_, row = NA_integer_, constraint = NA_character_, severity = "error", message = NA_character_, engine = "R", value = NA_character_ )
source |
Character. Where the issue lives, e.g. |
location_type |
Character. One of |
field |
Character. Column / field name, or |
row |
Integer. 1-based data row number, or |
constraint |
Character. The violated rule, e.g. |
severity |
Character. |
message |
Character. Human-readable description. |
engine |
Character. |
value |
Character. The offending value(s), when known (e.g. the failing cell value, or the value resolved from the descriptor for a metadata error). |
A tibble with one row per issue.
ctdp_issues(source = "deployments", field = "latitude", constraint = "required", severity = "error", message = "latitude is missing")
Generalises the original *_date / *_time merging behaviour. Produces a
character column formatted in the Camtrap DP datetime format
(%Y-%m-%dT%H:%M:%S%z) so it round-trips cleanly to CSV.
ctdp_merge_datetime( df, date_col, time_col, target, tz = "Asia/Tokyo", format = .ctdp_datetime_format(), remove = TRUE )
df |
A data.frame / tibble. |
date_col |
Name of the date column. |
time_col |
Name of the time column. |
target |
Name of the datetime column to create. |
tz |
Time zone used to interpret the local date/time. |
format |
Output datetime format. |
remove |
If |
A tibble with the target datetime column added.
raw <- data.frame(d = c("2023-04-01", "2023-04-02"), t = c("09:00:00", "10:30:00"))
ctdp_merge_datetime(raw, "d", "t", "deploymentStart")
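For readers who want to see what the merge amounts to, the following base-R sketch combines a date and a time column into the `%Y-%m-%dT%H:%M:%S%z` format. `merge_datetime_sketch()` is a hypothetical helper, not the package's code; it assumes ISO-style input strings and assumes that `remove` means dropping the source columns.

```r
# Hypothetical helper (not the package's code): combine a date and a time
# column into one character column in the %Y-%m-%dT%H:%M:%S%z format.
merge_datetime_sketch <- function(df, date_col, time_col, target,
                                  tz = "Asia/Tokyo", remove = TRUE) {
  # Interpret "date time" as local wall-clock time in `tz`.
  stamp <- as.POSIXct(paste(df[[date_col]], df[[time_col]]),
                      format = "%Y-%m-%d %H:%M:%S", tz = tz)
  # Format back to text, including the UTC offset.
  df[[target]] <- format(stamp, "%Y-%m-%dT%H:%M:%S%z", tz = tz)
  # Assumed meaning of `remove`: drop the source columns after merging.
  if (remove) df[c(date_col, time_col)] <- NULL
  df
}

raw <- data.frame(d = c("2023-04-01", "2023-04-02"),
                  t = c("09:00:00", "10:30:00"))
merged <- merge_datetime_sketch(raw, "d", "t", "deploymentStart")
merged$deploymentStart
```

Storing the result as text rather than as a date-time object is what lets the column round-trip to CSV without losing the offset.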
An empty issue table
ctdp_no_issues()
A 0-row issue tibble.
ctdp_no_issues()
Accepts either a parsed list (from jsonlite::fromJSON) or a JSON string, supporting both Frictionless v4 and v5 report layouts.
ctdp_parse_frictionless(report, resource_paths = NULL, descriptor = NULL)
report |
A Frictionless report as a list or JSON string. |
resource_paths |
Optional named character vector mapping resource names
to file paths (e.g. |
descriptor |
Optional parsed |
An issue table (includes a value column with the offending value(s)
when known).
# A valid Frictionless report parses to an empty issue table:
ctdp_parse_frictionless(list(valid = TRUE, tasks = list()))
## Not run:
# Typically the report comes from validate_frictionless() / the Python validator:
issues <- ctdp_validate_frictionless("path/to/package")
## End(Not run)
List the external (URL) references declared by a Table Schema
ctdp_schema_references(x)
x |
A TableSchema, or a parsed table-schema list
( |
A tibble with columns resource, field (NA for schema-level),
key (the JSON key carrying the URL), category and url. Categories:
"semantic-mapping" (skos:*), "description-reference", "example",
"schema-ref" (the table schema's own URL).
sch <- list(name = "deployments", fields = list( list(name = "captureMethod", type = "string", "skos:exactMatch" = "http://rs.tdwg.org/dwc/terms/samplingProtocol")))
ctdp_schema_references(sch)
Reports fields that carry a semantic mapping (skos:*) or a reference URL in
their description but have no enforceable enum or pattern
constraint. For these fields Frictionless (and this package) can check the
type/format but not the controlled vocabulary, so the values should be
checked against the referenced authority manually.
ctdp_semantic_only_fields(x)
x |
A TableSchema or a parsed table-schema list. |
A tibble: resource, field, type, reason, urls.
sch <- list(name = "deployments", fields = list( list(name = "captureMethod", type = "string", "skos:exactMatch" = "http://rs.tdwg.org/dwc/terms/samplingProtocol")))
ctdp_semantic_only_fields(sch)
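The selection rule can be approximated in a few lines of base R. This sketch covers only the `skos:*` half of the rule (it does not scan descriptions for URLs), and `semantic_only_sketch()` is a hypothetical helper that returns field names rather than the tibble described above. The `enum` values in the second field are made up for illustration.

```r
# Hypothetical sketch: fields that carry a skos:* mapping but have no
# enforceable `enum` or `pattern` constraint.
semantic_only_sketch <- function(sch) {
  keep <- vapply(sch$fields, function(f) {
    has_skos    <- any(startsWith(names(f), "skos:"))
    enforceable <- !is.null(f$constraints$enum) ||
                   !is.null(f$constraints$pattern)
    has_skos && !enforceable
  }, logical(1))
  vapply(sch$fields[keep], function(f) f$name, character(1))
}

sch <- list(name = "deployments", fields = list(
  # Semantic mapping only: values cannot be checked mechanically.
  list(name = "captureMethod", type = "string",
       "skos:exactMatch" = "http://rs.tdwg.org/dwc/terms/samplingProtocol"),
  # Semantic mapping plus an enum (illustrative values): enforceable.
  list(name = "featureType", type = "string",
       "skos:broadMatch" = "http://example.org/term",
       constraints = list(enum = c("roadPaved", "trailHiking")))
))

semantic_only_sketch(sch)
```

Fields returned by such a check are the ones whose values need to be compared against the referenced authority by hand.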
Prints a grouped summary (errors / warnings per source) followed by a detailed listing showing the offending file, field, row and message.
ctdp_summarize_validation(issues, max_detail = 50L)
issues |
An issue table from |
max_detail |
Maximum number of detail rows to print per source. |
The issue table, invisibly.
issues <- ctdp_issues(source = "deployments", field = "latitude", constraint = "minimum", severity = "error", message = "latitude below -90", value = "-100")
ctdp_summarize_validation(issues)
Validates a data package that already exists on disk (e.g. created by another
tool or in a previous run) without writing or overwriting the
datapackage.json or the CSV files. This is the validate-only counterpart of
the R6_CamtrapDP$validate_frictionless() method, which (by default) rewrites
the package from the R object before validating.
ctdp_validate_frictionless( directory, python = "python", script = NULL, patch_profile = TRUE, summarize = TRUE )
directory |
Directory containing |
python |
Path to the Python interpreter (with |
script |
Path to |
patch_profile |
If |
summarize |
Whether to print a summary. |
An issue table (see ctdp_issues()); engine "frictionless".
## Not run:
# Validate an existing Camtrap DP on disk (needs Python with 'frictionless'):
issues <- ctdp_validate_frictionless("path/to/camtrapdp", python = "python")
ctdp_is_valid(issues)
## End(Not run)
ctdp_build_table() is the generic, schema-driven path from an
arbitrary input spreadsheet to a Camtrap DP table. It (1) applies an
optional column mapping, (2) optionally merges separate date/time columns
into datetime columns, (3) coerces columns to the schema types, and
(4) validates the result against the schema constraints. It works for any
Camtrap DP version and for custom columns, because everything is read from
the supplied TableSchema.
The R-side validation in this package normally trusts that the
supplied table schema is a well-formed Frictionless Table Schema. These
helpers add an R-side pre-check of that assumption – catching, before the
Python Frictionless step, the kinds of structural problems Frictionless
itself rejects (unsupported field type, a constraint that is not valid
for a field's type, primary/foreign keys that reference undefined fields,
etc.). The authoritative check remains R6_CamtrapDP's
validate_frictionless().
Helpers to run the Python frictionless validator on a written
data package and parse the report. ctdp_validate_frictionless() validates
an existing data package directory without writing or overwriting
anything (use this when you only want to validate a package that was created
elsewhere). The R6_CamtrapDP method validate_frictionless() reuses the
same internals but writes the package from the object first (its write
argument).
Map an arbitrary input spreadsheet onto Camtrap DP field names, and combine separate date / time columns into a single datetime column.
Camtrap DP schemas specify some information not through
machine-enforceable Frictionless constraints, but through URLs:
semantic mappings (skos:exactMatch / skos:broadMatch /
skos:narrowMatch to Darwin Core, Audubon Core, Dublin Core, ... terms),
reference URLs embedded in field descriptions (e.g. the IANA media-type
registry for fileMediatype, or method DOIs for individualSpeed), and
the resource schema / package profile URLs themselves.
These functions surface every such URL so that, when you adopt a new version or a new schema flavor, you do not overlook a specification that is expressed only by reference.
A uniform structure for validation issues, whether they are
produced by the R-side schema/relation checks or parsed from a Python
Frictionless validation report. An "issue table" is a tibble::tibble
with the columns described in ctdp_issues().
A pre-built Camtrap DP object (class camtrapdp, bioacoustics flavor) created
from Adep and Aobs, for trying out the acoustic workflow without
rebuilding from scratch. media is derived from the audio file names.
datapackageAdata
A camtrapdp object (a list with the package metadata and the
deployments / media / observations tables under $data).
datapackageVdata, datapackageIdata
A pre-built Camtrap DP object (class camtrapdp) created from Idep and
Iobs, for trying out the package without rebuilding from scratch.
datapackageIdata
A camtrapdp object (a list with the package metadata and the
deployments / media / observations tables under $data).
A pre-built Camtrap DP object (class camtrapdp) created from Vdep and
Vobs, for trying out the package without rebuilding from scratch.
datapackageVdata
A camtrapdp object (a list with the package metadata and the
deployments / media / observations tables under $data).
The original data are available from doi:10.34462/0002000233.
A small example deployment table used in the package vignettes and examples. One row per camera deployment.
Idep
A data frame with 10 rows and 14 variables:
Unique identifier of the deployment.
Longitude in decimal degrees (WGS84).
Latitude in decimal degrees (WGS84).
Identifier of the deployment location.
Deployment start date.
Deployment start time.
Deployment end date.
Deployment end time.
Identifier of the camera.
Manufacturer and model of the camera.
Predefined duration after detection during which further activity is ignored (camera delay).
Height at which the camera was deployed.
Whether bait was used for the deployment.
Name or identifier of the person/organization that deployed the camera.
A small example observation table used in the package vignettes and examples. One row per observation.
Iobs
A data frame with 388 rows and 17 variables:
Institution code.
Collection code.
Observation identifier (within an event).
Identifier of the event the observation belongs to.
Identifier of the deployment location.
Date the media file was recorded.
Time the media file was recorded.
Recorded object category (raw label).
Taxonomic class of the observed organism.
Genus of the observed organism.
Species epithet of the observed organism.
Number of observed individuals.
Identifier of the SD card.
Name of the media file.
Identifier of the deployment the observation belongs to.
Date and time at which the event started.
Date and time at which the event ended.
Frictionless validates the package data against the table schemas, but it
validates the metadata (the datapackage.json descriptor itself) against
the package profile – a JSON Schema (e.g. camtrap-dp-profile.json).
MetadataProfile reads that profile and extracts the machine-readable
metadata requirements: the required top-level properties
(contributors, project, spatial, temporal, taxonomic, ...), their
types / minItems, and the nested required keys of object properties
(e.g. project requires title, samplingDesign, ...).
This lets the package check / scaffold the required metadata structure on the R side, instead of only discovering metadata problems when Python Frictionless runs.
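To illustrate what "machine-readable metadata requirements" means here, the sketch below reads the same kinds of keys from a tiny, made-up profile fragment. The fragment is hypothetical and much smaller than the real `camtrap-dp-profile.json`; only the property names and the nested `project` keys are taken from the description above, while the `minItems` value is an assumption.

```r
# Hypothetical, heavily reduced profile fragment (a parsed JSON Schema).
profile <- list(
  required = c("contributors", "project"),
  properties = list(
    contributors = list(type = "array", minItems = 1),   # minItems is assumed
    project      = list(type = "object",
                        required = c("title", "samplingDesign"))
  )
)

# Required top-level properties.
top_required <- profile$required

# Type and minItems of one property.
contrib_type      <- profile$properties$contributors$type
contrib_min_items <- profile$properties$contributors$minItems

# Nested required keys of an object property.
project_required <- profile$properties$project$required

# Which required properties are still missing from a draft descriptor?
draft   <- list(contributors = list(list(title = "A. Person")))
missing <- setdiff(top_required, names(draft))
missing
```

Checks of this kind are what allow metadata gaps to be reported on the R side before the Python validator runs.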
version |
Camtrap DP version. |
url |
Resolved profile URL (if loaded from a URL). |
raw |
The raw parsed profile (list). |
required |
Character vector of required top-level property names. |
properties |
Named list of top-level property definitions. |
MetadataProfile$new()
Create a MetadataProfile.
MetadataProfile$new( version = "1.0.1", url = NULL, url_template = NULL, local_path = NULL, json = NULL, cache_dir = file.path(tempdir(), "camtrapdp-schemas"), use_cache = TRUE )
version |
Camtrap DP version string. |
url |
A fully-resolved profile URL (takes precedence over template). |
url_template |
A URL containing <version>. |
local_path |
Path to a local profile JSON file. |
json |
A pre-parsed profile list. |
cache_dir |
Directory used to cache downloaded profiles. |
use_cache |
Reuse / store a cached copy. |
MetadataProfile$property()
Definition of a top-level property.
MetadataProfile$property(name)
name |
Property name. |
MetadataProfile$property_type()
type of a property.
MetadataProfile$property_type(name)
name |
Property name. |
MetadataProfile$property_min_items()
minItems of a property (or NULL).
MetadataProfile$property_min_items(name)
name |
Property name. |
MetadataProfile$property_required()
Nested required keys of an object property (e.g. project).
MetadataProfile$property_required(name)
name |
Property name. |
MetadataProfile$item_required()
Required keys of each item of an array property (e.g. taxonomic items require scientificName).
MetadataProfile$item_required(name)
name |
Property name. |
MetadataProfile$clone()
The objects of this class are cloneable with this method.
MetadataProfile$clone(deep = FALSE)
deep |
Whether to make a deep clone. |
## Not run:
# Fetches the package profile for a Camtrap DP version (needs internet):
prof <- MetadataProfile$new(version = "1.0.1")
prof
## End(Not run)
R6 class holding the Camtrap DP metadata, deployments, media and
observations. This is the schema-driven successor of the original
R6_CamtrapDP: the field names, types, constraints and relations used when
adding tables are read from the Frictionless Table Schemas for the configured
Camtrap DP version, so an arbitrary version and custom columns are handled
automatically.
The public field names and the method names of the original class are
preserved for backward compatibility (set_deployments(), set_media(),
set_observations(), set_custom(), add_contributors(), add_sources(),
add_license(), set_project(), set_st(), set_taxon(),
add_relatedIdentifiers(), add_references(), out_camtrapdp(),
import_metadata()). New, schema-driven capabilities are added as new
methods (get_schema(), add_table(), check_relations(), validate(),
validate_frictionless()).
resources: The data resources of the package.
profile: Profile of the resource.
name: Identifier of the resource.
id: A property reserved for globally unique identifiers.
created: The datetime on which this Data Package was created.
title: Title of this Data Package.
contributors: The people or organizations who contributed.
description: Description of this Data Package.
version: The version of this Data Package.
keywords: Keywords of this Data Package.
image: A URL or path of an image for this Data Package.
homepage: A URL for the home on the web related to this Data Package.
sources: The raw sources for this Data Package.
licenses: The licenses under which the Data Package is provided.
bibliographicCitation: A bibliographical reference for the resource.
project: Camera trap project or study.
coordinatePrecision: Least precise coordinate precision.
spatial: Spatial coverage, expressed as GeoJSON.
temporal: Temporal coverage of this Data Package.
taxonomic: Taxonomic coverage of this Data Package.
relatedIdentifiers: Identifiers of related resources.
references: List of references related to this Data Package.
directory: Directory of this Data Package.
data: Observations, media and deployments tables.
schema_urls: Named list of <version> URL templates per resource.
schemas: Cache of loaded TableSchema objects, keyed by resource.
profile_schema: Cached MetadataProfile for the package profile.
cache_dir: Directory used to cache downloaded schemas.
use_cache: Whether to cache / reuse downloaded schemas.
validation: Per-table validation issues accumulated during build.
CamtrapDP$new(): Creates a new instance.
CamtrapDP$new(tz = "Asia/Tokyo", ...)
tz: Time zone.
...: Passed to set_properties().
CamtrapDP$update_created(): Updates the created timestamp.
CamtrapDP$update_created(tz = "Asia/Tokyo")
tz: Time zone.
CamtrapDP$set_properties(): Sets package properties.
CamtrapDP$set_properties(
directory = getwd(),
name = NULL,
id = NULL,
title = NULL,
description = NULL,
profile =
"https://raw.githubusercontent.com/tdwg/camtrap-dp/<version>/camtrap-dp-profile.json",
version = "1.0.1",
keywords = NULL,
image = NULL,
homepage = NULL,
bibliographicCitation = NULL,
coordinatePrecision = NULL,
schema_urls = NULL,
cache_dir = NULL,
use_cache = NULL
)
directory: Directory of the data package.
name: Identifier of the resource.
id: Globally unique identifier.
title: Title of this Data Package.
description: Description of this Data Package.
profile: Profile of the resource (<version> template).
version: The Camtrap DP version.
keywords: Keywords.
image: Image URL/path.
homepage: Homepage URL.
bibliographicCitation: Bibliographic reference.
coordinatePrecision: Coordinate precision.
schema_urls: Optional named list of <version> schema URL templates.
cache_dir: Optional directory to cache schemas.
use_cache: Whether to cache / reuse downloaded schemas.
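Several arguments here are <version> URL templates. As a minimal sketch of how such a template can be resolved for a given version, assuming a plain literal substitution of the placeholder (the helper resolve_version_url() is hypothetical and not part of the package API):

```r
# Hypothetical helper: replace the literal "<version>" placeholder in a
# URL template with a concrete version string.
resolve_version_url <- function(template, version) {
  stopifnot(is.character(template), length(template) == 1L,
            is.character(version), length(version) == 1L)
  gsub("<version>", version, template, fixed = TRUE)
}

# The default profile template shown in the usage above.
profile_template <- paste0(
  "https://raw.githubusercontent.com/tdwg/camtrap-dp/",
  "<version>/camtrap-dp-profile.json"
)

profile_url <- resolve_version_url(profile_template, "1.0.1")
print(profile_url)
```

Using fixed = TRUE keeps the angle brackets from being interpreted as regular-expression syntax.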
CamtrapDP$get_schema(): Load (and cache) the TableSchema for a resource at the configured version.
CamtrapDP$get_schema(resource, schema_url = NULL, local_path = NULL, json = NULL, refresh = FALSE)
resource: Resource name, e.g. "deployments".
schema_url: Optional <version> URL template (overrides the default).
local_path: Optional local schema file.
json: Optional pre-parsed schema list.
refresh: If TRUE, reload even if cached on the object.
Returns: A TableSchema.
CamtrapDP$set_deployments(): Sets deployments. Backward-compatible signature; when the schema is reachable, the data are coerced and validated against it.
CamtrapDP$set_deployments(data, path = "deployments.csv", profile = "tabular-data-resource", format = "csv", mediatype = "text/csv", encoding = "utf-8", schema = NULL, mapping = NULL, datetime_merges = NULL, validate = TRUE, local_schema = NULL, tz = "Asia/Tokyo")
data: Deployments dataset.
path: Path to the data file.
profile: Profile of the resource.
format: Format of the data.
mediatype: Media type.
encoding: Encoding.
schema: <version> URL template for the table schema. If NULL
(default), the resource's configured template is used (the camera-trap
default, or whatever was set via set_properties(schema_urls=), e.g. a
bioacoustics flavor).
mapping: Optional column mapping (see ctdp_apply_mapping()).
datetime_merges: Optional date/time merge specs.
validate: Whether to validate against the schema.
local_schema: Optional local schema file path.
tz: Time zone for temporal coercion.
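The datetime_merges and tz arguments concern notebooks that store dates and times in separate columns. The exact structure of a merge spec is not described here, so the following is only a base R sketch of the underlying operation, assuming the target is an ISO 8601 datetime string with a UTC offset. The column names are taken from the class example; the output column deploymentStart is an assumption.

```r
# One deployment with separate date and time columns.
deployments <- data.frame(
  deploymentID = "A01",
  deploymentStart_date = "2023-04-01",
  deploymentStart_time = "09:00:00",
  stringsAsFactors = FALSE
)

tz <- "Asia/Tokyo"

# Parse the pasted date and time in the given time zone.
start <- as.POSIXct(
  paste(deployments$deploymentStart_date, deployments$deploymentStart_time),
  format = "%Y-%m-%d %H:%M:%S", tz = tz
)

# "%z" yields an offset such as "+0900"; insert the colon to get "+09:00".
iso <- sub("([0-9]{2})([0-9]{2})$", "\\1:\\2",
           format(start, "%Y-%m-%dT%H:%M:%S%z"))

deployments$deploymentStart <- iso
print(deployments$deploymentStart)
```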
CamtrapDP$set_media(): Sets media. See set_deployments() for shared arguments.
CamtrapDP$set_media(data, path = "media.csv", profile = "tabular-data-resource", format = "csv", mediatype = "text/csv", encoding = "utf-8", schema = NULL, mapping = NULL, datetime_merges = NULL, validate = TRUE, local_schema = NULL, tz = "Asia/Tokyo")
data: Media dataset.
path: Path to the data file.
profile: Profile of the resource.
format: Format of the data.
mediatype: Media type.
encoding: Encoding.
schema: <version> URL template for the table schema.
mapping: Optional column mapping.
datetime_merges: Optional date/time merge specs.
validate: Whether to validate against the schema.
local_schema: Optional local schema file path.
tz: Time zone for temporal coercion.
CamtrapDP$set_observations(): Sets observations. See set_deployments() for shared arguments.
CamtrapDP$set_observations(data, path = "observations.csv", profile = "tabular-data-resource", format = "csv", mediatype = "text/csv", encoding = "utf-8", schema = NULL, mapping = NULL, datetime_merges = NULL, validate = TRUE, local_schema = NULL, tz = "Asia/Tokyo")
data: Observations dataset.
path: Path to the data file.
profile: Profile of the resource.
format: Format of the data.
mediatype: Media type.
encoding: Encoding.
schema: <version> URL template for the table schema.
mapping: Optional column mapping.
datetime_merges: Optional date/time merge specs.
validate: Whether to validate against the schema.
local_schema: Optional local schema file path.
tz: Time zone for temporal coercion.
CamtrapDP$add_table(): Add an arbitrary (custom or standard) resource table,
schema-driven. Unlike set_custom(), this loads a Table Schema, applies the
mapping, coerces, validates, stores the table in $data[[name]] and
appends a proper tabular-data resource.
CamtrapDP$add_table(name, data, mapping = NULL, datetime_merges = NULL, schema_url = NULL, local_schema = NULL, schema_json = NULL, path = NULL, description = NULL, profile = "tabular-data-resource", format = "csv", mediatype = "text/csv", encoding = "utf-8", validate = TRUE, tz = "Asia/Tokyo")
name: Resource name.
data: Dataset.
mapping: Optional column mapping.
datetime_merges: Optional date/time merge specs.
schema_url: Optional <version> URL template for the schema.
local_schema: Optional local schema file path.
schema_json: Optional pre-parsed schema list.
path: Output CSV path (defaults to <name>.csv).
description: Optional resource description.
profile: Resource profile.
format: Resource format.
mediatype: Resource media type.
encoding: Resource encoding.
validate: Whether to validate against the schema.
tz: Time zone for temporal coercion.
CamtrapDP$set_custom(): Sets a custom data resource (original behaviour preserved).
CamtrapDP$set_custom(name, description, data)
name: Name of dataset.
description: Description of dataset.
data: Custom dataset.
CamtrapDP$add_contributors(): Adds contributors.
CamtrapDP$add_contributors(contrib_table)
contrib_table: Data frame of contributors (title, email, path, role, organization).
CamtrapDP$add_sources(): Adds a source.
CamtrapDP$add_sources(title, path = NULL, email = NULL, version = NULL)
title: Title of source.
path: Path or URL to the source.
email: An email address.
version: The version of the source.
CamtrapDP$add_license(): Adds a license.
CamtrapDP$add_license(name, scope, path = NULL, title = NULL)
name: Name of license.
scope: Scope ("data" or "media").
path: URL/path to the license details.
title: Title of license.
CamtrapDP$set_project(): Sets the project.
CamtrapDP$set_project(title, samplingDesign, captureMethod, individualAnimals, observationLevel, id = NULL, acronym = NULL, description = NULL, path = NULL)
title: Title of project.
samplingDesign: Sampling design.
captureMethod: Capture method.
individualAnimals: Logical: are individuals recognised?
observationLevel: Observation level.
id: Project id.
acronym: Project acronym.
description: Project description.
path: Project website.
CamtrapDP$set_st(): Sets spatial and temporal coverage from the deployments.
CamtrapDP$set_st()
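To make "coverage from the deployments" concrete, here is a base R sketch that derives a bounding box and a start/end range from a small deployments table. It assumes the spatial coverage is a GeoJSON polygon around the bounding box and that temporal coverage is a pair of dates; the geometry the package actually writes may differ, and the example rows are invented.

```r
# Invented deployments with coordinates and ISO 8601 datetimes.
deployments <- data.frame(
  deploymentID = c("A01", "A02"),
  longitude = c(139.5, 139.7),
  latitude  = c(35.1, 35.4),
  deploymentStart = c("2023-04-01T09:00:00+09:00", "2023-04-10T09:00:00+09:00"),
  deploymentEnd   = c("2023-05-01T09:00:00+09:00", "2023-05-20T09:00:00+09:00"),
  stringsAsFactors = FALSE
)

xmin <- min(deployments$longitude); xmax <- max(deployments$longitude)
ymin <- min(deployments$latitude);  ymax <- max(deployments$latitude)

# Closed linear ring (first position repeated at the end), as GeoJSON expects.
ring <- rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
              c(xmin, ymax), c(xmin, ymin))

spatial <- list(
  type = "Polygon",
  bbox = c(xmin, ymin, xmax, ymax),
  coordinates = list(ring)
)

# Temporal coverage: earliest start date and latest end date.
temporal <- list(
  start = format(min(as.Date(substr(deployments$deploymentStart, 1, 10)))),
  end   = format(max(as.Date(substr(deployments$deploymentEnd, 1, 10))))
)

str(spatial$bbox)
str(temporal)
```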
CamtrapDP$set_taxon(): Sets taxonomic coverage from the observations. The Camtrap DP
taxonomic block requires a taxonID (e.g. a GBIF / IUCN identifier or
URI), which is looked up with taxadb; taxadb is therefore a required
dependency of the package (loaded with it, as in previous versions).
Names that cannot be matched get taxonID = NA, which is omitted from the
output rather than written as a bogus <uri>NA. A warning() is emitted for
scientificName values with unnecessary whitespace and for names with no
taxonID in the chosen database.
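The whitespace warning can be anticipated before calling the method. The sketch below flags scientificName values that change when leading, trailing and repeated internal whitespace is removed. What the package treats as "unnecessary whitespace" is an assumption here, and the names are example values only.

```r
sci <- c("Sus scrofa", " Cervus nippon", "Nyctereutes  procyonoides", NA)

# Trim the ends and collapse runs of whitespace to a single space.
squished <- gsub("[[:space:]]+", " ", trimws(sci))

# TRUE where the cleaned name differs from the original (NA stays FALSE).
needs_fix <- !is.na(sci) & sci != squished

data.frame(original = sci, cleaned = squished, needs_fix = needs_fix)
```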
CamtrapDP$set_taxon(taxonDB = "gbif")
taxonDB: Taxon database passed to taxadb: "gbif" (default),
"itis" or "ncbi".
CamtrapDP$add_relatedIdentifiers(): Adds a relatedIdentifier.
CamtrapDP$add_relatedIdentifiers(relationType, relatedIdentifier, relatedIdentifierType, resourceTypeGeneral = NULL)
relationType: Type of relation.
relatedIdentifier: Related identifier.
relatedIdentifierType: Type of related identifier.
resourceTypeGeneral: General type of the related resource.
CamtrapDP$add_references(): Adds references.
CamtrapDP$add_references(reference)
reference: Reference of data.
CamtrapDP$get_profile(): Load (and cache) the MetadataProfile for the package
profile (the JSON Schema that Frictionless validates datapackage.json
against). Uses self$profile (set by set_properties()), so it follows
the configured version / flavor.
CamtrapDP$get_profile(refresh = FALSE)
refresh: Reload even if cached.
CamtrapDP$metadata_requirements(): The metadata requirements derived from the package profile: which top-level properties are required, their type, nested required keys, the method that creates each, and whether it is currently set.
CamtrapDP$metadata_requirements()
Returns: A tibble (property, required, type, sub_required,
item_required, set_with, currently_set).
CamtrapDP$check_metadata(): Validate the current metadata against the package profile's required structure (a fast R-side counterpart to the profile validation that Frictionless performs). Reports missing required top-level properties and missing nested/required item keys.
CamtrapDP$check_metadata(summarize = TRUE)
summarize: Whether to print a summary.
Returns: An issue table (see ctdp_issues()).
CamtrapDP$check_descriptor(): Pre-check that the package descriptor and its table schemas
conform to the Frictionless specification, before the authoritative
Python Frictionless step. Checks the package structure (non-empty
resources; each resource has name and path/data; unique names;
profile set) and the well-formedness of every loaded / inline table
schema (ctdp_check_schema()). Optionally runs a full JSON Schema
validation of the written descriptor when jsonschema is given and the
jsonvalidate package is installed.
CamtrapDP$check_descriptor(check_schemas = TRUE, jsonschema = NULL, summarize = TRUE)
check_schemas: Also check each table schema's well-formedness.
jsonschema: Optional path/URL to a JSON Schema to validate the
serialized descriptor against (requires the jsonvalidate package).
summarize: Whether to print a summary.
Returns: An issue table (see ctdp_issues()).
CamtrapDP$check_camtrap_profile(): Warn if the package profile is not a Camtrap DP profile.
A valid Frictionless data package is a Camtrap DP package only if its
profile is the Camtrap DP profile.
CamtrapDP$check_camtrap_profile(summarize = TRUE)
summarize: Whether to print a summary.
Returns: An issue table (see ctdp_issues()).
CamtrapDP$check_relations(): Check primary-key and foreign-key relations across the registered tables, driven by each table's Table Schema.
CamtrapDP$check_relations(summarize = TRUE)
summarize: If TRUE, print a summary of any issues.
Returns: An issue table (see ctdp_issues()).
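For orientation, primary-key and foreign-key checks reduce to two simple tests: key values must be present and unique, and every child value must exist in the parent key. The base R sketch below illustrates both on invented tables. The helper names pk_violations() and fk_violations() are hypothetical, and the package derives the keys from each Table Schema rather than taking them as arguments.

```r
deployments <- data.frame(
  deploymentID = c("A01", "A02"),
  stringsAsFactors = FALSE
)
observations <- data.frame(
  observationID = c("o1", "o2", "o3"),
  deploymentID  = c("A01", "A02", "B99"),  # "B99" has no parent row
  stringsAsFactors = FALSE
)

# Primary key: rows whose key is missing or duplicated.
pk_violations <- function(df, key) {
  v <- df[[key]]
  which(is.na(v) | duplicated(v) | duplicated(v, fromLast = TRUE))
}

# Foreign key: rows whose non-missing value is absent from the parent key.
fk_violations <- function(child, child_key, parent, parent_key) {
  v <- child[[child_key]]
  which(!is.na(v) & !(v %in% parent[[parent_key]]))
}

pk_rows <- pk_violations(deployments, "deploymentID")
fk_rows <- fk_violations(observations, "deploymentID",
                         deployments, "deploymentID")

pk_rows                              # no violating rows
observations$deploymentID[fk_rows]   # the orphaned value
```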
CamtrapDP$external_references(): List every external (URL) reference declared across the
package: the package profile, each resource's schema URL (or the
references inside an inline schema), and the semantic / description-URL
references of every loaded table schema. Use this when adopting a new
version or schema flavor so that no URL-specified requirement is missed.
CamtrapDP$external_references()
Returns: A tibble (resource, field, key, category, url).
CamtrapDP$validate(): Aggregate validation: per-table schema issues collected at build time plus cross-table relation checks. Optionally also runs the Python Frictionless validation.
CamtrapDP$validate(relations = TRUE, metadata = FALSE, conformance = FALSE, frictionless = FALSE, summarize = TRUE, ...)
relations: Whether to run check_relations().
metadata: Whether to run check_metadata() (profile-driven).
conformance: Whether to run the Frictionless conformance pre-checks
check_descriptor() and check_camtrap_profile().
frictionless: Whether to also run validate_frictionless().
summarize: Whether to print a summary.
...: Passed to validate_frictionless().
Returns: An issue table.
CamtrapDP$validate_frictionless(): Run the Python Frictionless validation on the written data package and parse the report into an issue table.
CamtrapDP$validate_frictionless(directory = NULL, python = "python", script = NULL, write = TRUE, patch_profile = TRUE, summarize = TRUE)
directory: Directory containing (or to receive) the data package. Defaults to a temporary directory; the package is written there first.
python: Path to the Python interpreter.
script: Path to frictionless_validate.py. If NULL, resolved
from getOption("camtrapdp.frictionless_script"), else the installed
copy (system.file("python/frictionless_validate.py", package = "R2camtrapdp")), else the loose-source python/frictionless_validate.py.
write: Whether to (re)write the data package before validating.
patch_profile: If TRUE, work around the malformed internal
$ref (#$defs/version) in the Camtrap DP 1.0 profile by validating
against a locally corrected copy. The written datapackage.json keeps
the canonical profile URL; only a separate validation descriptor is
patched. Has no effect for 1.0.1 / 1.0.2 (their profiles are correct).
summarize: Whether to print a summary.
Returns: An issue table (engine "frictionless").
CamtrapDP$out_camtrapdp(): Exports the camtrapdp object and (optionally) writes the
data package to disk.
CamtrapDP$out_camtrapdp(write = FALSE, directory = NULL)
write: If TRUE, write the data package to directory.
directory: Output directory.
Returns: A camtrapdp object (list).
CamtrapDP$import_metadata(): Imports metadata from a list.
CamtrapDP$import_metadata(metadata0)
metadata0: List of metadata.
CamtrapDP$clone(): The objects of this class are cloneable with this method.
CamtrapDP$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Create the builder (offline):
dp <- R6_CamtrapDP$new(version = "1.0.1", title = "Example", description = "...")
## Not run:
# Registering tables fetches the schema for the chosen version (needs internet):
deployments <- create_deployments(
  deploymentID = "A01", latitude = 35.1, longitude = 139.5,
  deploymentStart_date = "2023-04-01", deploymentStart_time = "09:00:00",
  deploymentEnd_date = "2023-05-01", deploymentEnd_time = "09:00:00")
dp$set_deployments(deployments)
dp$check_relations()
dp$out_camtrapdp(write = TRUE, directory = tempfile())
## End(Not run)
TableSchema loads a Frictionless Table Schema (such as the Camtrap DP
deployments / media / observations table schemas) from a URL, a local
file, or an in-memory list, and exposes everything needed to build and
validate a data table against it: field names and types, the primary key,
foreign keys, missing-value tokens, and per-field constraints
(required, unique, enum, minimum, maximum, minLength,
maxLength, pattern, type, format).
Because the structure is read from the schema itself, an arbitrary Camtrap DP version and any custom / extra columns are handled automatically: any field present in the supplied schema participates in validation.
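As an illustration of what per-field constraints mean in practice, the following base R sketch checks one column against a constraints list shaped like a Frictionless field's constraints. The function check_field() is hypothetical, covers only a few of the constraints listed above, and the latitude bounds are example values; the package's own validation is more complete.

```r
# An example field definition in Table Schema shape.
field <- list(
  name = "latitude", type = "number",
  constraints = list(required = TRUE, minimum = -90, maximum = 90)
)

# Return the names of the constraints that a vector violates.
check_field <- function(x, constraints) {
  problems <- character(0)
  if (isTRUE(constraints$required) && anyNA(x))
    problems <- c(problems, "required")
  if (isTRUE(constraints$unique) && anyDuplicated(x[!is.na(x)]) > 0L)
    problems <- c(problems, "unique")
  if (!is.null(constraints$enum) &&
      any(!is.na(x) & !(x %in% constraints$enum)))
    problems <- c(problems, "enum")
  if (!is.null(constraints$minimum) &&
      any(x < constraints$minimum, na.rm = TRUE))
    problems <- c(problems, "minimum")
  if (!is.null(constraints$maximum) &&
      any(x > constraints$maximum, na.rm = TRUE))
    problems <- c(problems, "maximum")
  problems
}

# One missing value and one value above the maximum.
violated <- check_field(c(35.1, NA, 95), field$constraints)
print(violated)
```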
resource: Resource name, e.g. "deployments".
version: Camtrap DP version used to resolve the URL.
url: Resolved schema URL (if loaded from a URL).
name: Schema name.
title: Schema title.
description: Schema description.
fields: Named list of field definitions keyed by field name.
field_order: Character vector of field names, in schema order.
primaryKey: Character vector naming the primary key field(s).
foreignKeys: List of foreign-key definitions.
missingValues: Character vector of missing-value tokens.
raw: The raw parsed schema (list).
TableSchema$new(): Create a TableSchema.
TableSchema$new(resource = NULL, version = "1.0.1", url_template = NULL, local_path = NULL, json = NULL, cache_dir = file.path(tempdir(), "camtrapdp-schemas"), use_cache = TRUE)
resource: Resource name (used to pick a default URL template and to
label issues), e.g. "deployments".
version: Camtrap DP version string, e.g. "1.0.1".
url_template: URL containing the <version> placeholder. If NULL
and resource is one of the standard tables, a default template is used.
local_path: Path to a local schema JSON file. Takes precedence over the URL.
json: A pre-parsed schema list. Takes precedence over local_path.
cache_dir: Directory used to cache downloaded schemas.
use_cache: If TRUE, reuse a cached copy when present and cache new
downloads.
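The precedence stated above (json over local_path over the URL) can be pictured as a simple chain of checks. This is only a sketch of that ordering; pick_schema_source() is a hypothetical helper, not a function of the package.

```r
# Report which source would be used, following the documented precedence.
pick_schema_source <- function(json = NULL, local_path = NULL, url = NULL) {
  if (!is.null(json)) return("json")
  if (!is.null(local_path)) return("local_path")
  if (!is.null(url)) return("url")
  stop("No schema source supplied")
}

# Placeholder values; no file or URL is actually read.
src_all  <- pick_schema_source(json = list(fields = list()),
                               local_path = "deployments-table-schema.json",
                               url = "https://example.org/schema.json")
src_file <- pick_schema_source(local_path = "deployments-table-schema.json",
                               url = "https://example.org/schema.json")
print(c(src_all, src_file))
```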
TableSchema$field_names(): Field names in schema order.
TableSchema$field_names()
TableSchema$field(): Get a single field definition by name.
TableSchema$field(name)
name: Field name.
TableSchema$required_field_names(): Names of fields whose constraints$required is TRUE.
TableSchema$required_field_names()
TableSchema$field_type(): Type of a field ("string", "number", ...).
TableSchema$field_type(name)
name: Field name.
TableSchema$requirements(): A tidy summary of the schema's requirements: one row per
field with its type, format, and constraints (required, unique,
enum, minimum, maximum, pattern). Works for any version / flavor.
TableSchema$requirements()
Returns: A tibble.
TableSchema$empty_table(): Create an empty (0-row) tibble shell with one correctly typed column per schema field, in schema order.
TableSchema$empty_table()
Returns: A tibble with 0 rows.
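A typed, zero-row shell can be pictured as one zero-length vector per field. The sketch below builds one with base R from a small in-memory field list. It returns a plain data frame rather than a tibble, and the type mapping in empty_col() is an assumption covering only a few Table Schema types.

```r
fields <- list(
  list(name = "deploymentID", type = "string"),
  list(name = "latitude",     type = "number"),
  list(name = "baitUse",      type = "boolean")
)

# Zero-length vector of the R type matching a Table Schema type.
empty_col <- function(type) {
  switch(type,
    number  = numeric(0),
    integer = integer(0),
    boolean = logical(0),
    character(0)  # strings and any type not handled in this sketch
  )
}

cols <- lapply(fields, function(f) empty_col(f$type))
names(cols) <- vapply(fields, function(f) f$name, character(1))

shell <- as.data.frame(cols, stringsAsFactors = FALSE)
str(shell)
```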
TableSchema$coerce(): Coerce a data frame to the schema. The result contains
every schema field, in schema order: present columns are cast to
the schema type (date/datetime/time formatted as canonical strings) and
fields absent from the input are added as typed NA columns (Camtrap DP
CSVs are expected to carry all schema columns). Columns not present in
the schema are kept after the schema columns (custom columns) and a
warning is emitted listing them.
TableSchema$coerce(df, tz = "Asia/Tokyo", complete = TRUE)
df: A data.frame / tibble.
tz: Time zone used when formatting date/datetime values supplied as
POSIXt / Date.
complete: If TRUE (default), include all schema fields, filling
absent ones with typed NA. If FALSE, keep only the supplied columns.
Returns: A tibble.
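The coercion rules described above (cast present columns, add typed NA columns for absent fields, keep custom columns last with a warning) can be sketched in base R as follows. This is an illustration of the described behaviour under simplifying assumptions, not the package's implementation: coerce_sketch() is hypothetical, handles only a few types, ignores date/time formatting and returns a data frame.

```r
fields <- list(
  list(name = "deploymentID", type = "string"),
  list(name = "latitude",     type = "number"),
  list(name = "longitude",    type = "number")
)

# NA vector of the R type matching a Table Schema type.
typed_na <- function(type, n) {
  switch(type,
    number  = rep(NA_real_, n),
    integer = rep(NA_integer_, n),
    boolean = rep(NA, n),
    rep(NA_character_, n)
  )
}

# Cast a column to the R type matching a Table Schema type.
cast <- function(x, type) {
  switch(type,
    number  = suppressWarnings(as.numeric(x)),
    integer = suppressWarnings(as.integer(x)),
    boolean = as.logical(x),
    as.character(x)
  )
}

coerce_sketch <- function(df, fields) {
  n <- nrow(df)
  schema_names <- vapply(fields, function(f) f$name, character(1))
  cols <- lapply(fields, function(f) {
    if (f$name %in% names(df)) cast(df[[f$name]], f$type) else typed_na(f$type, n)
  })
  names(cols) <- schema_names
  out <- as.data.frame(cols, stringsAsFactors = FALSE)
  extra <- setdiff(names(df), schema_names)
  if (length(extra) > 0L) {
    warning("Columns not in schema kept as custom columns: ",
            paste(extra, collapse = ", "))
    out[extra] <- df[extra]
  }
  out
}

input <- data.frame(latitude = c("35.1", "35.2"),
                    deploymentID = c("A01", "A02"),
                    note = c("x", "y"),
                    stringsAsFactors = FALSE)

# The warning about the custom column "note" is silenced for the demo.
result <- suppressWarnings(coerce_sketch(input, fields))
str(result)
```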
TableSchema$external_references(): List the external (URL) references this schema declares
(semantic skos:* mappings, reference URLs in field descriptions, the
schema URL itself). See ctdp_schema_references().
TableSchema$external_references()
TableSchema$semantic_only_fields(): List fields whose meaning is defined only by reference (a
semantic mapping or a description URL) and which therefore cannot be
fully validated against a controlled vocabulary. See
ctdp_semantic_only_fields().
TableSchema$semantic_only_fields()
TableSchema$check_schema(): Check that this schema is a well-formed Frictionless Table
Schema (supported types, constraints valid for each type, keys reference
defined fields). See ctdp_check_schema().
TableSchema$check_schema()
TableSchema$validate(): Validate a data frame against the schema constraints.
TableSchema$validate(df, source = paste0(self$resource %||% self$name, ".csv"), raw = NULL)
df: A data.frame / tibble (ideally already passed through
$coerce()).
source: Label used as the issue source (e.g. "deployments.csv").
raw: Optional pre-coercion data (the original input passed to
$coerce()). When supplied, values that were present in raw but became
NA during coercion (type-invalid entries such as a non-numeric
string in a number field) are reported as type errors instead of
silently vanishing. ctdp_build_table() passes this automatically.
Returns: An issue table (see ctdp_issues()).
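The role of raw can be shown with a single column: a value counts as type-invalid when it was present before coercion but is NA afterwards. A minimal base R sketch of that comparison, assuming a number field and invented input values:

```r
raw     <- c("35.1", "abc", NA, "36")
coerced <- suppressWarnings(as.numeric(raw))

# Present in the raw input, but lost during coercion.
type_invalid <- which(!is.na(raw) & is.na(coerced))

raw[type_invalid]   # the offending entry
```

The genuinely missing third value is not flagged, because it was already NA in the raw input.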
TableSchema$clone(): The objects of this class are cloneable with this method.
TableSchema$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Build from an in-memory schema (offline):
sch <- list(
  name = "deployments",
  fields = list(
    list(name = "deploymentID", type = "string", constraints = list(required = TRUE)),
    list(name = "latitude", type = "number")),
  primaryKey = "deploymentID")
schema <- TableSchema$new("deployments", json = sch)
schema$field_names()
schema$required_field_names()
## Not run:
# Or fetch the official schema for a Camtrap DP version (needs internet):
TableSchema$new("deployments", version = "1.0.1")
## End(Not run)
Example deployment data for a single camera trap (at NIES, Japan), used in the single-camera vignette. One row.
Vdep
A data frame with 1 row and 14 variables:
Unique identifier of the deployment.
Longitude in decimal degrees (WGS84).
Latitude in decimal degrees (WGS84).
Identifier of the deployment location.
Deployment start date.
Deployment start time.
Deployment end date.
Deployment end time.
Identifier of the camera.
Manufacturer and model of the camera.
Camera delay.
Height at which the camera was deployed.
Whether bait was used.
Name or identifier of the person/organization that deployed the camera.
The original data are available at doi:10.34462/0002000233
Example observation data for the Vdep single camera-trap deployment, used in
the single-camera vignette. One row per observation; filename is the video
file from which the media table is built.
Vobs
A data frame with 38 rows and 13 variables:
Institution code.
Collection code.
Video identifier.
Identifier of the deployment location.
Date the video was recorded.
Time the video was recorded.
Recorded object category.
Taxonomic class of the observed organism.
Genus of the observed organism.
Species epithet of the observed organism.
Number of observed individuals.
Identifier of the SD card.
Name of the video file (used to build media).
The original data are available at doi:10.34462/0002000233