Package 'umweltapir' reference manual

Title:	Access 'umwelt.info' API
Description:	Provides R-based access to the datasets, including their resources, from the portal <https://umwelt.info>. The package makes it easy to integrate these datasets into your R-based workflows. Its functionality mirrors the web-based access provided at <https://umwelt.info>: you can use the same queries and get the same datasets by accessing our API.
Authors:	Johannes Vogel [aut, cre], Maximilian Berthold [aut], Luise Quoß [ctb], Nationales Zentrum für Umwelt- und Naturschutzinformationen [cph]
Maintainer:	Johannes Vogel <[email protected]>
License:	MIT + file LICENSE | Apache License 2.0
Version:	0.2.1
Built:	2026-06-24 10:41:21 UTC
Source:	https://github.com/cran/umweltapir

Download resources from umwelt.info

Description

Download all resources attached to the datasets returned by a query.

Usage

unnest_and_filter(
  data,
  formats = c("Csv", "Zip", "Json", "JsonLd", "GeoJson", "Tsv", "Pdf",
    "MicrosoftExcelSpreadsheet"),
  description_regex = NULL
)

preview_resources(data_preprocessed)

download_resources(data_preprocessed, base_dir = tempdir())

Arguments

data

An R data frame as returned by one of the functions fetch_by_url, fetch_by_query or fetch_by_ids.

formats

A character vector of accepted resource formats. For the list of existing formats, see ResourceType in the list of Schemas at the bottom of the API documentation at https://md.umwelt.info/swagger-ui/.

description_regex

A string containing a regular expression; only resources whose description matches it are kept. Default: NULL.

data_preprocessed

An unnested and optionally filtered R data frame, as returned by unnest_and_filter.

base_dir

A directory where downloaded resources should be stored.
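
The effect of formats and description_regex can be pictured with a small base R sketch on mock data. This is not the package's implementation: the helper filter_resources() and the column names format and description are assumptions made purely for illustration.

```r
# Mock resource table. The column names `format` and `description` are
# illustrative assumptions, not the package's actual schema.
resources <- data.frame(
  format = c("Csv", "Pdf", "MicrosoftExcelSpreadsheet", "Csv"),
  description = c(
    "Ozon Messwerte 2020",
    "Jahresbericht Luft",
    "Ozon Stationen",
    "Feinstaub Messwerte"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose format is accepted and, if a regular
# expression is given, whose description matches it.
filter_resources <- function(df, formats, description_regex = NULL) {
  keep <- df$format %in% formats
  if (!is.null(description_regex)) {
    keep <- keep & grepl(description_regex, df$description)
  }
  df[keep, , drop = FALSE]
}

filtered <- filter_resources(
  resources,
  formats = c("MicrosoftExcelSpreadsheet", "Csv"),
  description_regex = "Ozon"
)
print(filtered)
```

With description_regex = NULL the sketch filters by format only.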

Value

No return value (resources are downloaded into the respective folder)

Examples

# The download workflow consists of four steps. First, you fetch the list of all datasets
# matching your query.
# The input link for the query can be generated in the interface of https://umwelt.info.
# See the tutorial
# https://umwelt.info/artikel/so-laden-sie-daten-bei-umweltinfo-mit-python-und-r-herunter
# for further details.
# Second, the required columns are unnested. You can optionally filter for certain file
# formats (here "MicrosoftExcelSpreadsheet" and "Csv") and keep only those entries whose
# resource description matches description_regex (here "Ozon").
# Note that the unnesting is a prerequisite for preview_resources() and download_resources().
# Third, you create a preview of the resources that would be downloaded.
# Fourth, if you want to proceed, you start the download.
if (interactive()) {
  url <-
    "https://md.umwelt.info/search/all?query=(Ozon)+AND+organisation%3A%2FLand%2FBayern%2Fopen.bydata"
  results <- fetch_by_url(url, columns = "resource_only") |>
    unnest_and_filter(
      formats = c("MicrosoftExcelSpreadsheet", "Csv"),
      description_regex = "Ozon"
    ) |>
    preview_resources()
  results |> download_resources(base_dir = tempdir())
}
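
If you want to inspect the downloaded files afterwards, one option is to pass a dedicated folder as base_dir. The following is a minimal sketch, assuming the package is installed. The folder name umweltinfo_downloads is arbitrary, and the layout of files below base_dir is not specified here, so the listing is done recursively.

```r
# Dedicated download folder below the session's temporary directory.
# The folder name is an arbitrary choice for this sketch.
download_dir <- file.path(tempdir(), "umweltinfo_downloads")
dir.create(download_dir, recursive = TRUE, showWarnings = FALSE)

# The network part only runs interactively and only if the package is available.
if (interactive() && requireNamespace("umweltapir", quietly = TRUE)) {
  url <-
    "https://md.umwelt.info/search/all?query=(Ozon)+AND+organisation%3A%2FLand%2FBayern%2Fopen.bydata"
  umweltapir::fetch_by_url(url, columns = "resource_only") |>
    umweltapir::unnest_and_filter(formats = "Csv", description_regex = "Ozon") |>
    umweltapir::download_resources(base_dir = download_dir)
}

# List whatever ended up in the folder (empty if nothing was downloaded).
downloaded <- list.files(download_dir, recursive = TRUE)
print(downloaded)
```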

Fetch data from umwelt.info

Description

These functions allow you to retrieve datasets from the umwelt.info metadata search API either by providing a search query or a complete URL.
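
The two entry points are related: a search query can be turned into a complete URL by percent-encoding it and appending it to the search endpoint used in the examples of this manual. The following base R sketch illustrates this. The helper build_search_url() is hypothetical, and it is an assumption that the API treats spaces encoded as %20 the same way as the + found in URLs generated by the web interface.

```r
# Search endpoint as it appears in the examples of this manual.
base_url <- "https://md.umwelt.info/search/all?query="

# Hypothetical helper: percent-encode the query (including reserved characters
# such as ":" and "/") and append it to the endpoint.
build_search_url <- function(query) {
  paste0(base_url, utils::URLencode(query, reserved = TRUE))
}

search_url <- build_search_url("Ozon AND license:/Offen")
print(search_url)
```

The resulting string could then be passed to fetch_by_url(), while the unencoded query is what fetch_by_query() expects.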

Usage

fetch_by_query(query, language = "de", columns = NULL)

fetch_by_url(url, columns = NULL)

fetch_by_ids(ids)

fetch_facet_values(name = "type")

Arguments

query

A character vector containing the search query (e.g., "Ozon").

language

A string to determine the language of the search results. Possible values: de and en. Default: de (German).

columns

Either a vector of strings containing the selected columns or "resource_only" as a shortcut to select the columns "source", "id", "resources", "title" and "quality". Possible values: source, id, title, description, types, comment, license, mandatory_registration, organisations, persons, tags, regions, issued, modified, update_frequency, source_url, source_url_explainer, source_url_type, machine_readable_source, alternatives, resources, language, bounding_boxes, time_ranges, global_identifier, quality, status, last_harvest, resource_only

url

A character string containing the full API request URL.

ids

A character vector containing the ID(s) of the datasets.

name

A character string containing the name of the facet for which all possible values will be returned (list). These can be used to create a new query. Possible values: type, topic, organisation, license, language and resource_type. Default value: type.
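
The "resource_only" shortcut for columns described above can be pictured as a simple expansion step. The helper resolve_columns() below is hypothetical and only mirrors the documented behaviour; how the package resolves the argument internally, and what exactly NULL selects, is not stated here.

```r
# Columns selected by the "resource_only" shortcut, as listed in the
# description of the `columns` argument.
resource_only_columns <- c("source", "id", "resources", "title", "quality")

# Hypothetical helper: expand the shortcut, pass everything else through.
resolve_columns <- function(columns = NULL) {
  if (is.null(columns)) {
    return(NULL) # assumption: NULL means no explicit column selection
  }
  if (identical(columns, "resource_only")) {
    return(resource_only_columns)
  }
  columns
}

print(resolve_columns("resource_only"))
print(resolve_columns(c("id", "title", "license")))
```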

Value

A tibble containing the dataset entries. Returns an empty tibble if no results are found.
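
Because an empty result is returned rather than an error, it can be worth checking the number of rows before continuing. The sketch below uses plain data frames as stand-ins for the returned tibbles (a tibble is also a data frame); the helper has_results() and the mock values are made up for illustration.

```r
# Hypothetical helper: TRUE if the result is a data frame with at least one row.
has_results <- function(result) {
  is.data.frame(result) && nrow(result) > 0
}

# Stand-ins for fetch results; the values are invented.
empty_result <- data.frame(id = character(0), title = character(0))
some_result <- data.frame(id = "example-id", title = "Example title")

for (res in list(empty_result, some_result)) {
  if (has_results(res)) {
    message("Found ", nrow(res), " dataset(s).")
  } else {
    message("No datasets found; consider relaxing the query.")
  }
}
```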

Examples

# Example 1: Fetching by a direct URL
if (interactive()) {
  api_url <- "https://md.umwelt.info/search/all?query=Luftqualität"
  result_list <- fetch_by_url(api_url)
}

# Example 2: Fetching by query string
# For background on how to build a query, see https://md.umwelt.info/swagger-ui/#/search/text_search
# If you want to know which facet values exist for a certain facet, you can use
# fetch_facet_values (see example 5).
if (interactive()) {
  result_list <- fetch_by_query("organisation:/Land/Bayern/open.bydata AND Ozon AND license:/Offen")
}

# Example 3: Select a subset of columns and unnest columns (here the column "resources" is unnested
# into its subcolumns "type", "url", "description", "direct_link" and "primary_content") and in a
# second step "type" is further unnested into "path" and "label".
if (interactive()) {
  result_list <- fetch_by_query("(Ozon) AND organisation:/Land/Bayern/open.bydata")
  colnames(result_list) # columns before unnesting
  result_list <- result_list |>
    tidyr::unnest(col = c("source")) |>
    (\(df) df[, c("source", "id", "resources", "title", "quality"), drop = FALSE])() |>
    tidyr::unnest(col = c("resources")) |>
    tidyr::unnest(col = c("type"))
  colnames(result_list) # columns after unnesting
}

# Example 4: Fetching by a list of dataset IDs. This can be useful, e.g., for downloading resources.
# After using preview_resources() you can select a subset from the preview list using
# fetch_by_ids() and forward it as input to download_resources().
ids <- c(
  "uvk-be/-sen-uvk-umwelt-luft-luftqualitaet-",
  "lanuk-nrw/-publikationen-publikation-bericht-ueber-die-luftqualitaet-im-jahre-2014",
  "metaver-hb/7F0A29F5-ECBC-476D-9C99-DC1A6A8043D0"
)
datasets <- fetch_by_ids(ids)
# seq_len() avoids iterating over c(1, 0) when no datasets are returned
for (i in seq_len(nrow(datasets))) {
  print(datasets[i, ]$title)
}

# Example 5: Fetching all possible values for the facet name organisation. This can be useful,
# e.g., if you want to restrict the results to certain organisations when building your own
# query, so you know which organisations are available.
name <- "organisation"
organisations <- fetch_facet_values(name)
head(organisations)

# Example 6: Fetch multiple facets at the same time
# If you want to fetch more than one facet, the easiest way is to use fetch_by_query() for this.
if (interactive()) {
  result_list <- fetch_by_query(
    query =
      "organisation:/Bund/Destatis OR organisation:'/Land/Statistische Ämter des Bundes und der Länder'"
  )
}
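
Query strings that combine several values of one facet, like the one in Example 6, can also be assembled programmatically. The following base R sketch uses two hypothetical helpers, quote_if_needed() and build_facet_query(). Wrapping values that contain spaces in single quotes follows the pattern seen in Example 6; whether other characters need quoting as well is not covered here.

```r
# Hypothetical helper: wrap a facet value in single quotes if it contains a space.
quote_if_needed <- function(value) {
  ifelse(grepl(" ", value, fixed = TRUE), paste0("'", value, "'"), value)
}

# Hypothetical helper: combine several values of one facet with a boolean operator.
build_facet_query <- function(facet, values, operator = "OR") {
  terms <- paste0(facet, ":", quote_if_needed(values))
  paste(terms, collapse = paste0(" ", operator, " "))
}

query <- build_facet_query(
  "organisation",
  c("/Bund/Destatis", "/Land/Statistische Ämter des Bundes und der Länder")
)
print(query)
```

The resulting string has the same shape as the query in Example 6 and could be passed to fetch_by_query(). Facet values themselves can be looked up with fetch_facet_values().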
