| Title: | Normalize and Match City Names to NUTS Regions |
|---|---|
| Description: | Normalizes city names for EEA countries and matches them to NUTS 3 regions using provided crosswalks. Features include comprehensive normalization rules, cascading matching logic (Exact NUTS -> Exact LAU -> Fuzzy), and single-source data synthesis. The package implements the NUTS classification as described in the NUTS methodology (Eurostat (2021) <https://ec.europa.eu/eurostat/web/nuts>). |
| Authors: | Giulian Etingin-Frati [aut, cre] |
| Maintainer: | Giulian Etingin-Frati <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.4 |
| Built: | 2026-06-04 07:37:09 UTC |
| Source: | https://github.com/cran/placematchr |
Generates a vector of fake city names for testing, including common variations and noise.
    generate_fake_cities(n = 10, country = "DE")
| Argument | Description |
|---|---|
| n | Integer, number of cities to generate. |
| country | Two-letter country code, either "DE" or "CH". |
Character vector of city names.
    # Generate 5 fake German cities
    generate_fake_cities(5, country = "DE")

    # Generate 3 fake Swiss cities
    generate_fake_cities(3, country = "CH")
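To illustrate what "common variations and noise" can mean for test input, here is a minimal base-R sketch. It does not reproduce the internals of `generate_fake_cities()`. The helper `make_noisy_names()` and its three kinds of noise (upper-casing, stray whitespace, an appended qualifier) are assumptions made purely for illustration.

```r
# Hypothetical helper, NOT part of placematchr: draws names from a base list
# and applies one randomly chosen perturbation to each.
make_noisy_names <- function(base_names, n = 10, seed = 1) {
  set.seed(seed)  # fixes the RNG state so the sketch is reproducible
  picks   <- sample(base_names, n, replace = TRUE)
  variant <- sample(c("upper", "padded", "suffix", "asis"), n, replace = TRUE)

  out <- picks
  out[variant == "upper"]  <- toupper(picks[variant == "upper"])
  out[variant == "padded"] <- paste0("  ", picks[variant == "padded"], " ")
  out[variant == "suffix"] <- paste0(picks[variant == "suffix"], " (Stadt)")
  out
}

noisy <- make_noisy_names(c("Berlin", "Hamburg", "Bremen"), n = 5)
print(noisy)
```

Feeding such perturbed names through a normalizer and matcher is a quick way to check that both tolerate messy input.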
Datasets containing mappings from city names to LAU codes and NUTS 3 regions for various countries. The data handles string normalization and matches cities to their respective statistical regions.
    lau_at lau_be lau_bg lau_ch lau_cy lau_cz lau_de lau_dk lau_ee lau_el lau_es lau_fi lau_fr lau_hr lau_hu lau_ie lau_it lau_li lau_lt lau_lu lau_lv lau_mk lau_mt lau_nl lau_no lau_pl lau_pt lau_ro lau_se lau_si lau_sk lau_tr
Data frames with varying columns depending on the country, typically including:
Local Administrative Unit code
Name of the Local Administrative Unit
NUTS 3 region code
Population (if available)
Each dataset is an object of class data.frame with 5 columns. Row counts, in the order the datasets are listed above:

| Dataset | Rows | Columns |
|---|---|---|
| lau_at | 2093 | 5 |
| lau_be | 571 | 5 |
| lau_bg | 265 | 5 |
| lau_ch | 2135 | 5 |
| lau_cy | 617 | 5 |
| lau_cz | 6258 | 5 |
| lau_de | 10972 | 5 |
| lau_dk | 99 | 5 |
| lau_ee | 79 | 5 |
| lau_el | 6142 | 5 |
| lau_es | 8132 | 5 |
| lau_fi | 309 | 5 |
| lau_fr | 32774 | 5 |
| lau_hr | 556 | 5 |
| lau_hu | 3155 | 5 |
| lau_ie | 166 | 5 |
| lau_it | 7900 | 5 |
| lau_li | 11 | 5 |
| lau_lt | 60 | 5 |
| lau_lu | 100 | 5 |
| lau_lv | 43 | 5 |
| lau_mk | 80 | 5 |
| lau_mt | 68 | 5 |
| lau_nl | 342 | 5 |
| lau_no | 378 | 5 |
| lau_pl | 2477 | 5 |
| lau_pt | 3092 | 5 |
| lau_ro | 3181 | 5 |
| lau_se | 290 | 5 |
| lau_si | 211 | 5 |
| lau_sk | 2927 | 5 |
| lau_tr | 972 | 5 |
Eurostat and national statistical institutes.
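As a rough illustration of how a LAU-to-NUTS crosswalk is used, the sketch below looks up NUTS 3 codes in a small hand-made table. The data frame `toy_lau` is a stand-in and not one of the bundled datasets. Its column names `lau_name` and `nuts_3_id` are borrowed from the output of `match_city()`, so the actual column names in the `lau_*` objects are an assumption here and should be checked with `names()`.

```r
# Toy crosswalk standing in for a lau_* dataset (assumed column names).
toy_lau <- data.frame(
  lau_name  = c("Berlin", "Hamburg", "Bremen"),
  nuts_3_id = c("DE300", "DE600", "DE501"),
  stringsAsFactors = FALSE
)

cities <- c("Hamburg", "Berlin", "Atlantis")

# match() returns the row position of each city, or NA when it is absent.
idx <- match(cities, toy_lau$lau_name)

result <- data.frame(
  city      = cities,
  nuts_3_id = toy_lau$nuts_3_id[idx],
  stringsAsFactors = FALSE
)
print(result)
```

Names that are not in the crosswalk come back as `NA`, which is the case a fuzzy fallback is meant to reduce.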
Matches a vector of city names to NUTS 3 regions using a cascading logic for any supported country.
    match_city(x, country = "DE", fuzzy = TRUE, threshold = 0.95)
| Argument | Description |
|---|---|
| x | Character vector of city names. |
| country | Character string, two-letter country code (e.g. "DE", "FR"). |
| fuzzy | Logical, whether to perform fuzzy matching. |
| threshold | Numeric, similarity threshold for fuzzy matching (0-1). |
A data frame with columns: original, city_clean, nuts_3_id, lau_name, match_type, similarity.
    # Match German cities
    cities <- c("Berlin", "Munich", "Hamburg")
    match_city(cities, country = "DE")

    # Match with exact matching only (no fuzzy)
    match_city(c("Frankfurt am Main"), country = "DE", fuzzy = FALSE)
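The cascade named in the package description (Exact NUTS -> Exact LAU -> Fuzzy) can be sketched in base R as follows. This is not the package's implementation. The function `cascade_match()`, the toy tables, the `match_type` labels, and the similarity measure (1 minus Levenshtein distance divided by the longer string length, via `utils::adist()`) are all assumptions chosen for illustration. The similarity measure that `match_city()` uses is not stated in this manual, so thresholds are not comparable between the two.

```r
# Toy reference tables (hypothetical; names already lower-cased).
toy_nuts <- data.frame(nuts_3_id = c("DE300", "DE600"),
                       nuts_name = c("berlin", "hamburg"),
                       stringsAsFactors = FALSE)
toy_lau  <- data.frame(lau_name  = c("potsdam", "bremen"),
                       nuts_3_id = c("DE404", "DE501"),
                       stringsAsFactors = FALSE)

cascade_match <- function(x, nuts, lau, fuzzy = TRUE, threshold = 0.8) {
  out <- data.frame(original = x, nuts_3_id = NA_character_,
                    match_type = NA_character_, similarity = NA_real_,
                    stringsAsFactors = FALSE)
  key <- tolower(trimws(x))

  for (i in seq_along(key)) {
    # Step 1: exact match on the NUTS 3 region name.
    hit <- match(key[i], nuts$nuts_name)
    if (!is.na(hit)) {
      out$nuts_3_id[i]  <- nuts$nuts_3_id[hit]
      out$match_type[i] <- "exact_nuts"
      out$similarity[i] <- 1
      next
    }
    # Step 2: exact match on the LAU name.
    hit <- match(key[i], lau$lau_name)
    if (!is.na(hit)) {
      out$nuts_3_id[i]  <- lau$nuts_3_id[hit]
      out$match_type[i] <- "exact_lau"
      out$similarity[i] <- 1
      next
    }
    # Step 3: fuzzy fallback, accepted only at or above the threshold.
    if (fuzzy) {
      d    <- as.vector(utils::adist(key[i], lau$lau_name))
      sims <- 1 - d / pmax(nchar(key[i]), nchar(lau$lau_name))
      best <- which.max(sims)
      if (sims[best] >= threshold) {
        out$nuts_3_id[i]  <- lau$nuts_3_id[best]
        out$match_type[i] <- "fuzzy"
        out$similarity[i] <- sims[best]
      }
    }
  }
  out
}

res <- cascade_match(c("Berlin", "Potsdam", "Bremmen", "Atlantis"),
                     toy_nuts, toy_lau)
print(res)
```

Each step runs only for names the previous step left unmatched, so exact hits are never overridden by a fuzzy candidate, and setting `fuzzy = FALSE` simply skips the last step.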
Normalizes city names for EEA countries using comprehensive rules tailored to each language/region.
    normalize_city(x, country = "DE")
| Argument | Description |
|---|---|
| x | Character vector of city names. |
| country | Character string, ISO two-letter country code (e.g. "DE", "FR", "PL"). |
Character vector of normalized names.
    # Normalize German city names
    normalize_city(c("M\u00FCnchen", "K\u00F6ln", "Frankfurt a.M."), country = "DE")

    # Normalize Swiss city names
    normalize_city(c("Z\u00FCrich", "Gen\u00E8ve", "Basel-Stadt"), country = "CH")
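For a sense of what rule-based normalization involves, here is a minimal base-R sketch for German names. The rules actually applied by `normalize_city()` are not listed in this manual, so the ones below are assumptions: transliterating umlauts to "ae"/"oe"/"ue", lower-casing, and replacing punctuation with spaces. The function `normalize_sketch()` is hypothetical, and its output may differ from the package's.

```r
# Hypothetical normalizer, NOT the package's rule set.
normalize_sketch <- function(x) {
  from <- c("\u00e4", "\u00f6", "\u00fc", "\u00c4", "\u00d6", "\u00dc", "\u00df")
  to   <- c("ae",     "oe",     "ue",     "ae",     "oe",     "ue",     "ss")

  # Transliterate first so later steps only see ASCII characters.
  for (k in seq_along(from)) {
    x <- gsub(from[k], to[k], x, fixed = TRUE)
  }
  x <- tolower(x)
  x <- gsub("[[:punct:]]+", " ", x)   # punctuation becomes a space
  x <- gsub("[[:space:]]+", " ", x)   # collapse runs of whitespace
  trimws(x)
}

cleaned <- normalize_sketch(c("M\u00FCnchen", " K\u00F6ln ", "Frankfurt a.M."))
print(cleaned)
# "muenchen" "koeln" "frankfurt a m"
```

Applying the same function to both the input names and the reference names is what makes the later exact-match steps effective.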
Metadata for NUTS 3 regions for various countries, used for hierarchical matching.
    nuts_at nuts_be nuts_bg nuts_ch nuts_cy nuts_cz nuts_de nuts_dk nuts_ee nuts_el nuts_es nuts_fi nuts_fr nuts_hr nuts_hu nuts_ie nuts_it nuts_li nuts_lt nuts_lu nuts_lv nuts_mk nuts_mt nuts_nl nuts_no nuts_pl nuts_pt nuts_ro nuts_se nuts_si nuts_sk nuts_tr
Data frames with columns:
NUTS 3 region code
Name of the NUTS 3 region
Each dataset is an object of class data.frame with 4 columns. Row counts, in the order the datasets are listed above:

| Dataset | Rows | Columns |
|---|---|---|
| nuts_at | 35 | 4 |
| nuts_be | 43 | 4 |
| nuts_bg | 28 | 4 |
| nuts_ch | 26 | 4 |
| nuts_cy | 1 | 4 |
| nuts_cz | 14 | 4 |
| nuts_de | 401 | 4 |
| nuts_dk | 11 | 4 |
| nuts_ee | 5 | 4 |
| nuts_el | 53 | 4 |
| nuts_es | 59 | 4 |
| nuts_fi | 19 | 4 |
| nuts_fr | 96 | 4 |
| nuts_hr | 21 | 4 |
| nuts_hu | 20 | 4 |
| nuts_ie | 8 | 4 |
| nuts_it | 107 | 4 |
| nuts_li | 1 | 4 |
| nuts_lt | 10 | 4 |
| nuts_lu | 1 | 4 |
| nuts_lv | 5 | 4 |
| nuts_mk | 8 | 4 |
| nuts_mt | 2 | 4 |
| nuts_nl | 40 | 4 |
| nuts_no | 17 | 4 |
| nuts_pl | 73 | 4 |
| nuts_pt | 26 | 4 |
| nuts_ro | 42 | 4 |
| nuts_se | 21 | 4 |
| nuts_si | 12 | 4 |
| nuts_sk | 8 | 4 |
| nuts_tr | 81 | 4 |
Eurostat
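NUTS codes are nested: a NUTS 3 code is five characters long, and its first two, three, and four characters give the country, NUTS 1, and NUTS 2 codes respectively. The sketch below derives the parent levels from a vector of NUTS 3 codes with `substr()`. The output column names other than `nuts_3_id` are chosen here for illustration and are not taken from the `nuts_*` datasets.

```r
# Example NUTS 3 codes (Berlin, Hamburg, Paris).
nuts3 <- c("DE300", "DE600", "FR101")

# Truncating a NUTS 3 code yields its parent regions.
parents <- data.frame(
  nuts_3_id = nuts3,
  country   = substr(nuts3, 1, 2),
  nuts_1_id = substr(nuts3, 1, 3),
  nuts_2_id = substr(nuts3, 1, 4),
  stringsAsFactors = FALSE
)
print(parents)
```

This makes it straightforward to aggregate matched cities to a coarser regional level once a NUTS 3 code has been assigned.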