Package: taxify 0.2.12

Gilles Colling

taxify: Offline Taxonomic Name Matching Against Local Darwin Core Snapshots

Match taxonomic names against locally stored Darwin Core backbone databases ('WFO', 'COL', 'GBIF', 'ITIS', 'NCBI Taxonomy', 'Open Tree of Life', 'WoRMS', 'Species Fungorum', 'AlgaeBase'). Provides offline fuzzy and exact matching with synonym resolution, hybrid name detection, and a unified output schema across all sources. All heavy computation runs in the 'vectra' C11 columnar engine.

Authors: Gilles Colling [aut, cre, cph]

taxify_0.2.12.tar.gz
taxify_0.2.12.tar.gz (r-4.7-any) | taxify_0.2.12.tar.gz (r-4.6-any)
taxify_0.2.12.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
taxify/json (API)

# Install 'taxify' in R:
install.packages('taxify', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/gcol33/taxify/issues

4.42 score | 1 packages | 22 scripts | 53 exports | 11 dependencies

Last updated from: 327dd964ce. Checks: 4 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 302
source / vignettes | OK | 214
linux-release-x86_64 | OK | 189
wasm-release | OK | 170

Exports: add_algae_traits, add_alien_first_records, add_amphibio, add_anage, add_animaltraits, add_arthropod_traits, add_avonet, add_baseflor, add_col_info, add_common_names, add_conservation_status, add_data, add_diaz_traits, add_ecoflora, add_eive, add_elton_traits, add_fish_traits, add_fishbase, add_floraweb, add_fungal_traits, add_funguild, add_gbif_info, add_glonaf, add_hybrid_info, add_invasive_status, add_leda, add_leptraits, add_lizard_traits, add_pantheria, add_pignatti, add_qualifier_info, add_wcvp, add_wfo_info, add_woodiness, cite, embed_accepted, export_data, list_enrichments, lookup_genus, normalize_epithets, precompute_keys, score_candidates, taxify, taxify_clear_cache, taxify_data_dir, taxify_download, taxify_download_enrichment, taxify_download_vtr, taxify_example_data, taxify_load_register, taxify_long, taxify_refresh_manifest, taxify_register_coverage

Dependencies: cli, curl, glue, jsonlite, libgeos, lifecycle, rlang, tidyselect, vctrs, vectra, withr

Choosing and combining backends
Backend overview | Downloading backbones | Single-backend matching | Backend-specific output differences | Multi-backend fallback chains | Worked example: plants-only with WFO vs WFO + COL | Worked example: mixed-kingdom list with COL + GBIF + WoRMS | Worked example: fungi with Species Fungorum + COL fallback | Worked example: algae | Worked example: molecular ecology with NCBI | The backend column | Backend-specific extras | WFO extras: add_wfo_info() | COL extras: add_col_info() | GBIF extras: add_gbif_info() | Combining extras in a multi-backend result | The genus register | Looking up a genus | Checking backend coverage | Out-of-scope detection in practice | Practical guidance: choosing backends | Performance considerations | Reproducibility

Last update: 2026-06-30
Started: 2026-06-30

Enriching results with trait and status data
How enrichments work | The join in detail | Cross-backbone name resolution | Automatic download and caching | Fallback chain | The enrichment data directory | Discovering enrichments | Pre-downloading enrichments | Simple enrichments | Plant enrichments | Woodiness (Zanne et al. 2014) | EIVE ecological indicator values (Dengler et al. 2023) | Diaz traits (Diaz et al. 2022) | LEDA Traitbase (Kleyer et al. 2008) | Regional plant-trait compilations (Baseflor, Ecoflora, FloraWeb) | Conservation status (IUCN Red List) | Bird enrichments | AVONET (Tobias et al. 2022) | EltonTraits (Wilman et al. 2014) | Mammal enrichments | PanTHERIA (Jones et al. 2009) | Amphibian enrichments | AmphiBIO (Oliveira et al. 2017) | Fungal enrichments | FungalTraits (Polme et al. 2020) | FUNGuild (Nguyen et al. 2016) | Algae enrichments | AlgaeTraits (Vranken et al. 2023) | Fish enrichments | FISHMORPH (Brosse et al. 2021) | FishBase (Froese & Pauly 2024) | Reptile enrichments | Meiri lizard traits (Meiri 2018) | Vertebrate enrichments (cross-class) | AnAge longevity and life-history (Tacutu et al. 2018) | AnimalTraits body mass and metabolic rate (Hebert et al. 2022) | Butterfly enrichments | LepTraits butterfly traits (Shirey et al. 2022) | Arthropod enrichments | NW European Arthropod life-history traits (Logghe et al. 2025) | Group-based enrichments | Invasive species status (GRIIS) | Single country | Multiple countries | All countries | Alien species first records (Seebens et al.) | Reshaping to long format | Native range by botanical region (WCVP) | Naturalized alien flora by region (GloNAF) | Common (vernacular) names (GBIF) | Stacking enrichments | Coverage patterns | Approximate coverage rates by enrichment | Interpreting NA columns | The enrichment register in summary() output | Practical guidance: which enrichments for which taxa | Vascular plants (European) | Vascular plants (global) | Birds | Mammals | Amphibians | Fish | Reptiles (lizards) | Butterflies | Arthropods (NW European) | Fungi | Macroalgae (European) | Mixed-taxon datasets | Joining custom data | Data provenance and citation | Summary

Last update: 2026-06-30
Started: 2026-06-30

Fuzzy matching: methods, thresholds, and tuning
What fuzzy matching does (and does not do) | The three distance methods | Damerau-Levenshtein (default, fuzzy_method = "dl") | Levenshtein (fuzzy_method = "levenshtein") | Jaro-Winkler (fuzzy_method = "jw") | How thresholds work | Fractional mode (0 < threshold < 1) | Integer mode (threshold >= 1) | What happens before fuzzy matching | Worked example 1: clean names that need no fuzzy matching | Worked example 2: OCR-degraded and hand-typed names | Worked example 3: threshold too loose | Worked example 4: comparing all three methods | The fuzzy_dist column | Genus-blocked matching and misspelled genera | Practical guidance | When to disable fuzzy matching | When to tighten the threshold | When to loosen the threshold | When to switch methods | Using integer thresholds for uniform error budgets | A two-pass workflow for messy data | Summary of output columns related to fuzzy matching

Last update: 2026-06-30
Started: 2026-06-30

Getting started with taxify
Why taxify | Installing a backbone | Basic matching | Understanding the output | Name cleaning | Synonym resolution | Match types | The summary method | Multi-backend fallback | Enrichments | Conservation status | Common names | Woodiness | Custom data | From a data.frame | From a CSV file | Supported file formats | Hybrid names | Genus register | Cache management | The full pipeline

Last update: 2026-06-30
Started: 2026-06-30

Hybrid name detection and parsing
Hybrid names in taxonomy | How taxify detects hybrids | Worked example: matching a mixed species list | Extracting hybrid details with add_hybrid_info() | Worked example: parsing hybrid formulas | What matches and what does not | The multiplication sign and its substitutes | Practical notes

Last update: 2026-06-30
Started: 2026-06-30

Joining custom data with add_data()
The problem | Joining a data.frame | Joining from a CSV file | Joining from an Excel file | SQLite databases | vectra native format | Joining from a TSV file | Other file formats | Species column auto-detection | Selecting columns with cols | Column name collisions | Duplicate species handling | How the join works | Controlling fuzzy matching | Combining add_data() with enrichments

Last update: 2026-06-30
Started: 2026-06-30

Migrating from taxize, WorldFlora, and related tools
The taxonomic-resolution landscape in R | Function mapping: taxize to taxify | Function mapping: WorldFlora to taxify | Function mapping: lcvplants to taxify | Function mapping: rWCVP to taxify | Function mapping: taxadb to taxify | Function mapping: Taxonstand to taxify | Example 1: Basic name resolution | Example 2: WFO matching with fuzzy + synonyms | Example 3: Multi-backend fallback with enrichments | Key differences at a glance | What taxify does not do | When the other packages are the better choice | Discovering available enrichments | Summary

Last update: 2026-06-30
Started: 2026-06-30

Working with large species lists
How taxify scales | The .vtr columnar format | Exact matching: hash-indexed lookups | Fuzzy matching: genus-blocked string distance | Backbone loading and the session cache | Worked example: exact vs. fuzzy matching | Worked example: multi-backend fallback ordering | Backbone sizes on disk | Memory footprint | Worked example: batch processing a very large list | Cache management | Disk storage and sharing across projects | Worked example: pre-downloading resources | Practical scaling guidance | The fuzzy_threshold parameter | Summary of performance-relevant functions

Last update: 2026-06-30
Started: 2026-06-30

Readme and manuals

Help Manual

Help page | Topics
Add macroalgal functional traits (AlgaeTraits) | add_algae_traits
Add alien species first record years | add_alien_first_records
Add amphibian life-history traits (AmphiBIO) | add_amphibio
Add longevity and life-history traits (AnAge) | add_anage
Add cross-taxon body mass and metabolic rate (AnimalTraits) | add_animaltraits
Add arthropod life-history traits (NW European Arthropods) | add_arthropod_traits
Add bird morphology and migration (AVONET) | add_avonet
Add plant traits from Baseflor (Catminat / Julve) | add_baseflor
Add COL-specific columns | add_col_info
Add common (vernacular) names | add_common_names
Add conservation status | add_conservation_status
Add custom data by taxonomic matching | add_data
Add seed mass and plant height (Diaz et al. 2022) | add_diaz_traits
Add British plant traits from Ecoflora | add_ecoflora
Add EIVE ecological indicator values | add_eive
Add diet, foraging, and body mass (EltonTraits 1.0) | add_elton_traits
Add freshwater fish morphological traits (FISHMORPH) | add_fish_traits
Add fish traits (FishBase) | add_fishbase
Add German plant traits from FloraWeb | add_floraweb
Add fungal lifestyle and trait data (FungalTraits) | add_fungal_traits
Add fungal functional guild data (FUNGuild) | add_funguild
Add GBIF-specific columns | add_gbif_info
Add naturalized alien flora status (GloNAF) | add_glonaf
Add hybrid parent and type information | add_hybrid_info
Add invasive species status | add_invasive_status
Add plant traits from LEDA Traitbase | add_leda
Add butterfly traits (LepTraits) | add_leptraits
Add lizard life-history and ecological traits (Meiri 2018) | add_lizard_traits
Add mammal life-history traits (PanTHERIA) | add_pantheria
Add Italian plant traits from Pignatti (on demand, via TR8) | add_pignatti
Add qualifier information | add_qualifier_info
Add WCVP native range status | add_wcvp
Add WFO-specific columns | add_wfo_info
Add woodiness classification | add_woodiness
Cite data sources used in a taxify result | cite
Export a taxify result to file | export_data
List available enrichments | list_enrichments
Look up a genus in the register | lookup_genus
Print a taxify_result | print.taxify_result
Summarise a taxify_result | summary.taxify_result
Match taxonomic names against local backbone databases | taxify
Clear all cached backbones | taxify_clear_cache
Get the taxify data directory | taxify_data_dir
Download a backbone database | taxify_download
Download one or more enrichment .vtr files | taxify_download_enrichment
Download a taxify backbone | taxify_download_vtr
Path to the bundled example database | taxify_example_data
Load the unified genus register into memory | taxify_load_register
Reshape grouped enrichment columns to long format | taxify_long
Invalidate the session manifest cache | taxify_refresh_manifest
Show backend coverage for a genus | taxify_register_coverage