taxify_data_dir() can now be redirected with the taxify.data_dir option
or the TAXIFY_DATA_DIR environment variable, so the cache location is
configurable (shared caches, scratch directories, the bundled example data).taxify_example_data() returns the path to a small bundled example database
(a handful of species per backbone plus matching enrichment tables). Setting
options(taxify.data_dir = taxify_example_data()) lets matching and
enrichment run fully offline.\dontrun{}. Only add_pignatti() (fetched live via TR8) and
list_enrichments() (reads the online manifest) remain in \donttest{}.add_floraweb() joins German-flora plant traits from FloraWeb (the live BfN
portal carrying the BiolFlor data of Klotz, Kuehn & Durka 2002, plus
Rothmaler morphology and Ellenberg indicator values). It bundles the full
per-species trait profile -- morphology, reproductive biology, the nine
Ellenberg indicator values, ploidy and chromosome number, and chorological
distribution (59 _de columns) -- as a pre-built dataset, so it works
offline.add_ecoflora() now joins a bundled, pre-built Ecoflora dataset (18 _uk
columns: canopy height, leaf traits, life form, flowering phenology,
pollination, seed weight, and British-calibrated Ellenberg values) instead
of fetching live through TR8. It works offline and returns the full trait
set rather than the previous five columns.
add_pignatti() remains an on-demand TR8 source: its values are from a
copyrighted publication and cannot be redistributed.
add_ecoflora(), add_biolflor(), and add_pignatti() join plant traits
that taxify does not ship as a pre-built dataset, accessing them on demand
through the suggested TR8 package on your own machine; taxify redistributes
nothing. The reasons differ by source: add_ecoflora() adds British
flowering months, pollen vector, life form, and leaf longevity (CC BY-NC-SA,
which would allow redistribution, but ecoflora.org.uk has no bulk download,
so it is fetched live per species); add_biolflor() adds Grime CSR strategy
type, breeding system, pollen vector, life form, life span, and apomixis
(usable with acknowledgement + citation per the BioFresh metadata statement,
but no bulk copy is obtainable while the UFZ site is offline, so fetched
live); add_pignatti() adds Italian Ellenberg-type indicator values, life
form, and chorotype (copyrighted; read from the copy bundled in TR8, which
TR8 redistributes, not taxify; works offline). Columns are region-suffixed
(_uk/_de/_it) so they never collide with add_baseflor(). TR8 is a
Suggests dependency. If a live source (Ecoflora, BiolFlor) is unreachable the
call errors rather than attaching silent NA.add_baseflor() joins plant traits from Baseflor (Programme Catminat,
Julve 1998 ff.; ODbL 1.0 / CC BY-SA 2.0) to a taxify() result. It covers
~7,000 vascular plant taxa of France and neighbouring regions and adds
flowering phenology (flower_begin_month, flower_end_month), pollination
vector, dispersal mode, breeding system, flower colour, fruit type, woody
growth form, and the continentality and salinity indicator-value axes absent
from EIVE. The enrichment is registered in the manifest (list_enrichments())
with a pre-built .vtr; light/temperature/moisture/reaction/nutrient axes
are left to add_eive() and Raunkiaer life form to add_leda().tests/e2e/test-e2e-enrichment.R) for
the enrichment join fixed in 0.2.5 (#1). It checks that
add_conservation_status(), add_common_names(), and add_woodiness()
attach each value to the row's own accepted taxon, stay invariant to batch
composition and order, and land documented values on the correct species."Q. robur" now resolve. A matching pass
restricts the backbone to rows whose genus starts with the given initial and
whose specific epithet matches, resolving only when that is unique. When two
or more genera sharing the initial also share the epithet the abbreviation is
ambiguous: the row is left unmatched with is_ambiguous = TRUE and the
conflicting accepted IDs in ambiguous_targets, rather than guessing a genus.
A genus spelled out in full elsewhere in the same input takes precedence
(the convention of abbreviating after first mention). Resolved rows carry
match_type = "abbrev".accepted_authorship output column: the authorship of the resolved
accepted name. For a synonym match, authorship holds the synonym's own
author while accepted_authorship holds the accepted name's author, so
accepted_name and accepted_authorship together form the accepted taxon's
full citation. Backbones that carry authorship populate it; sources without
authorship (NCBI, OTT) return NA.taxify() no longer errors with "replacement has length zero" for backbones
whose .meta sidecar records the build date as build_date (the current
taxifydb build format) rather than download_date. Backbone metadata now
reads both layouts and version formatting tolerates a missing date. This
previously broke matching against the WoRMS and Open Tree of Life backbones.Additional_repositories
(https://gcol33.r-universe.dev), so its location is discoverable as required
for a Suggests dependency outside the mainstream repositories.taxify() no longer errors with "incorrect number of dimensions" when the
genus register is present but the backend-coverage file is not (the state on
a clean install before any coverage download, and during package checks). An
early return() evaluated inside a tryCatch() expression returned NULL
from the pre-filter, which $<- then turned into a list; the out-of-scope
pre-filter now resolves missing coverage to a no-op and preserves the result
data frame.
Replaced non-ASCII characters in roxygen documentation with ASCII equivalents so the PDF reference manual builds under LaTeX.
Ambiguous homonym synonyms now resolve to the epithet-preserving accepted
name (the homotypic basionym) instead of an arbitrary lowest-id candidate.
taxify("Pinus abies") resolves to Picea abies (not Picea polita), and
the spurious is_ambiguous flag is cleared when one candidate keeps the
specific epithet (#2). Genuinely ambiguous names (no candidate, or several,
preserving the epithet) are still flagged.
Silenced tidyselect deprecation warnings emitted during fuzzy matching.
score_candidates() is exported (kept internal in the reference index) so the
companion taxifydb build pipeline can collapse each backbone key to the
single accepted name taxify() resolves it to. This corrects enrichment
joins that previously landed trait/status values on within-genus neighbours
(#1); the fix reaches users through rebuilt enrichment data.