NEWS
tidyfinance 0.8.0 (2026-07-02)
New features
- Added
download_data_pastor_stambaugh() and the "Pastor-Stambaugh" domain
for download_data(), which downloads the liquidity factors of Pastor and
Stambaugh (2003) from
Lubos Pastor's data library.
The result carries the levels of aggregate liquidity, the non-traded
liquidity factor (innovations), and the traded liquidity factor LIQ_V.
- Added
download_data_stambaugh_yuan() and the "Stambaugh-Yuan" domain for
download_data(), which downloads the mispricing factors (mgmt and perf)
of Stambaugh and Yuan (2017) from
Robert Stambaugh's data library.
The dataset argument selects between "monthly" and "daily" data. The
source files currently end in December 2016.
- Added
download_data_jkp() and the "Global Factor Data" domain
for download_data(), which downloads data from
Global Factor Data (Jensen, Kelly, and
Pedersen, 2023). The dataset argument selects between factor returns
("factors"), the underlying long-short portfolios ("portfolios"),
industry returns ("industry"), and the reference files "nyse_cutoffs"
and "return_cutoffs". The requested selection is validated against the
library's live availability manifest, and the helper
list_supported_jkp_factors() lists the available regions and selectors.
Improvements
- The
sorting_variable column of the factor_library_grid dataset no longer
carries a "sv_" prefix, so its values now match the sorting_variable
argument of download_data("Tidy Finance", "factor_library", ...) (e.g.
"bm" rather than "sv_bm"). download_data("Tidy Finance", "factor_library_grid") returns the bare values accordingly
(#284).
- Added a
tidyfinance vignette that walks through the complete
factor-construction workflow: download, signal construction, fiscal-year
lagging, portfolio sorting, and a Fama-MacBeth test. It builds
entirely on download_data(domain = "Pseudo Data"), so it compiles without a
WRDS subscription or network access. knitr and rmarkdown are added back to
Suggests, and VignetteBuilder: knitr is restored to DESCRIPTION.
- download_data("Open Source Asset Pricing") now aligns the date column to
the beginning of the month (the dataset previously returned end-of-month
dates), matching the convention used by the other download functions. The
predictor columns, which are monthly long-short returns expressed in percent
at the source, are now divided by 100 and returned as plain decimal returns.
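As a rough illustration of the Open Source Asset Pricing change above, the following base R sketch applies both adjustments to made-up values. The data frame `raw` and its `predictor` column are hypothetical, and this is not the package's internal code.

```r
# Hypothetical raw values: end-of-month dates, returns in percent.
raw <- data.frame(
  date = as.Date(c("2020-01-31", "2020-02-29")),
  predictor = c(1.5, -0.4)
)

aligned <- transform(
  raw,
  # Move each date to the first day of its month.
  date = as.Date(format(date, "%Y-%m-01")),
  # Convert percent to decimal returns.
  predictor = predictor / 100
)

print(aligned)
```

Code that previously divided these columns by 100 itself, or joined on end-of-month dates, would need to drop that step after upgrading.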
Bug fixes
- download_data_huggingface("factor_library", ...) now treats an explicit
n_portfolios_secondary = NULL as "remove the filter and return all values"
(univariate and bivariate sorts alike), consistent with the documented
behavior for every other column. Previously an explicit NULL was coerced to
NA, restricting the result to univariate sorts.
tidyfinance 0.7.0 (2026-06-25)
Improvements
- estimate_betas() now uses a fast, vectorized closed-form approach based on
rolling cumulants of the moment matrices instead of fitting one regression
per stock and window. This removes the need for per-stock nesting and the
optional furrr parallelization, so the use_furrr argument and the furrr
dependency have been dropped. Estimates are numerically identical to the
previous regression-based implementation. Windows with fewer than min_obs
observations are now dropped from the output rather than returned with NA
coefficients.
- The package website moved from
package.tidy-finance.org to
r.tidy-finance.org.
- download_data() now uses the human-readable domain names returned by
list_supported_datasets() (e.g., "Fama-French", "Global Q",
"WRDS", "Tidy Finance"). The "pseudo" and "tidyfinance" domains
were renamed to "Pseudo Data" and "Tidy Finance". The previous
machine-readable domain names (e.g., "famafrench", "wrds",
"pseudo", "tidyfinance") are soft-deprecated but still accepted.
- download_data_wrds_crsp() now errors informatively when version = "v1"
is used with an end_date later than December 2024, reflecting the
discontinuation of the CRSP legacy version at the end of 2024.
- Removed the "experimental" lifecycle badge from
assign_portfolio(),
compute_breakpoints(), compute_rolling_value(), estimate_model(),
and join_lagged_values(), which are now considered stable.
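To give a sense of why a closed-form rolling estimator can match per-window regressions, as described for estimate_betas() above, here is a minimal base R sketch for a single regressor. It uses cumulative sums of the moments x, y, x*x, and x*y; the function name `rolling_beta` and the simulated data are assumptions for illustration, not the package's implementation.

```r
# Rolling univariate OLS from cumulative sums of moments (illustrative only).
rolling_beta <- function(y, x, window) {
  n <- length(y)
  # Prepend 0 so that differences of cumulative sums give window sums.
  cs <- function(v) c(0, cumsum(v))
  s_x <- cs(x)
  s_y <- cs(y)
  s_xx <- cs(x * x)
  s_xy <- cs(x * y)

  end <- window:n        # last observation of each window
  start <- end - window  # observation just before each window
  win_sum <- function(s) s[end + 1] - s[start + 1]

  sx <- win_sum(s_x)
  sy <- win_sum(s_y)
  sxx <- win_sum(s_xx)
  sxy <- win_sum(s_xy)

  # Closed-form OLS slope and intercept for each window.
  beta <- (window * sxy - sx * sy) / (window * sxx - sx^2)
  alpha <- (sy - beta * sx) / window
  data.frame(end = end, alpha = alpha, beta = beta)
}

set.seed(1)
x <- rnorm(60)
y <- 0.5 + 1.2 * x + rnorm(60, sd = 0.1)
fast <- rolling_beta(y, x, window = 24)

# Reference: one regression per window.
slow <- vapply(24:60, function(e) {
  idx <- (e - 23):e
  unname(coef(lm(y[idx] ~ x[idx]))[2])
}, numeric(1))

print(all.equal(fast$beta, slow))
```

The sketch only emits windows with exactly `window` observations, which loosely mirrors the note that windows with fewer than min_obs observations are dropped rather than returned with NA coefficients.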
tidyfinance 0.6.0 (2026-05-31)
New features
- Added
domain = "pseudo" to download_data() for generating pseudo
data with the same schema as the corresponding real domain. Supported
datasets in this release: "crsp_monthly", "crsp_daily",
"compustat_annual", "compustat_quarterly", and "ccm_links" (all
mirroring domain = "wrds"). Internally, every domain = "pseudo"
call funnels through simulate_pseudo_data(), the unexported router
that dispatches to per-dataset generators. Per-dataset entry points
(download_data_pseudo_crsp(), download_data_pseudo_compustat(),
download_data_pseudo_ccm_links()) remain exported for direct use.
All generators accept n_assets and seed arguments; identical
(seed, n_assets) yields the same identifier universe across datasets,
so pseudo CRSP and Compustat join cleanly via add_ccm_links = TRUE
or ccm_links. Daily CRSP is generated on weekdays only.
- Added
download_factor_library_grid() to fetch the
tidy-finance/factor-library-grid dataset from Hugging Face. Also
accessible via download_data("tidyfinance", "factor_library_grid").
Improvements
- Added
test-coverage.yaml workflow and badge to README.
- Added tests to get coverage to 100% (excl.
set_wrds_credentials()).
- Fama-French factor data is now downloaded and parsed internally via
httr2,
so frenchdata is no longer declared in Imports. The behavior of
download_data_factors_ff() is unchanged.
- download_data("tidyfinance", "factor_library", ...) now honors the
canonical start_date and end_date arguments, filtering the returned
portfolio returns to the requested range. When both are omitted, the full
history is returned and the standard "Returning the full data set" message
is emitted (via validate_dates()). Previously these arguments were
accepted but silently ignored for the factor library.
- Removed the
using-tidyfinance and dates-in-tidyfinance vignettes.
Both predated the current download_data() interface and are
superseded by the package manuscript. knitr and rmarkdown are no
longer declared in Suggests, and VignetteBuilder has been dropped
from DESCRIPTION.
- download_data("tidyfinance", "factor_library", ids = <vector>) now
delegates directly to download_factor_library_ids(), bypassing the
grid filter. Passing ids together with filter arguments raises an
informative error.
- Renamed
list_supported_types() to list_supported_datasets()
(#242). The
old name remains exported as a soft-deprecated alias that forwards to the
new function. Internal helpers were renamed accordingly
(e.g. list_supported_types_ff() -> list_supported_datasets_ff()).
- download_data_constituents() now drops symbols equal to "-".
- Renamed
only_us parameter in download_data_wrds_compustat() to only_usd
to reflect that the filter keeps USD-denominated shares only. The old name
is deprecated and forwards to only_usd with a warning.
- Removed
arrow, glue, and stringr dependencies and added nanoparquet.
tidyfinance 0.5.0 (2026-05-12)
New features
- Added
implement_portfolio_sort() as a convenience wrapper that combines sample construction filtering and portfolio return computation into a single call.
- Added
download_data_risk_free() to download and process risk-free rate
data from FRED, splicing TB3MS (pre-2001) with DTB4WK (from 2001
onwards) for monthly data, and using DTB3 for daily data. Also
accessible via download_data("tidyfinance", "risk_free").
- Updated
download_data_wrds_crsp() to use download_data_risk_free()
(FRED-based) instead of the Kenneth French risk-free rate when
computing excess returns.
- Added
only_us parameter to download_data_wrds_compustat().
- Added new parameters for common CRSP transformation tasks (
add_ccm_links, adjust_volume) to download_data_wrds_crsp().
- Added
prc_adj to "crsp_monthly" version "v1".
- Added
adjust_volume parameter for "crsp_daily" version "v1" and "v2"
to download_data_wrds_crsp().
- Added
compute_rolling_value().
- Added
output parameter to estimate_model() to also return t-stats or residuals.
- Added
join_lagged_values().
- Added more indexes to
list_supported_indexes().
- Added
download_data_huggingface() and get_available_huggingface_files(), as well as support for type = "hf_high_frequency_sp500".
- Deprecated
type parameter in favor of domain and dataset.
- Added
detail parameter to estimate_fama_macbeth() to include average n_obs, r_squared, and adj_r_squared.
- Removed lower bound of excess returns in
download_data_wrds_crsp().
- Removed
add_lag_columns() in favor of add_lagged_columns().
- Added domain
"tidyfinance" with datasets "high_frequency_sp500", "factor_library", and "risk_free".
Improvements
- Removed
renv due to lack of benefits.
- Moved optional dependencies to imports for improved user experience (except for
furrr).
Bug fixes
- Removed erroneous time zone adjustment in
download_data_wrds_trace_enhanced() #133.
- Replaced tabs in
list_supported_types_ff() with underscores #134.
- compute_portfolio_returns() and implement_portfolio_sort() now apply min_portfolio_size to the reported portfolio cross-section. For bivariate sorts this is the firm count per (main_portfolio, date) summed across secondary buckets, not per (main, secondary, date) cell as before. Previously, setting min_portfolio_size to the number of cells (e.g. n_main * n_secondary) silently voided every cell. Univariate behaviour is unchanged. The default has changed from 0L to 1L, so each reported portfolio is required to have at least one observation by default; pass min_portfolio_size = 0L to deactivate the check. The param documentation has also been corrected to reflect that small portfolios receive NA (not zero).
- compute_long_short_returns() no longer errors with object 'top' not found when the input panel contains only one distinct portfolio (e.g., because assign_portfolio() collapsed to a single bucket on a constant sorting variable). The long-short return is now NA on such dates, consistent with "no investment, no return", instead of crashing.
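The min_portfolio_size change above can be illustrated with a small base R sketch that counts firms both ways for a single date. The `panel` data frame, its column names, and the threshold of 3 are hypothetical and only mimic the counting logic, not the package's code.

```r
# Hypothetical 2 x 2 bivariate sort on one date with 6 firms in total.
panel <- data.frame(
  main = c(1, 1, 1, 2, 2, 2),
  secondary = c(1, 1, 2, 1, 2, 2)
)
min_portfolio_size <- 3

# Previous check: firm count per (main, secondary) cell.
cell_counts <- table(main = panel$main, secondary = panel$secondary)

# Current check: firm count per main portfolio, summed across secondary buckets.
main_counts <- table(main = panel$main)

old_keep <- cell_counts >= min_portfolio_size
new_keep <- main_counts >= min_portfolio_size

print(old_keep)
print(new_keep)
```

In this example every cell holds at most two firms, so the per-cell check would have discarded all of them, while each reported main portfolio holds three firms and passes the current check.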
tidyfinance 0.4.5 (2026-01-08)
Bug fixes
- Updated download of FRED data due to API changes.
tidyfinance 0.4.4 (2025-05-07)
Bug fixes
- Removed user agent sampling from
download_stock_prices() because they were blocked.
tidyfinance 0.4.3 (2024-12-17)
Bug fixes
- download_constituents() and download_stock_prices() now also fail gracefully with informative messages instead of errors or warnings.
- download_factors() now returns an empty data frame with a date column so that vignettes build even if resources are unavailable.
Improvements
- Unified
start_date and end_date validation across applications.
- Updated tests of
download_*() functions to cover unavailable or broken resources.
tidyfinance 0.4.2 (2024-12-02)
New features
- Added experimental
add_lag_columns() function that is more efficient than lag_column()
Bug fixes
- download_macro_predictors(), download_factors(), and download_osap() now fail gracefully with informative messages instead of errors or warnings.
Improvements
- Updated
ccmxpf_linktable to the new WRDS default ccmxpf_lnkhist.
- Added support for "factors_q5_annual" in
download_factors_q()
- Optimized
winsorize() by reducing quantile recalculations
tidyfinance 0.4.1 (2024-09-04)
Bug fixes
- Added missing support for "wrds_trace_enhanced" and "wrds_fisd" to
download_data_wrds().
- Added intercept to
estimate_model(), estimate_betas(), and estimate_fama_macbeth().
Improvements
- Renamed
download_data_wrds_clean_trace() to download_data_wrds_trace_enhanced() for improved consistency.
- Added
vcov_options parameter to estimate_fama_macbeth().
tidyfinance 0.4.0 (2024-08-30)
New features
- Added
list_supported_indexes() and download_data_constituents() to download index constituents.
- Added
estimate_betas() to estimate risk factor betas.
- Added
estimate_fama_macbeth() to estimate Fama-MacBeth models.
- Added
download_data_osap() to download data from Open Source Asset Pricing.
- Added
download_data_fred() to download data from Federal Reserve Economic Data.
- Added
compute_portfolio_returns() to implement different portfolio sorting approaches.
- Added
compute_long_short_returns() to quickly compute long-short portfolio returns.
- Added
compute_breakpoints() to make assign_portfolio() more flexible.
- Added
breakpoint_options() and data_options() to provide more flexibility with respect to column names.
Bug fixes
- Retained explicit missing values in
mktcap_lag in monthly CRSP.
Improvements
- Migrated to
cli for error messages and warnings.
- Aligned documentation across functions.
- Switched to
NULL for optional default values.
- Removed the dependency on the named placeholder, which is only available from R 4.2 onward.
- Removed
readxl dependency from download_data_macro_predictors().
- Removed redundant
check_if_package_installed() function.
- Updated
estimate_model() to support both estimate_betas() and estimate_fama_macbeth().
- Updated
assign_portfolio() to support compute_portfolio_returns().
- Renamed
download_data_stocks() to download_data_stock_prices() for better naming.
tidyfinance 0.3.0 (2024-07-23)
New features
- Added support for all available Fama-French datasets (check via
list_supported_types()). All type names are created from a string cleaning algorithm and are hence more consistent. We kept implicit support for legacy type names to avoid breaking existing code.
- Added new function to download stock data from Yahoo Finance:
download_data_stocks().
- Added support for
wrds_compustat_quarterly.
Bug fixes
- CRSP monthly data always contains the historically accurate stock characteristics instead of the oft misleading most recent information.
- Consistently implemented the
additional_columns option for CRSP and Compustat instead of having the error-prone option to pass columns via ....
- Added replacement of
-999 by NA in Fama-French types, which was missing in the initial implementation.
Improvements
- Refactored the column name cleaning procedure in
download_data_factors() to support all available column names in the Fama-French universe.
- Made all
start_date and end_date arguments optional, with a message telling the user which dates are used as defaults.
- Introduced automatic checks via GitHub Actions workflows.
- Synchronized
date column and its references across WRDS types (see corresponding vignette for more information).
- Improved handling of imports with
tidyfinance-package.R file.
- Reformatted DESCRIPTION and roxygen comments for more consistency with
tidyverse style.
tidyfinance 0.2.1 (2024-07-03)
New features
- Added
domain and as_vector parameters to list_supported_types()
Bug fixes
- Replaced
... with additional_columns parameter and ensured that CRSP and Compustat types consider it correctly
- Removed
mkt_excess column from type "wrds_crsp_monthly"
Improvements
- Added
fixed = TRUE to grepl() calls with fixed strings
- Switched to
NA_real_ instead of as.double(NA)
- Switched to
toString() instead of paste0() with collapse
- Switched to
dplyr::between() instead of inequality operators
tidyfinance 0.2.0 (2024-05-29)
New features
- Added
vignettes/using-tidyfinance
- Added
set_wrds_credentials() function for a guided tour to store login data
- Added support for
"factors_ff_industry_*" data types
Bug fixes
- Removed
hml and smb columns from "wrds_crsp_monthly" output
- Fixed stock filters for
"v2" of "wrds_crsp_*" data types
Improvements
- Relaxed package version requirements as much as possible with the current set of packages
- Split up the
download_data* functions into multiple files for better maintenance
tidyfinance 0.1.0 (2024-03-05)