Title: | Wasserstein Regression and Inference |
---|---|
Description: | Implementation of the methodologies described in 1) Alexander Petersen, Xi Liu and Afshin A. Divani (2021) <doi:10.1214/20-aos1971>, including global F tests, partial F tests, intrinsic Wasserstein-infinity bands and Wasserstein density bands, and 2) Chao Zhang, Piotr Kokoszka and Alexander Petersen (2022) <doi:10.1111/jtsa.12590>, including estimation, prediction, and inference of the Wasserstein autoregressive models. |
Authors: | Xi Liu [aut, cre], Chao Zhang [aut], Matthew Coleman [aut], Alexander Petersen [aut] |
Maintainer: | Xi Liu <[email protected]> |
License: | GPL-2 |
Version: | 0.2.0 |
Built: | 2024-12-18 06:36:30 UTC |
Source: | CRAN |
Confidence Bands for Wasserstein Regression
confidenceBands( wass_regress_res, Xpred_df, level = 0.95, delta = 0.01, type = "density", figure = TRUE, fig_num = NULL )
confidenceBands( wass_regress_res, Xpred_df, level = 0.95, delta = 0.01, type = "density", figure = TRUE, fig_num = NULL )
wass_regress_res |
an object returned by the |
Xpred_df |
k-by-p matrix (or dataframe, or named vector) used for prediction. Note that Xpred_df should have the same column names with Xfit_df used in wass_regress_res |
level |
confidence level |
delta |
boundary control value in density band computation. Must be a value in the interval (0, 1/2) (default: 0.01) |
type |
'density', 'quantile' or 'both'
|
figure |
logical; if TRUE, return a sampled plot (default: TRUE) |
fig_num |
the fig_num-th row of |
This function computes intrinsic confidence bands for Xpred_df
if type
= 'quantile' and density bands if type
= 'density', and visualizes the confidence and/or density bands when figure
= TRUE.
a list containing the following lists:
den_list: |
|
quan_list: |
|
cdf_list: |
|
alpha = 2 beta = 1 n = 50 x1 = runif(n) t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001))) set.seed(1) quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec) Xfit_df = data.frame(x1 = x1) res = wass_regress(rightside_formula = ~., Xfit_df = Xfit_df, Ytype = 'quantile', Ymat = quan_obs, Sup = t_vec) confidence_Band = confidenceBands(res, Xpred_df = data.frame(x1 = c(-0.5,0.5)), type = 'both', fig_num = 2) data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve xpred = predictor[2:3, ] res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) confidence_Band = confidenceBands(res, Xpred_df = xpred, type = 'density', fig_num = 1)
alpha = 2 beta = 1 n = 50 x1 = runif(n) t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001))) set.seed(1) quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec) Xfit_df = data.frame(x1 = x1) res = wass_regress(rightside_formula = ~., Xfit_df = Xfit_df, Ytype = 'quantile', Ymat = quan_obs, Sup = t_vec) confidence_Band = confidenceBands(res, Xpred_df = data.frame(x1 = c(-0.5,0.5)), type = 'both', fig_num = 2) data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve xpred = predictor[2:3, ] res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) confidence_Band = confidenceBands(res, Xpred_df = xpred, type = 'density', fig_num = 1)
convert density function to quantile and quantile density function
den2Q_qd(densityCurves, dSup, t_vec)
den2Q_qd(densityCurves, dSup, t_vec)
densityCurves |
n-by-m matrix of density curves |
dSup |
length m vector contains the common support grid of the density curves |
t_vec |
common grid for quantile functions |
global F test for Wasserstein regression
globalFtest( wass_regress_res, alpha = 0.05, permutation = FALSE, numPermu = 200, bootstrap = FALSE, numBoot = 200 )
globalFtest( wass_regress_res, alpha = 0.05, permutation = FALSE, numPermu = 200, bootstrap = FALSE, numBoot = 200 )
wass_regress_res |
an object returned by the |
alpha |
type one error rate |
permutation |
logical; perform permutation global F test (default: FALSE) |
numPermu |
number of permutation samples if permutation = TRUE |
bootstrap |
logical; bootstrap global F test (default: FALSE) |
numBoot |
number of bootstrap samples if bootstrap = TRUE |
four methods used to compute p value of global F test
truncated: asymptotic inference, p-value is obtained by truncating the infinite summation of eigenvalues into the first K terms, where the first K terms explain more than 99.99% of the variance.
satterthwaite: asymptotic inference, p-value is computed using Satterthwaite's approximation method of mixtures of chi-square.
permutation: resampling technique; Wasserstein SSR is used as the F statistic.
bootstrap: resampling technique; Wasserstein SSR is used as the F statistic.
a list containing the following fields:
wasserstein.F_stat |
the Wasserstein F statistic value in Satterthwaite method . |
chisq_df |
the degree of freedom of the null chi-square distribution. |
summary_df |
a dataframe containing the following columns: |
method: methods used to compute p value, see details
statistic: the test statistics
critical_value: critical value
p_value: p value of global F test
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) globalF_res = globalFtest(res, alpha = 0.05, permutation = TRUE, numPermu = 200)
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) globalF_res = globalFtest(res, alpha = 0.05, permutation = TRUE, numPermu = 200)
partial F test for Wasserstein regression
partialFtest(reduced_res, full_res, alpha = 0.05)
partialFtest(reduced_res, full_res, alpha = 0.05)
reduced_res |
a reduced model list returned by the |
full_res |
a full model list returned by the |
alpha |
type one error rate |
two methods used to compute p value using asymptotic distribution of F statistic
truncated: asymptotic inference, p-value is obtained by truncating the infinite summation of eigenvalues into the first K terms, where the first K terms explain more than 99.99% of the variance.
satterthwaite: asymptotic inference, p-value is computed using Satterthwaite approximation method of mixtures of chi-square.
a dataframe containing the following columns:
method |
methods used to compute p value, see details |
statistic |
the test statistics |
critical_value |
critical value |
p_value |
p value of global F test |
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve full_res <- wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) reduced_res <- wass_regress(~ log_b_vol + b_shapInd + midline_shift + B_TimeCT, Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) partialFtable = partialFtest(reduced_res, full_res, alpha = 0.05)
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve full_res <- wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) reduced_res <- wass_regress(~ log_b_vol + b_shapInd + midline_shift + B_TimeCT, Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) partialFtable = partialFtest(reduced_res, full_res, alpha = 0.05)
a method of the WARp class which produces a one-step ahead prediction by WAR(p) models
## S3 method for class 'WARp' predict(object, dSup, expSup, ...)
## S3 method for class 'WARp' predict(object, dSup, expSup, ...)
object |
A WARp object, the output of |
dSup |
Optional, a numeric vector, the grid over which forecasted cdf/pdf is evaluated. Should be supplied/ignored with |
expSup |
Optional, a numeric vector, the grid over the Exponential map is applied, |
... |
Further arguments passed to or from other methods. |
A list of:
pred.cdf |
predicted cdf |
pred.pdf |
predicted pdf |
dSup |
support of the predicted cdf/pdf |
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
print the summary of WRI object
## S3 method for class 'summary.WRI' print(x, ...)
## S3 method for class 'summary.WRI' print(x, ...)
x |
a 'summary.WRI' object |
... |
further arguments passed to or from other methods. |
convert density function to quantile and quantile density function
quan2den_qd(quantileCurves, t_vec)
quan2den_qd(quantileCurves, t_vec)
quantileCurves |
n-by-m matrix of quantile curves |
t_vec |
length m vector contains the common support grid of the quantile curves |
This function simulates quantile curves used as a toy example
simulate_quantile_curves(x1, alpha, beta, t_vec)
simulate_quantile_curves(x1, alpha, beta, t_vec)
x1 |
n-by-1 predictor vector |
alpha |
parameter in location transformation |
beta |
parameter in variance transformation |
t_vec |
a length m vector - common grid for all quantile functions |
quan_obs n-by-m matrix of quantile functions
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
alpha = 2 beta = 1 n = 100 x1 = runif(n) t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001))) quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec)
alpha = 2 beta = 1 n = 100 x1 = runif(n) t_vec = unique(c(seq(0, 0.05, 0.001), seq(0.05, 0.95, 0.05), seq(0.95, 1, 0.001))) quan_obs = simulate_quantile_curves(x1, alpha, beta, t_vec)
Stroke data: clinical, radiological scalar variables and density curves of the hematoma of 393 stroke patients
a list of the following three fields:
393-by-101 head CT hematoma densities as distributional response
length 101 common support vector
393-by-9 matrix containing 9 scalar predictors
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
Summary Function of Wasserstein Regression Model
## S3 method for class 'WRI' summary(object, ...)
## S3 method for class 'WRI' summary(object, ...)
object |
an object returned by the |
... |
further arguments passed to or from other methods. |
a list containing the following fields:
call |
function call of the Wasserstein regression |
r.square |
Wasserstein |
global_wasserstein_F_stat |
Wasserstein global F test statistic from the Satterthwaite method |
global_F_pvalue |
p value of global F test |
global_wasserstein_F_df |
degrees of freedom of satterthwaite approximated sampling distribution used in global F test |
partial_F_table |
Partial F test for individual effects |
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res <- wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) summary(res)
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res <- wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) summary(res)
this function produces an object of the WARp class which includes WAR(p) model parameter estimates and relevant quantities (see output list)
WARp(quantile, quantile.grid, p)
WARp(quantile, quantile.grid, p)
quantile |
A matrix containing all the sample quantile functions. Columns represent time indices and rows represent evaluation grid. |
quantile.grid |
A numeric vector, the grid over which quantile functions are evaluated. |
p |
A positive integer, the order of the fitted WAR(p) model. |
This function takes in a density time series in the form of the corresponding quantile functions as the main input. If the quantile series is not readily available, a general practice is to estimate density functions from samples, then use dens2quantile
from the fdadensity
package to convert density time series to quantile series.
A WARp
object of:
coef |
estimated AR parameters of the fitted WAR(p) model |
coef.cov |
covariance matrix of |
acvf |
Wasserstein autocovariance function values |
Wass.mean |
Wasserstein mean quantile function |
quantile |
a matrix containing all the sample quantile functions (columns represent time indices and rows represent evaluation grid) |
quantile.grid |
quantile function grid that is utilized in calculation |
order |
a positive integer, the order of the fitted WAR(p) model |
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
# Simulate a density time series represented in quantile functions # warSimData$sample.ts: A sample TS of quantile functions of length 100, taken from # the simulation experiments in Section 4 of Zhang et al. 2022. # warSimData$quantile.grid: The grid over which quantile functions in sample.ts are evaluated. warSimData <- warSim() p <- 3 dSup <- seq(-2, 2, 0.02) expSup <- seq(-2, 2, 0.1) # Estimation: fit a WAR(3) model WARp_obj <- WARp(warSimData$sample.ts, warSimData$quantile.grid, p) # Forecast: one-step-ahead forecast forecast_1 <- predict(WARp_obj) # dSup and expSup are chosen automatically forecast_2 <- predict(WARp_obj, dSup, expSup) # dSup and expSup are chosen by user # Plots par(mfrow=c(1,2)) plot(forecast_1$dSup, forecast_1$pred.cdf, type="l", xlab="dSup", ylab="cdf") plot(forecast_1$dSup, forecast_1$pred.pdf, type="l", xlab="dSup", ylab="pdf") plot(forecast_2$dSup, forecast_2$pred.cdf, type="l", xlab="dSup", ylab="cdf") plot(forecast_2$dSup, forecast_2$pred.pdf, type="l", xlab="dSup", ylab="pdf")
# Simulate a density time series represented in quantile functions # warSimData$sample.ts: A sample TS of quantile functions of length 100, taken from # the simulation experiments in Section 4 of Zhang et al. 2022. # warSimData$quantile.grid: The grid over which quantile functions in sample.ts are evaluated. warSimData <- warSim() p <- 3 dSup <- seq(-2, 2, 0.02) expSup <- seq(-2, 2, 0.1) # Estimation: fit a WAR(3) model WARp_obj <- WARp(warSimData$sample.ts, warSimData$quantile.grid, p) # Forecast: one-step-ahead forecast forecast_1 <- predict(WARp_obj) # dSup and expSup are chosen automatically forecast_2 <- predict(WARp_obj, dSup, expSup) # dSup and expSup are chosen by user # Plots par(mfrow=c(1,2)) plot(forecast_1$dSup, forecast_1$pred.cdf, type="l", xlab="dSup", ylab="cdf") plot(forecast_1$dSup, forecast_1$pred.pdf, type="l", xlab="dSup", ylab="pdf") plot(forecast_2$dSup, forecast_2$pred.cdf, type="l", xlab="dSup", ylab="cdf") plot(forecast_2$dSup, forecast_2$pred.pdf, type="l", xlab="dSup", ylab="pdf")
Generate WAR(p) simulation data sets: samples simulated from a WAR(3) model similar to the specification in Section 4 of the referenced paper.
warSim()
warSim()
A list of:
sample.ts |
one simulation run chosen from |
sample.ts.full |
1000 simulation runs, each of which consists of a sample time series (of length 100) of quantile functions generated by a WAR(3) model as specified by the reference paper |
quantile.grid |
the grid over which the quantile functions in sample.ts.full are evaluated |
Wasserstein Autoregressive Models for Density Time Series, Chao Zhang, Piotr Kokoszka, Alexander Petersen, 2022
Compute Wasserstein Coefficient of Determination
wass_R2(wass_regress_res)
wass_R2(wass_regress_res)
wass_regress_res |
an object returned by the |
Wasserstein , the Wasserstein coefficient of determination
Frechet regression for random objects with Euclidean predictors, Alexander Petersen and Hans-Georg Müller, 2019
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) wass_r2 = wass_R2(res)
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ymat = densityCurves, Ytype = 'density', Sup = dSup) wass_r2 = wass_R2(res)
Perform Frechet Regression with the Wasserstein Distance
wass_regress(rightside_formula, Xfit_df, Ytype, Ymat, Sup = NULL)
wass_regress(rightside_formula, Xfit_df, Ytype, Ymat, Sup = NULL)
rightside_formula |
a right-side formula |
Xfit_df |
n-by-p matrix (or dataframe) of predictor values for fitting (do not include a column for the intercept) |
Ytype |
'quantile' or 'density' |
Ymat |
one of the following matrices:
|
Sup |
one of the following vectors:
|
a list containing the following objects:
call |
function call |
rformula |
|
predictor_names |
names of predictors as the colnames given in the xfit matrix or dataframe. |
Qfit |
n-by-m matrix of fitted quantile functions. |
xfit |
design matrix in quantile fitting. |
Xfit_df |
n-by-p matrix (or dataframe) of predictor values for fitting |
Yobs |
a list containing the following matrices:
|
t_vec |
a length m vector - common grid for all quantile functions in Qobs. |
Wasserstein F-tests and confidence bands for the Frechet regression of density response curves, Alexander Petersen, Xi Liu and Afshin A. Divani, 2019
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res1 = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) res2 = wass_regress(rightside_formula = ~ log_b_vol * weight, Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup)
data(strokeCTdensity) predictor = strokeCTdensity$predictors dSup = strokeCTdensity$densitySupport densityCurves = strokeCTdensity$densityCurve res1 = wass_regress(rightside_formula = ~., Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup) res2 = wass_regress(rightside_formula = ~ log_b_vol * weight, Xfit_df = predictor, Ytype = 'density', Ymat = densityCurves, Sup = dSup)