| Title: | Two-Sample Empirical Likelihood |
|---|---|
| Description: | Empirical likelihood (EL) inference for two-sample problems. The following statistics are included: the difference of two-sample means, smooth Huber estimators, quantile (qdiff) and cumulative distribution functions (fdiff), probability-probability (P-P) and quantile-quantile (Q-Q) plots as well as receiver operating characteristic (ROC) curves. Also includes two-sample block-wise empirical likelihood (BEL) and a frequency-domain empirical likelihood test for autocorrelation differences (FDEL). Methods for EL, P-P, Q-Q, ROC, qdiff and fdiff are based on Valeinis and Cers (2011) <http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf>. |
| Authors: | Janis Valeinis [aut] (ORCID: <https://orcid.org/0000-0003-0989-0444>), Edmunds Cers [aut], Janis Gredzens [cre] (ORCID: <https://orcid.org/0009-0009-4890-4897>), Reinis Alksnis [ctb] (ORCID: <https://orcid.org/0000-0002-5512-5463>) |
| Maintainer: | Janis Gredzens <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 1.4 |
| Built: | 2026-05-12 21:21:58 UTC |
| Source: | https://github.com/cran/EL |
Performs the blockwise empirical likelihood (BEL) test for the difference of two sample means.

    BEL.means(X, Y, M_1, M_2, Delta = 0)


| Argument | Description |
|---|---|
| X, Y | Vectors of data values. |
| M_1, M_2 | Positive integers specifying the block lengths for X and Y, respectively. |
| Delta | Hypothesized difference of the two population means (default 0). |

A list of class "htest" containing the following components:

| Component | Description |
|---|---|
| method | A character string naming the test. |
| data.name | A character string with the names of the input data. |
| Delta0 | The hypothesized value of the mean difference under the null hypothesis. |
| statistic | The value of the test statistic. |
| p.value | The p-value for the test. |

R. Alksnis, J. Valeinis

    # Basic example
    Delta0 <- 1.5
    X <- arima.sim(n = 400, model = list(ar = .3))
    Y <- arima.sim(n = 400, model = list(ar = .5)) + Delta0
    BEL.means(X, Y, 10, 20, Delta = Delta0)

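To make the role of the block lengths M_1 and M_2 concrete, the sketch below forms block means in base R. It assumes fully overlapping blocks of length M, a common convention in blockwise empirical likelihood. This page does not say how BEL.means builds its blocks, and `block_means` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of EL): means of all overlapping blocks
# of length M, i.e. (x[i] + ... + x[i + M - 1]) / M for i = 1, ..., n - M + 1.
block_means <- function(x, M) {
  x <- as.numeric(x)
  n <- length(x)
  stopifnot(M >= 1, M <= n)
  cs <- c(0, cumsum(x))                      # cs[j + 1] = x[1] + ... + x[j]
  (cs[(M + 1):(n + 1)] - cs[1:(n - M + 1)]) / M
}

set.seed(1)
X <- arima.sim(n = 400, model = list(ar = 0.3))
bX <- block_means(X, M = 10)
length(bX)   # n - M + 1 block means
```

Longer blocks retain more of the serial dependence within each block but leave fewer block means, so the block length is a trade-off to check for each series.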
Empirical likelihood inference for the difference of smoothed Huber estimators. This includes a test for the null hypothesis of a constant difference of smoothed Huber estimators, a confidence interval, and the EL estimator.

    EL.Huber(
      X, Y,
      mu = 0,
      conf.level = 0.95,
      scaleX = 1,
      scaleY = 1,
      VX = 2.046,
      VY = 2.046,
      k = 1.35
    )


| Argument | Description |
|---|---|
| X | A numeric vector of data values. |
| Y | A numeric vector of data values. |
| mu | A number specifying the null hypothesis value for the difference. Default is 0. |
| conf.level | Confidence level for the reported confidence interval (default 0.95). |
| scaleX | The scale estimate of sample X (default 1). |
| scaleY | The scale estimate of sample Y (default 1). |
| VX | The asymptotic variance of the initial (nonsmooth) Huber estimator for sample X (default 2.046). |
| VY | The asymptotic variance of the initial (nonsmooth) Huber estimator for sample Y (default 2.046). |
| k | Tuning parameter for the Huber estimator. Default is 1.35. |

A common choice for a robust scale estimate (parameters scaleX and
scaleY) is the median absolute deviation (MAD).
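As a minimal sketch of that choice, the scale estimates can be computed with `stats::mad` and passed on. Two assumptions apply: the default rescaling constant of `mad` (1.4826) is taken to be acceptable for scaleX and scaleY, which this page does not confirm, and the EL package is assumed to be installed, so the call is guarded.

```r
set.seed(1)
X <- rnorm(100)
Y <- rnorm(100, sd = 2)

# mad() multiplies the raw median absolute deviation by 1.4826 by default,
# which makes it consistent for the standard deviation under normality.
sX <- mad(X)
sY <- mad(Y)

# Guarded call: runs only if the EL package is available.
if (requireNamespace("EL", quietly = TRUE)) {
  print(EL::EL.Huber(X, Y, scaleX = sX, scaleY = sY))
}
```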
A list of class "htest" with components:

| Component | Description |
|---|---|
| estimate | The empirical likelihood estimate for the difference of two smoothed Huber estimators. |
| conf.int | A confidence interval for the difference of two smoothed Huber estimators. |
| p.value | The p-value for the test. |
| statistic | The value of the test statistic. |
| method | The character string "Empirical likelihood smoothed Huber estimator difference test". |
| null.value | The hypothesised difference mu under the null hypothesis. |
| data.name | A character string giving the names of the data. |

E. Cers, J. Valeinis
Valeinis, J. and Cers, E. Extending the two-sample empirical likelihood. Preprint: http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf
Hampel, F., Hennig, C. and Ronchetti, E. A. (2011). A smoothing principle for the Huber and other location M-estimators. Computational Statistics & Data Analysis, 55(1), 324–337.

    X <- rnorm(100)
    Y <- rnorm(100)
    t.test(X, Y)
    EL.means(X, Y)
    EL.Huber(X, Y)

Empirical likelihood inference for the difference of two sample means. This includes a test for the null hypothesis of a constant mean difference, a confidence interval, and the EL estimator.

    EL.means(X, Y, mu = 0, conf.level = 0.95)


| Argument | Description |
|---|---|
| X | A numeric vector of data values. |
| Y | A numeric vector of data values. |
| mu | A number specifying the null hypothesis value for the mean difference. Default is 0. |
| conf.level | Confidence level for the reported confidence interval (default 0.95). |

A list of class "htest" with components:

| Component | Description |
|---|---|
| estimate | The empirical likelihood estimate of the mean difference. |
| conf.int | A confidence interval for the mean difference. |
| p.value | The p-value for the test. |
| statistic | The value of the test statistic. |
| method | The character string "Empirical likelihood mean difference test". |
| null.value | The hypothesised mean difference mu under the null hypothesis. |
| data.name | A character string giving the names of the data. |

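Because the result is an ordinary "htest" list, its components can be read with `$`. The sketch below shows this; it assumes the EL package is installed and otherwise falls back to `t.test` purely as a stand-in object of the same class, so the numbers from the fallback are not empirical likelihood results.

```r
set.seed(1)
X <- rnorm(100)
Y <- rnorm(100)

fit <- if (requireNamespace("EL", quietly = TRUE)) {
  EL::EL.means(X, Y, mu = 0, conf.level = 0.95)
} else {
  # Stand-in only: same class, different method.
  t.test(X, Y, mu = 0, conf.level = 0.95)
}

fit$estimate                  # point estimate(s)
fit$conf.int                  # confidence interval
reject <- fit$p.value < 0.05  # decision at the 5% level
reject
```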
E. Cers, J. Valeinis
Valeinis, J., Cers, E. and Cielens, J. (2010). Two-sample problems in statistical data modelling. Mathematical Modelling and Analysis, 15(1), 137–151.
Valeinis, J. and Cers, E. Extending the two-sample empirical likelihood. Preprint: http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf

    X <- rnorm(100)
    Y <- rnorm(100)
    t.test(X, Y)
    EL.means(X, Y)
    EL.Huber(X, Y)

Draws P-P and Q-Q plots, ROC curves, quantile differences (qdiff) and
CDF differences (fdiff), together with their confidence bands (pointwise
or simultaneous), using the empirical likelihood method.
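For orientation, the five curves can be written down in their plain (unsmoothed) empirical form with base R. The definitions below are copied from the reference points that the package example for this function overlays on each plot. `empirical_curve` is a hypothetical helper, and the smoothed EL estimates drawn by EL.plot will generally differ from these values.

```r
# Hypothetical helper (not part of EL): unsmoothed empirical versions of the
# curves, using the same expressions as the package example.
empirical_curve <- function(method, X, Y, t) {
  FX <- ecdf(X)
  FY <- ecdf(Y)
  out <- switch(method,
    pp    = FX(quantile(Y, t)),            # P-P plot
    roc   = 1 - FX(quantile(Y, 1 - t)),    # ROC curve
    qq    = quantile(X, FY(t)),            # Q-Q plot, t on the scale of Y
    qdiff = quantile(Y, t) - quantile(X, t),
    fdiff = FY(t) - FX(t),
    stop("unknown method: ", method)
  )
  unname(out)
}

set.seed(42)
X1 <- rnorm(100, 0.5, 1.5)
X2 <- rnorm(100, 0, 1)
p_grid <- seq(0.1, 0.9, by = 0.2)
pp_hat <- empirical_curve("pp", X1, X2, p_grid)
qd_hat <- empirical_curve("qdiff", X1, X2, p_grid)
```

Note that "pp", "roc" and "qdiff" take probabilities as `t`, while "qq" and "fdiff" take points on the data scale.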

    EL.plot(
      method, X, Y,
      bw = bw.nrd0,
      conf.level = NULL,
      simultaneous = FALSE,
      bootstrap.samples = 300,
      more.warnings = FALSE,
      ...
    )

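
| Argument | Description |
|---|---|
| method | One of "pp", "qq", "roc", "qdiff" or "fdiff". |
| X | A numeric vector of data values. |
| Y | A numeric vector of data values. |
| bw | A function taking a numeric vector and returning a bandwidth, or a numeric vector of length two giving the bandwidths for X and Y, respectively (default bw.nrd0). |
| conf.level | Confidence level for the intervals, a number in (0, 1). If NULL (the default), no confidence bands are drawn. |
| simultaneous | Logical. If TRUE, simultaneous confidence bands are constructed; if FALSE (the default), the bands are pointwise. |
| bootstrap.samples | Integer. Number of bootstrap samples used when simultaneous = TRUE (default 300). |
| more.warnings | Logical. If TRUE, additional warnings are reported (default FALSE). |
| ... | Further arguments passed to the plot function. |
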
The plotting interval for P-P plots, ROC curves and differences of quantile
functions is [0, 1]. The Q-Q plot is drawn from the minimum to the
maximum of Y. For the plot of distribution function differences, the
interval from max(min(X), min(Y)) to min(max(X), max(Y)), that is, the
overlap of the two sample ranges, is used.
Confidence bands are drawn only if conf.level is not NULL.
When constructing simultaneous confidence bands, the plot is drawn on an
interval narrowed by 5% on both sides, since the procedure is usually
sensitive at the endpoints. The confidence level is bootstrapped using 50
evenly spaced points in this interval. If the default interval produces
excessively wide bands, use EL.smooth where intervals are
specified manually. Note that calculation of simultaneous confidence bands
can be slow.
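The narrowed evaluation grid described above can be sketched as follows. This is an illustration under an assumption: "narrowed by 5% on both sides" is read as trimming 5% of the interval width from each end, which this page does not spell out. `narrowed_grid` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: 50 evenly spaced points on an interval trimmed by
# `shrink` times its width at each end (assumed reading of "narrowed by 5%").
narrowed_grid <- function(lower, upper, shrink = 0.05, n = 50) {
  stopifnot(upper > lower)
  width <- upper - lower
  seq(lower + shrink * width, upper - shrink * width, length.out = n)
}

# Probability scale (P-P plot, ROC curve, quantile differences).
grid_pp <- narrowed_grid(0, 1)

# Data scale for CDF differences: overlap of the two sample ranges,
# as used for the reference points in the package example.
set.seed(42)
X1 <- rnorm(100, 0.5, 1.5)
X2 <- rnorm(100, 0, 1)
overlap <- c(max(min(X1), min(X2)), min(max(X1), max(X2)))
grid_fdiff <- narrowed_grid(overlap[1], overlap[2])
```

A grid built this way can also be passed as `t` to EL.smooth when the default interval gives bands that are too wide.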
A ggplot object. The plot can be further customized using
ggplot2 functions, as shown in the examples.
E. Cers, J. Valeinis
Valeinis, J. and Cers, E. Extending the two-sample empirical likelihood. Preprint: http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf
Hall, P. and Owen, A. (1993). Empirical likelihood bands in density estimation. Journal of Computational and Graphical Statistics, 2(3), 273–289.

    ## The examples showcase all available graphs
    set.seed(42)
    X1 <- rnorm(100, 0.5, 1.5)
    X2 <- rnorm(100, 0, 1)
    xlim <- c(min(X1, X2) - 0.5, max(X1, X2) + 0.5)
    D1 <- density(X1)
    D2 <- density(X2)
    df <- data.frame(x1 = D1$x, y1 = D1$y, x2 = D2$x, y2 = D2$y)
    p1 <- ggplot2::ggplot(data = df) +
      ggplot2::geom_line(ggplot2::aes(x = x2, y = y2,
                                      color = paste0('X2 (bw=', round(D2$bw, 2), ')'))) +
      ggplot2::geom_line(ggplot2::aes(x = x1, y = y1,
                                      color = paste0('X1 (bw=', round(D1$bw, 2), ')'))) +
      ggplot2::guides(color = ggplot2::guide_legend(title = NULL)) +
      ggplot2::theme_minimal() +
      ggplot2::theme(legend.position = "top") +
      ggplot2::labs(x = "X", y = "Density")
    p1

    # CDF differences
    p2 <- EL.plot("fdiff", X1, X2, main = "F difference", conf.level = 0.95)
    tt <- seq(max(c(min(X1), min(X2))), min(c(max(X1), max(X2))), length = 30)
    ee <- ecdf(X2)(tt) - ecdf(X1)(tt)
    p2 <- p2 + ggplot2::geom_point(data = data.frame(tt = tt, ee = ee),
                                   ggplot2::aes(x = tt, y = ee))
    p2

    # Quantile differences
    p3 <- EL.plot("qdiff", X1, X2, main = "Quantile difference", conf.level = 0.95)
    tt <- seq(0.01, 0.99, length = 30)
    ee <- quantile(X2, tt) - quantile(X1, tt)
    p3 <- p3 + ggplot2::geom_point(data = data.frame(tt = tt, ee = ee),
                                   ggplot2::aes(x = tt, y = ee))
    p3

    # Q-Q plot
    p4 <- EL.plot("qq", X1, X2, main = "Q-Q plot", conf.level = 0.95)
    tt <- seq(min(X2), max(X2), length = 30)
    ee <- quantile(X1, ecdf(X2)(tt))
    p4 <- p4 + ggplot2::geom_point(data = data.frame(tt = tt, ee = ee),
                                   ggplot2::aes(x = tt, y = ee))
    p4

    # P-P plot
    p5 <- EL.plot("pp", X1, X2, main = "P-P plot", conf.level = 0.95, ylim = c(0, 1))
    tt <- seq(0.01, 0.99, length = 30)
    ee <- ecdf(X1)(quantile(X2, tt))
    p5 <- p5 + ggplot2::geom_point(data = data.frame(tt = tt, ee = ee),
                                   ggplot2::aes(x = tt, y = ee))
    p5

    # ROC curve
    p6 <- EL.plot("roc", X1, X2, main = "ROC curve", conf.level = 0.95, ylim = c(0, 1))
    tt <- seq(0.01, 0.99, length = 30)
    ee <- 1 - ecdf(X1)(quantile(X2, 1 - tt))
    p6 <- p6 + ggplot2::geom_point(data = data.frame(tt = tt, ee = ee),
                                   ggplot2::aes(x = tt, y = ee))
    p6

    # To show all plots at once:
    # require(cowplot)
    # cowplot::plot_grid(p1, p2, p3, p4, p5, p6, ncol = 2)

Calculates estimates and pointwise confidence intervals (or simultaneous bands)
for P-P and Q-Q plots, ROC curves, quantile differences (qdiff) and CDF
differences (fdiff) using the smoothed empirical likelihood method.

    EL.smooth(
      method, X, Y, t,
      bw = bw.nrd0,
      conf.level = NULL,
      simultaneous = FALSE,
      bootstrap.samples = 300,
      more.warnings = FALSE
    )

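
| Argument | Description |
|---|---|
| method | One of "pp", "qq", "roc", "qdiff" or "fdiff". |
| X | A numeric vector of data values. |
| Y | A numeric vector of data values. |
| t | A numeric vector of points at which to calculate estimates and confidence intervals. |
| bw | A function taking a numeric vector and returning a bandwidth, or a numeric vector of length two giving the bandwidths for X and Y, respectively (default bw.nrd0). |
| conf.level | Confidence level for the intervals, a number in (0, 1). If NULL (the default), no confidence intervals are calculated. |
| simultaneous | Logical. If TRUE, simultaneous confidence bands are constructed; if FALSE (the default), the intervals are pointwise. |
| bootstrap.samples | Integer. Number of bootstrap samples used when simultaneous = TRUE (default 300). |
| more.warnings | Logical. If TRUE, additional warnings are reported (default FALSE). |
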
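The two accepted forms of `bw` can be illustrated with a small base R sketch. `resolve_bw` is a hypothetical helper that mirrors the documented behaviour (a function applied to each sample, or a length-two numeric vector). It is an assumption about how the argument is interpreted, not the package's internal code.

```r
# Hypothetical helper: turn either form of `bw` into one bandwidth per sample.
resolve_bw <- function(bw, X, Y) {
  if (is.function(bw)) {
    c(bw(X), bw(Y))            # bandwidth selector applied to each sample
  } else {
    stopifnot(is.numeric(bw), length(bw) == 2, all(bw > 0))
    bw                         # bandwidths given directly
  }
}

set.seed(1)
X <- rnorm(200)
Y <- rnorm(200, 1)

bw_default <- resolve_bw(bw.nrd0, X, Y)       # rule-of-thumb selector
bw_fixed   <- resolve_bw(c(0.3, 0.3), X, Y)   # fixed values
bw_custom  <- resolve_bw(function(v) 0.5 * bw.nrd0(v), X, Y)
```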
Confidence intervals are calculated only if conf.level is not NULL.
When constructing simultaneous confidence bands, check that the chosen range
of t values does not produce excessively wide bands (for example, for
a P-P plot the interval [0.05, 0.95] is typically a sensible choice).
This should be verified for each dataset. Note that simultaneous band
calculation can be slow.
A list with components:

| Component | Description |
|---|---|
| estimate | Estimated values at points t. |
| conf.int | A two-row matrix where each column gives the lower and upper confidence bounds at the corresponding point in t. |
| simultaneous.conf.int | Logical; TRUE if simultaneous bands were constructed. |
| bootstrap.crit | The bootstrap critical value of the -2 log-likelihood statistic for simultaneous bands at level conf.level. Only present when conf.level is not NULL and simultaneous = TRUE. |

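Since conf.int is a two-row matrix, it is often convenient to reshape the result into a data frame with one row per point before plotting. The helper below is a hypothetical sketch. To keep it runnable without the package, it is exercised on a hand-made list that only mimics the documented shape; the numbers in `mock` are placeholders, not EL output.

```r
# Hypothetical helper: one row per evaluation point.
bands_to_df <- function(t, res) {
  stopifnot(is.matrix(res$conf.int),
            nrow(res$conf.int) == 2,
            ncol(res$conf.int) == length(t))
  data.frame(t        = t,
             estimate = as.numeric(res$estimate),
             lower    = res$conf.int[1, ],
             upper    = res$conf.int[2, ])
}

# Placeholder object with the documented structure (illustrative values only).
t_grid <- seq(0.05, 0.95, length.out = 19)
mock <- list(
  estimate = t_grid,
  conf.int = rbind(pmax(t_grid - 0.1, 0), pmin(t_grid + 0.1, 1)),
  simultaneous.conf.int = FALSE
)

bands <- bands_to_df(t_grid, mock)
head(bands, 3)
```

With a real result, `mock` would be replaced by the value returned from EL.smooth called with a non-NULL conf.level.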
E. Cers, J. Valeinis
Valeinis, J. and Cers, E. Extending the two-sample empirical likelihood. Preprint: http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf
Hall, P. and Owen, A. (1993). Empirical likelihood bands in density estimation. Journal of Computational and Graphical Statistics, 2(3), 273–289.

    #### Simultaneous confidence bands for a P-P plot
    X1 <- rnorm(200)
    X2 <- rnorm(200, 1)
    x <- seq(0.05, 0.95, length = 19)
    y <- EL.smooth("pp", X1, X2, x, conf.level = 0.95,
                   simultaneous = TRUE, bw = c(0.3, 0.3))
    conf.int <- data.frame(x = x, ci.l = y$conf.int[1, ], ci.u = y$conf.int[2, ])

    ## Plot with both pointwise and simultaneous confidence bands
    EL.plot("pp", X1, X2, conf.level = 0.95, bw = c(0.3, 0.3)) +
      ggplot2::geom_line(data = conf.int, ggplot2::aes(x = x, y = ci.u), lty = "dotted") +
      ggplot2::geom_line(data = conf.int, ggplot2::aes(x = x, y = ci.l), lty = "dotted")

Calculates -2 times the log-likelihood ratio statistic under the hypothesis
that the function of interest (a P-P or Q-Q plot, a ROC curve, or a difference
of quantile or distribution functions) at a point t is equal to Delta
(formerly d).

    EL.statistic(method, X, Y, Delta, d, t, bw = bw.nrd0, conf.level = 0.95)


| Argument | Description |
|---|---|
| method | One of "pp", "qq", "roc", "qdiff" or "fdiff". |
| X | A numeric vector of data values. |
| Y | A numeric vector of data values. |
| Delta | A number. The hypothesised value of the function at point t. |
| d | Deprecated. Use Delta instead. |
| t | A number. The point at which to evaluate the statistic. |
| bw | A function taking a numeric vector and returning a bandwidth, or a numeric vector of length two giving the bandwidths for X and Y, respectively (default bw.nrd0). |
| conf.level | Confidence level for the reported confidence interval (default 0.95). |

An object of class "htest".
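A -2 log-likelihood ratio statistic of this kind is usually compared with a chi-squared distribution. The sketch below assumes the standard Wilks-type calibration with one degree of freedom; this page does not state the reference distribution used by the package, and `el_p_value` is a hypothetical helper.

```r
# Hypothetical helper: upper-tail chi-squared p-value for a -2 log LR statistic
# (one degree of freedom assumed).
el_p_value <- function(statistic, df = 1) {
  stopifnot(statistic >= 0)
  pchisq(statistic, df = df, lower.tail = FALSE)
}

# Critical value at the 5% level and the p-value it corresponds to.
crit_95 <- qchisq(0.95, df = 1)
p_at_crit <- el_p_value(crit_95)
```

Under this calibration, a pointwise confidence interval at a given t collects the values Delta whose statistic stays below the critical value.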
E. Cers, J. Valeinis
Valeinis, J. and Cers, E. Extending the two-sample empirical likelihood. Preprint: http://home.lu.lv/~valeinis/lv/petnieciba/EL_TwoSample_2011.pdf

    EL.statistic(method = "pp", X = rnorm(100), Y = rnorm(100), Delta = 0.5, t = 0.5)

Tests whether the autocorrelations at a given lag of two
independent stationary time series differ by a specified
amount Delta.

    FDEL.acf(
      X, Y,
      Delta = 0,
      lag = 1,
      bartlett = FALSE,
      bootstrap.samples = 500,
      center = TRUE,
      seed = NULL,
      span = 0.15,
      rho.lower = -0.99,
      rho.upper = 0.99
    )


| Argument | Description |
|---|---|
| X, Y | Numeric time-series vectors (length >= 10). |
| Delta | Hypothesised difference between the two autocorrelations (default 0). |
| lag | Positive integer lag (default 1). |
| bartlett | Logical; apply a bootstrap Bartlett correction? (default FALSE) |
| bootstrap.samples | Number of bootstrap replicates for the Bartlett correction (used only when bartlett = TRUE; default 500). |
| center | Logical; subtract sample means before computing periodograms? (default TRUE) |
| seed | Optional integer seed for reproducibility. |
| span | Loess span for periodogram smoothing used in the Bartlett correction (default 0.15). |
| rho.lower, rho.upper | Search bounds for the profile autocorrelation optimization (defaults -0.99 and 0.99). |

An object of class c("FDELacf", "htest")
with components statistic, parameter,
p.value, estimate, null.value,
alternative, method, data.name,
lag, bartlett, bartlett.factor,
statistic.uncorrected, p.value.uncorrected,
B, call.
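As a quick time-domain sanity check before running the test, the sample autocorrelations at the chosen lag can be compared directly in base R. This sketch is not the frequency-domain procedure itself. `lag_acf` is a hypothetical helper, and the sign convention of the difference (X minus Y) is an assumption that this page does not confirm.

```r
# Hypothetical helper: sample autocorrelation at a single lag, via stats::acf
# (which subtracts the sample mean by default).
lag_acf <- function(x, lag = 1) {
  as.numeric(stats::acf(x, lag.max = lag, plot = FALSE)$acf[lag + 1])
}

set.seed(1)
X <- arima.sim(n = 200, model = list(ar = 0.3))
Y <- arima.sim(n = 350, model = list(ar = 0.3))

rho_X <- lag_acf(X, lag = 1)
rho_Y <- lag_acf(Y, lag = 1)
rho_diff <- rho_X - rho_Y   # assumed direction: X minus Y
rho_diff
```

Both series here are AR(1) with the same coefficient, so the difference should be close to zero up to sampling noise.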
R. Alksnis, J. Valeinis

    set.seed(1)
    X <- arima.sim(n = 200, model = list(ar = 0.3))
    Y <- arima.sim(n = 350, model = list(ar = 0.3))
    FDEL.acf(X, Y)

    ## With Bartlett correction (slower)
    ## FDEL.acf(X, Y, bartlett = TRUE, bootstrap.samples = 199)
