| Title: | Exact Post Selection Inference with Applications to the Lasso |
|---|---|
| Description: | Implements the conditional estimation procedure of Lee, Sun, Sun and Taylor (2016) <doi:10.1214/15-AOS1371>. This procedure allows hypothesis testing on the mean of a normal random vector subject to linear constraints. Also supports computation of the MLE of the mean subject to the same constraints. |
| Authors: | Steven E. Pav [aut, cre] (ORCID: <https://orcid.org/0000-0002-4197-6195>) |
| Maintainer: | Steven E. Pav <[email protected]> |
| License: | LGPL-3 |
| Version: | 0.2.0 |
| Built: | 2026-06-10 07:12:28 UTC |
| Source: | https://github.com/cran/epsiwal |
Confidence intervals on normal mean, subject to linear constraints.
    ci_connorm(y, A, b, eta, Sigma = NULL, p = c(level/2, 1 - (level/2)), level = 0.05, Sigma_eta = Sigma %*% eta)
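| Argument | Description |
|---|---|
| y | an $n$-vector, assumed multivariate normal with mean $\mu$ and covariance $\Sigma$. |
| A | a $k \times n$ matrix of constraints. |
| b | a $k$-vector of inequality limits. |
| eta | an $n$-vector, the test contrast $\eta$. |
| Sigma | an $n \times n$ matrix, the population covariance $\Sigma$. Not needed if Sigma_eta is given. |
| p | a vector of probabilities for which we return the equivalent $\eta^{\top}\mu$. |
| level | if p is not given, it defaults to c(level/2, 1 - (level/2)). |
| Sigma_eta | an $n$-vector, $\Sigma\eta$. Defaults to Sigma %*% eta. |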
Inverts the constrained normal inference procedure described by Lee et al.
Let $y$ be multivariate normal with unknown mean $\mu$ and known covariance $\Sigma$. Conditional on $Ay \le b$ for conformable matrix $A$ and vector $b$, and given contrast vector $\eta$ and level $p$, we compute $\eta^{\top}\mu$ such that the cumulative distribution of $\eta^{\top}y$ equals $p$.
Returns the values of $\eta^{\top}\mu$ which have the corresponding CDF.
An error will be thrown if we do not observe $Ay \le b$.
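The inversion can be pictured in one dimension: once the linear constraints have been reduced to truncation limits on the scalar statistic, a confidence limit is the mean at which the truncated normal CDF, evaluated at the observed value, equals the target probability. The following base R sketch illustrates that idea only; it is not the package's implementation. The helper name invert_truncnorm_mean, the naive CDF formula, and the fixed search bracket of six standard deviations are all assumptions made for this example.

```r
# Hypothetical helper, not part of epsiwal: find the mean m at which a normal
# with known sd, truncated to [a, b], has CDF equal to p at the observed x.
invert_truncnorm_mean <- function(x, p, sd = 1, a = -Inf, b = Inf) {
  cdf_at <- function(m) {
    plo <- pnorm(a, mean = m, sd = sd)
    phi <- pnorm(b, mean = m, sd = sd)
    (pnorm(x, mean = m, sd = sd) - plo) / (phi - plo)
  }
  # the CDF at x decreases as m increases, so a sign change brackets the root;
  # the bracket x +/- 6 sd is an assumption that suits this example
  uniroot(function(m) cdf_at(m) - p,
          interval = x + c(-6, 6) * sd, tol = 1e-10)$root
}

# means at which the observed value 1.2 sits at the 0.975 and 0.025 quantiles,
# for a unit-variance normal truncated below at -1
ci <- sapply(c(0.975, 0.025), function(p) {
  invert_truncnorm_mean(x = 1.2, p = p, sd = 1, a = -1, b = Inf)
})
```

Because the CDF is monotone in the mean, the two roots are ordered, and together they bound an interval with 95% conditional coverage under these assumptions.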
Steven E. Pav [email protected]
Lee, J. D., Sun, D. L., Sun, Y. and Taylor, J. E. "Exact post-selection inference, with application to the Lasso." Ann. Statist. 44, no. 3 (2016): 907-927. doi:10.1214/15-AOS1371. https://arxiv.org/abs/1311.6238
See also: the CDF function, pconnorm; the MLE function, mle_connorm; and the special case code for conditioning on the max, ci_connorm_max.
    set.seed(1234)
    n <- 10
    y <- rnorm(n)
    A <- matrix(rnorm(n*(n-3)), ncol=n)
    # choose b so that the constraint A y <= b holds for the observed y
    b <- A %*% y + runif(nrow(A))
    Sigma <- diag(runif(n))
    mu <- rnorm(n)
    eta <- rnorm(n)
    pval <- pconnorm(y=y, A=A, b=b, eta=eta, mu=mu, Sigma=Sigma)
    # inverting the CDF at pval should recover eta' mu
    cival <- ci_connorm(y=y, A=A, b=b, eta=eta, Sigma=Sigma, p=pval)
    stopifnot(abs(cival - sum(eta*mu)) < 1e-4)
Confidence intervals on normal mean, conditioning on the max.
    ci_connorm_max(yk, yk1, sigma = 1, rho = 0, p = c(level/2, 1 - (level/2)), level = 0.05)
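| Argument | Description |
|---|---|
| yk | the observed maximum value, $y_k$. |
| yk1 | a vector of the other observed values, $y_i$ for $i \ne k$. |
| sigma | the common standard deviation, $\sigma$. |
| rho | the common correlation, $\rho$. |
| p | a vector of probabilities for which we return the equivalent $\mu_k$. |
| level | if p is not given, it defaults to c(level/2, 1 - (level/2)). |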
Computes a confidence interval on the unknown mean of one element of a normal vector, conditional on that element being the maximum.
Let $y$ be multivariate normal with unknown mean $\mu$ and known covariance $\Sigma$. We assume that $\Sigma$ is compound symmetric with common variance $\sigma^2$ and common correlation $\rho$. Conditional on $y_k \ge y_i$ for all $i$, we compute the confidence interval of $\mu_k$.
Returns the values of $\mu_k$ which have the corresponding CDF.
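The event that element $k$ is the maximum is itself a set of linear inequalities, which is why this function is a special case of ci_connorm. As a minimal base R sketch of that correspondence, the code below builds the constraint matrix, the limit vector, a compound symmetric covariance, and the contrast that picks out the maximal element. The helper names max_constraints and compound_symmetric are hypothetical and are not part of the package.

```r
# Hypothetical helper: express "element k is the maximum" as A %*% y <= b.
max_constraints <- function(n, k) {
  A <- diag(n)[-k, , drop = FALSE]  # one row per other element i, with +1 in column i
  A[, k] <- -1                      # and -1 in column k, so each row encodes y_i - y_k
  list(A = A, b = rep(0, n - 1))
}

# Hypothetical helper: compound symmetric covariance with common standard
# deviation sigma and common correlation rho.
compound_symmetric <- function(n, sigma = 1, rho = 0) {
  sigma^2 * ((1 - rho) * diag(n) + rho * matrix(1, n, n))
}

y <- c(0.3, 2.1, -0.4, 1.7)
k <- which.max(y)
con <- max_constraints(length(y), k)
Sigma <- compound_symmetric(length(y), sigma = 1, rho = 0.2)
eta <- as.numeric(seq_along(y) == k)  # contrast selecting the maximal element

# the constraint holds by construction when k indexes the maximum
holds <- all(con$A %*% y <= con$b)
```

These objects have the shapes that the general functions take as y, A, b, eta and Sigma, so the general and special-case routines can be compared on the same data.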
Steven E. Pav [email protected]
Lee, J. D., Sun, D. L., Sun, Y. and Taylor, J. E. "Exact post-selection inference, with application to the Lasso." Ann. Statist. 44, no. 3 (2016): 907-927. doi:10.1214/15-AOS1371. https://arxiv.org/abs/1311.6238
See also: the CDF function, pconnorm; the MLE function, mle_connorm_max; and the more general version, ci_connorm.
Exact Post Selection Inference with Applications to the Lasso.
This package supports the procedure outlined in Lee et al., in which one observes a normal random vector and then performs inference conditional on some linear inequalities.
Suppose $y$ is multivariate normal with mean $\mu$ and covariance $\Sigma$. Conditional on $Ay \le b$, one can perform inference on $\eta^{\top}\mu$ by transforming $y$ to a truncated normal. Similarly, one can invert this procedure and find confidence intervals on $\eta^{\top}\mu$.
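The transformation to a truncated normal is the polyhedral lemma of Lee et al. As a sketch, take $y \sim N(\mu, \Sigma)$ conditioned on $Ay \le b$, with contrast vector $\eta$. The symbols $c$, $z$, $\mathcal{V}^{-}$, $\mathcal{V}^{+}$ and $\mathcal{V}^{0}$ follow the paper's notation and are not names used by the package.

```latex
\begin{align*}
c &= \Sigma\eta\,(\eta^{\top}\Sigma\eta)^{-1}, \qquad z = (I - c\,\eta^{\top})\,y, \\
\{Ay \le b\} &= \{\mathcal{V}^{-}(z) \le \eta^{\top}y \le \mathcal{V}^{+}(z),\ \mathcal{V}^{0}(z) \ge 0\}, \\
\mathcal{V}^{-}(z) &= \max_{j:\,(Ac)_j < 0} \frac{b_j - (Az)_j}{(Ac)_j}, \qquad
\mathcal{V}^{+}(z) = \min_{j:\,(Ac)_j > 0} \frac{b_j - (Az)_j}{(Ac)_j}, \\
\mathcal{V}^{0}(z) &= \min_{j:\,(Ac)_j = 0} \bigl(b_j - (Az)_j\bigr).
\end{align*}
```

Since $z$ is independent of $\eta^{\top}y$, conditional on $z$ and on the constraint the statistic $\eta^{\top}y$ is distributed as a normal with mean $\eta^{\top}\mu$ and variance $\eta^{\top}\Sigma\eta$, truncated to $[\mathcal{V}^{-}(z), \mathcal{V}^{+}(z)]$.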
epsiwal is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
This package is maintained as a hobby.
Steven E. Pav [email protected]
Maintainer: Steven E. Pav [email protected] (ORCID)
Lee, J. D., Sun, D. L., Sun, Y. and Taylor, J. E. "Exact post-selection inference, with application to the Lasso." Ann. Statist. 44, no. 3 (2016): 907-927. doi:10.1214/15-AOS1371. https://arxiv.org/abs/1311.6238
Pav, S. E. "Conditional inference on the asset with maximum Sharpe ratio." Arxiv e-print (2019). https://arxiv.org/abs/1906.00573
Pav, S. E. "Post selection estimation of Sharpe ratios." Arxiv e-print (2026). https://arxiv.org/abs/2606.01650
News for package ‘epsiwal’
fix numerical stability issues in ptruncnorm and downstream utilities (CIs).
add MLE estimator of Reid, Taylor, Tibshirani.
add helper functions for the case of conditioning on the max of a vector.
first CRAN release.
Maximum likelihood estimate of normal mean, subject to linear constraints.
    mle_connorm(y, A, b, eta, Sigma = NULL, Sigma_eta = Sigma %*% eta, ...)
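| Argument | Description |
|---|---|
| y | an $n$-vector, assumed multivariate normal with mean $\mu$ and covariance $\Sigma$. |
| A | a $k \times n$ matrix of constraints. |
| b | a $k$-vector of inequality limits. |
| eta | an $n$-vector, the test contrast $\eta$. |
| Sigma | an $n \times n$ matrix, the population covariance $\Sigma$. Not needed if Sigma_eta is given. |
| Sigma_eta | an $n$-vector, $\Sigma\eta$. Defaults to Sigma %*% eta. |
| ... | further arguments, such as the tolerance tol used in the example, passed on to the underlying numerical routine. |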
Computes the maximum likelihood estimate of the unknown mean of a normal vector, conditional on linear constraints.
Let $y$ be multivariate normal with unknown mean $\mu$ and known covariance $\Sigma$. Conditional on $Ay \le b$ for conformable matrix $A$ and vector $b$, and given contrast vector $\eta$, we compute the maximum likelihood estimate of $\eta^{\top}\mu$.
Returns the maximum likelihood estimate of $\eta^{\top}\mu$.
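After the constraints are reduced to truncation limits, the conditional likelihood is that of a truncated normal with known standard deviation, and the estimate is the mean that maximizes it. The base R sketch below shows that one-dimensional problem with a generic optimizer. It is an illustration under stated assumptions, not the package's algorithm: the helper name truncnorm_mle_mean, the use of optimize, and the default search interval are all choices made for this example.

```r
# Hypothetical helper, not part of epsiwal: maximum likelihood estimate of the
# mean of a normal with known sd, truncated to [a, b], from one observation x.
truncnorm_mle_mean <- function(x, sd = 1, a = -Inf, b = Inf,
                               interval = x + c(-6, 6) * sd) {
  negloglik <- function(m) {
    # probability mass of the untruncated normal inside [a, b]
    mass <- pnorm(b, mean = m, sd = sd) - pnorm(a, mean = m, sd = sd)
    -(dnorm(x, mean = m, sd = sd, log = TRUE) - log(mass))
  }
  # the log-likelihood is concave in m, so a one-dimensional search suffices;
  # the default interval is an assumption and may need widening
  optimize(negloglik, interval = interval, tol = 1e-8)$minimum
}

# with no truncation the estimate is the observation itself
mle_free <- truncnorm_mle_mean(x = 1.2)
# truncation from below pulls the estimate beneath the observation
mle_trunc <- truncnorm_mle_mean(x = 1.2, a = -1)
```

The second call shows the direction of the correction: an observation that was only eligible to be seen above a threshold is evidence for a smaller mean than its face value suggests.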
Steven E. Pav [email protected]
Reid, S., Taylor, J. and Tibshirani, R. "Post-selection point and interval estimation of signal sizes in Gaussian samples." Can. J. Statistics. 45, no. 2 (2017): 128-148. doi:10.1002/cjs.11320. https://arxiv.org/abs/1405.3340
See also: the confidence interval function, ci_connorm; the CDF function, pconnorm; and the special case code for conditioning on the max, mle_connorm_max.
    set.seed(1234)
    n <- 10
    y <- rnorm(n)
    A <- matrix(rnorm(n*(n-3)), ncol=n)
    # choose b so that the constraint A y <= b holds for the observed y
    b <- A %*% y + runif(nrow(A))
    Sigma <- diag(runif(n))
    mu <- rnorm(n)
    eta <- rnorm(n)
    mval <- mle_connorm(y=y, A=A, b=b, eta=eta, Sigma=Sigma)
    # try again, but control tolerance:
    mval <- mle_connorm(y=y, A=A, b=b, eta=eta, Sigma=Sigma, tol=1e-8)
Maximum likelihood estimate of normal mean, conditioning on the max.
    mle_connorm_max(yk, yk1, sigma = 1, rho = 0, ...)
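| Argument | Description |
|---|---|
| yk | the observed maximum value, $y_k$. |
| yk1 | a vector of the other observed values, $y_i$ for $i \ne k$. |
| sigma | the common standard deviation, $\sigma$. |
| rho | the common correlation, $\rho$. |
| ... | further arguments passed on to the underlying routine. |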
Computes the maximum likelihood estimate of the unknown mean of one element of a normal vector, conditional on that element being the maximum.
Let $y$ be multivariate normal with unknown mean $\mu$ and known covariance $\Sigma$. We assume that $\Sigma$ is compound symmetric with common variance $\sigma^2$ and common correlation $\rho$. Conditional on $y_k \ge y_i$ for all $i$, we compute the maximum likelihood estimate of $\mu_k$.
Returns the maximum likelihood estimate of $\mu_k$.
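A short simulation shows why conditioning on the selection matters here. In the base R sketch below every element of the vector has true mean zero, yet the observed maximum is systematically positive, so reporting the maximum as an estimate of its own mean is biased upward. The seed, dimension and number of replications are arbitrary choices for this example.

```r
set.seed(2024)
n <- 10       # elements per vector, all with true mean zero
nsim <- 5000  # number of simulated vectors

# record the largest element of each simulated vector
maxima <- replicate(nsim, max(rnorm(n)))

# the average of the maxima sits well above the true mean of zero
naive_bias <- mean(maxima)
```

The conditional estimate accounts for the fact that the element was chosen because it was the largest.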
Steven E. Pav [email protected]
Reid, S., Taylor, J. and Tibshirani, R. "Post-selection point and interval estimation of signal sizes in Gaussian samples." Can. J. Statistics. 45, no. 2 (2017): 128-148. doi:10.1002/cjs.11320. https://arxiv.org/abs/1405.3340
See also: the confidence interval function, ci_connorm_max; the CDF function, pconnorm; and the more general version, mle_connorm.
CDF of the conditional normal variate.
    pconnorm(y, A, b, eta, mu = NULL, Sigma = NULL, Sigma_eta = Sigma %*% eta, eta_mu = as.numeric(t(eta) %*% mu), lower.tail = TRUE, log.p = FALSE)
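| Argument | Description |
|---|---|
| y | an $n$-vector, assumed multivariate normal with mean $\mu$ and covariance $\Sigma$. |
| A | a $k \times n$ matrix of constraints. |
| b | a $k$-vector of inequality limits. |
| eta | an $n$-vector, the test contrast $\eta$. |
| mu | an $n$-vector, the population mean $\mu$. Not needed if eta_mu is given. |
| Sigma | an $n \times n$ matrix, the population covariance $\Sigma$. Not needed if Sigma_eta is given. |
| Sigma_eta | an $n$-vector, $\Sigma\eta$. Defaults to Sigma %*% eta. |
| eta_mu | the scalar $\eta^{\top}\mu$. Defaults to as.numeric(t(eta) %*% mu). |
| lower.tail | logical; if TRUE (default), probabilities are P[X <= x]; otherwise, P[X > x]. |
| log.p | logical; if TRUE, probabilities p are returned as log(p). |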
Computes the CDF of the truncated normal conditional on linear constraints, as described in section 5 of Lee et al.
Let $y$ be multivariate normal with mean $\mu$ and covariance $\Sigma$. Conditional on $Ay \le b$ for conformable matrix $A$ and vector $b$, we compute the CDF of a truncated normal maximally aligned with $\eta$. Inference depends on the population parameters only via $\eta^{\top}\mu$ and $\Sigma\eta$, and only these need to be given.
The test statistic is aligned with $\eta$, meaning that an output p-value near one casts doubt on the null hypothesis that $\eta^{\top}\mu$ is less than the posited value.
Returns the CDF.
An error will be thrown if we do not observe $Ay \le b$.
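The computation described above can be sketched in base R: project out the part of $y$ that is independent of $\eta^{\top}y$, turn each row of the constraint into a bound on $\eta^{\top}y$, and evaluate a truncated normal CDF between the tightest bounds. This is a naive illustration and not the package's code. The helper name polyhedral_cdf is hypothetical, rows of the constraint with a zero coefficient are ignored for brevity, and the plain difference of pnorm values can lose precision far in the tails.

```r
# Hypothetical helper, not part of epsiwal: truncation limits and conditional
# CDF of eta' y given A y <= b, following the polyhedral lemma of Lee et al.
polyhedral_cdf <- function(y, A, b, eta, Sigma, eta_mu) {
  stopifnot(all(A %*% y <= b))            # the constraint must hold for the observed y
  Sigma_eta <- Sigma %*% eta
  s2 <- as.numeric(t(eta) %*% Sigma_eta)  # variance of eta' y
  cvec <- Sigma_eta / s2
  etay <- as.numeric(t(eta) %*% y)
  z <- y - cvec * etay                    # part of y independent of eta' y
  Ac <- as.numeric(A %*% cvec)
  ratio <- as.numeric(b - A %*% z) / Ac
  vlo <- max(c(-Inf, ratio[Ac < 0]))      # tightest lower bound on eta' y
  vhi <- min(c(Inf, ratio[Ac > 0]))       # tightest upper bound on eta' y
  sdev <- sqrt(s2)
  plo <- pnorm(vlo, mean = eta_mu, sd = sdev)
  phi <- pnorm(vhi, mean = eta_mu, sd = sdev)
  cdf <- (pnorm(etay, mean = eta_mu, sd = sdev) - plo) / (phi - plo)
  list(cdf = cdf, lower = vlo, upper = vhi, etay = etay)
}

set.seed(1234)
n <- 10
y <- rnorm(n)
A <- matrix(rnorm(n * (n - 3)), ncol = n)
b <- A %*% y + runif(nrow(A))             # constraint holds by construction
Sigma <- diag(runif(n))
mu <- rnorm(n)
eta <- rnorm(n)
res <- polyhedral_cdf(y, A, b, eta, Sigma, eta_mu = sum(eta * mu))
```

The observed statistic always lies between the two limits, since the limits are just the constraint rewritten in terms of $\eta^{\top}y$.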
Steven E. Pav [email protected]
Lee, J. D., Sun, D. L., Sun, Y. and Taylor, J. E. "Exact post-selection inference, with application to the Lasso." Ann. Statist. 44, no. 3 (2016): 907-927. doi:10.1214/15-AOS1371. https://arxiv.org/abs/1311.6238
See also: the confidence interval function, ci_connorm; and the MLE function, mle_connorm.
CDF of the conditional normal variate, conditioning on the max.
    pconnorm_max(yk, yk1, mu_k, sigma = 1, rho = 0, lower.tail = TRUE, log.p = FALSE)
| Argument | Description |
|---|---|
| yk | the observed maximum value. |
| yk1 | a vector of the other observed values. |
| mu_k | the scalar mean of the maximal element. |
| sigma | the common standard deviation. |
| rho | the common correlation. |
| lower.tail | logical; if TRUE (default), probabilities are P[X <= x]; otherwise, P[X > x]. |
| log.p | logical; if TRUE, probabilities p are returned as log(p). |
Computes the CDF of the conditional maximum of a normal vector using the truncated normal from the polyhedral lemma.
Let $y$ be multivariate normal where the maximal observed element $y_k$ is known to have mean $\mu_k$, and the vector has known covariance $\Sigma$. We assume that $\Sigma$ is compound symmetric with common variance $\sigma^2$ and common correlation $\rho$. Conditional on $y_k \ge y_i$ for all $i$, we compute the CDF of $y_k$.
Returns the CDF.
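Under compound symmetry the polyhedral lemma simplifies considerably. Taking the contrast to be the indicator of element $k$, every constraint $y_i \le y_k$ yields a lower bound on $y_k$ and none yields an upper bound, so $y_k$ is a normal with mean $\mu_k$ and standard deviation $\sigma$ truncated below at $\max_i (y_i - \rho y_k) / (1 - \rho)$. The base R sketch below works this out directly; it is an illustration assuming $\rho < 1$, and the helper name max_cdf is hypothetical rather than a package function.

```r
# Hypothetical helper, not part of epsiwal: CDF of y_k given that it is the
# maximum, under compound symmetry with common sd sigma and correlation rho.
max_cdf <- function(yk, yk1, mu_k, sigma = 1, rho = 0) {
  stopifnot(all(yk >= yk1), rho < 1)
  # lower truncation limit from the polyhedral lemma; there is no upper limit
  vlo <- max((yk1 - rho * yk) / (1 - rho))
  # P(Y <= yk | Y >= vlo), written with upper tails to limit cancellation
  tail_lo <- pnorm(vlo, mean = mu_k, sd = sigma, lower.tail = FALSE)
  tail_yk <- pnorm(yk, mean = mu_k, sd = sigma, lower.tail = FALSE)
  1 - tail_yk / tail_lo
}

# with rho = 0 the lower limit is simply the runner-up, here 1.7
pv <- max_cdf(yk = 2.1, yk1 = c(0.3, -0.4, 1.7), mu_k = 1, sigma = 1, rho = 0)
```

When the correlation is zero the limit reduces to the second largest observation, which matches the intuition that the maximum only needs to beat the runner-up.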
Steven E. Pav [email protected]
Lee, J. D., Sun, D. L., Sun, Y. and Taylor, J. E. "Exact post-selection inference, with application to the Lasso." Ann. Statist. 44, no. 3 (2016): 907-927. doi:10.1214/15-AOS1371. https://arxiv.org/abs/1311.6238
See also: the general CDF function, pconnorm; the MLE function, mle_connorm_max; and the confidence interval function, ci_connorm_max.
Cumulative distribution of the truncated normal function.
    ptruncnorm(q, mean = 0, sd = 1, a = -Inf, b = Inf, lower.tail = TRUE, log.p = FALSE)
| Argument | Description |
|---|---|
| q | vector of quantiles. |
| mean | vector of means. |
| sd | vector of standard deviations. |
| a | vector of the left truncation value(s). |
| b | vector of the right truncation value(s). |
| lower.tail | logical; if TRUE (default), probabilities are P[X <= x]; otherwise, P[X > x]. |
| log.p | logical; if TRUE, probabilities p are returned as log(p). |
Returns the distribution function of the truncated normal.
Invalid arguments will result in a return value of NaN, with a warning.
Inputs are recycled where possible.
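For reference, the truncated normal CDF can be written directly in terms of the ordinary normal CDF. The following base R sketch uses that textbook formula; it is not the package's implementation, the name naive_ptruncnorm is hypothetical, and this naive form can lose precision when the truncation interval lies far out in a tail.

```r
# Hypothetical helper, not part of epsiwal: naive CDF of a normal truncated to [a, b].
naive_ptruncnorm <- function(q, mean = 0, sd = 1, a = -Inf, b = Inf) {
  # clamp quantiles to the support so that values outside it map to 0 or 1
  q <- pmin(pmax(q, a), b)
  lo <- pnorm(a, mean = mean, sd = sd)
  hi <- pnorm(b, mean = mean, sd = sd)
  (pnorm(q, mean = mean, sd = sd) - lo) / (hi - lo)
}

# symmetric truncation around the mean puts half the mass below the mean
half <- naive_ptruncnorm(0, a = -1, b = 1)
```

Quantiles below the left limit return zero and those above the right limit return one, because no mass lies outside the truncation interval.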
Steven E. Pav [email protected]
Hattaway, James T. "Parameter estimation and hypothesis testing for the truncated normal distribution with applications to introductory statistics grades." BYU Masters Thesis (2010). https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=3052&context=etd
    y <- ptruncnorm(seq(-5, 5, length.out=101), a=-1, b=2)