Probability Distribution Functions in Package qfratio

Since version 1.1.0, qfratio (CRAN; GitHub) can evaluate the probability density and distribution functions of a (simple) ratio of quadratic forms in normal variables. This document describes the theoretical background and some implementation details of this functionality. See the main package vignette (vignette("qfratio")) for the evaluation of moments of ratios of quadratic forms.

Symbols used

  • n: number of variables
  • C: (top-order) zonal and invariant polynomials of matrix arguments (Chikuse, 1980, 1987; Davis, 1980; Muirhead, 1982)
  • Q, q: ratio of quadratic forms as a random variable Q and its realized value or quantile q
  • FQ(q), fQ(q): (cumulative) distribution FQ and probability density fQ functions of Q at q
  • x: n-variate normal random vector
  • $N_n \left( \boldsymbol{\mu} , \boldsymbol{\Sigma} \right)$: n-variate normal distribution with mean vector μ and covariance matrix Σ
  • $\chi^2_h \left( \delta^2 \right)$: noncentral chi-square distribution with h degrees of freedom and noncentrality parameter δ²
  • A, B: n × n argument matrices
  • In: n-dimensional identity matrix
  • 0n: n-variate vector of 0’s
  • 0n × n: n × n matrix of 0’s
  • $\mathrm{E} \left( \cdot \right)$: expectation/mean
  • $\Gamma \left( \cdot \right)$: gamma function
  • $\mathrm{B} \left( \cdot , \cdot \right)$: beta function
  • $\left( x \right)_k$: Pochhammer symbol, that is, $\left( x \right)_k = x (x + 1) \dots (x + k - 1)$ (with the convention $\left( x \right)_0 = 1$)
  • ${}_2F_1 \left( a , b ; c ; x \right)$: (Gauss) hypergeometric function, ${}_2F_1 \left( a , b ; c ; x \right) = \sum_{k=0}^{\infty} \frac{ \left( a \right)_k \left( b \right)_k }{ \left( c \right)_k k! } x^k$
  • AT: matrix transposition
  • A−1: matrix inverse
  • det A: matrix determinant
  • $\mathrm{tr} \, \mathbf{A}$: matrix trace
  • $\boldsymbol{\Lambda} = \mathrm{diag} \left( \lambda_1 , \dots , \lambda_n \right)$: diagonal matrix of eigenvalues of A − qB
  • P: matrix of corresponding eigenvectors of A − qB
  • Λ1, Λ2: submatrices of Λ that contain the positive and negative eigenvalues, respectively
  • ν: transformed mean vector, ν = PTμ, with ith element denoted by νi
  • H: transformed B, H = PTBP, with (i, j)th element denoted by hij

Most symbols not listed here are largely restricted to individual sections.

Theory

Preliminaries

Consider the (simple) ratio of quadratic forms in normal variables:

$$ Q = \frac{ \mathbf{x}^T \mathbf{A} \mathbf{x} }{ \mathbf{x}^T \mathbf{B} \mathbf{x} } , $$

where $\mathbf{x} \sim N_n \left( \boldsymbol{\mu} , \mathbf{I}_n \right)$. The denominator matrix B is assumed to be nonnegative definite, whereas A can be any symmetric matrix.

A more general case where $\mathbf{x} \sim N_n \left( \boldsymbol{\mu} , \boldsymbol{\Sigma} \right)$ can be transformed into the above form when Σ is nonsingular: $\mathbf{x}_{\mathrm{new}} = \mathbf{K}^{-1} \mathbf{x}$, $\mathbf{A}_{\mathrm{new}} = \mathbf{K}^T \mathbf{A} \mathbf{K}$, etc., where K is an n × n matrix that satisfies $\mathbf{K} \mathbf{K}^T = \boldsymbol{\Sigma}$ (Mathai and Provost, 1992, chap. 3). When Σ is singular, certain conditions need to be met by the argument matrices, Σ, and μ for this transformation, and hence the following expressions, to be valid (Watanabe, 2023, appendix C).
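The transformation for nonsingular Σ can be checked numerically. The following is a minimal base-R sketch, not the package's internal code; the matrices and vectors are hypothetical inputs chosen for illustration, and K is taken to be the lower Cholesky factor of Σ (one of many valid choices).

```r
## Hypothetical inputs, for illustration only
A <- diag(1:3)
Sigma <- matrix(c(2.0, 0.5, 0.0,
                  0.5, 1.0, 0.3,
                  0.0, 0.3, 1.5), nrow = 3)
mu <- c(1, 0, -1)
x <- c(0.3, -1.2, 0.8)       # an arbitrary realization of x

## chol() returns the upper-triangular factor R with t(R) %*% R = Sigma,
## so K = t(R) satisfies K %*% t(K) = Sigma
K <- t(chol(Sigma))

x_new  <- solve(K, x)        # K^{-1} x
mu_new <- solve(K, mu)       # K^{-1} mu
A_new  <- t(K) %*% A %*% K   # K^T A K

## The quadratic form is unchanged by the transformation
qf_old <- drop(t(x) %*% A %*% x)
qf_new <- drop(t(x_new) %*% A_new %*% x_new)
all.equal(qf_old, qf_new)
```

The same transformation applied to B leaves the ratio Q unchanged, while the transformed vector has identity covariance.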

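Assuming B to be nonnegative definite, the distribution function of Q is

$$ F_Q (q) = \Pr \left( Q \leq q \right) = \Pr \left( \mathbf{x}^T \left( \mathbf{A} - q \mathbf{B} \right) \mathbf{x} \leq 0 \right) = \Pr \left( X_q \leq 0 \right) , $$

so that it can be expressed as the distribution function of the quadratic form $X_q = \mathbf{x}^T \left( \mathbf{A} - q \mathbf{B} \right) \mathbf{x}$ at 0. We are mostly concerned with such q that makes A − qB indefinite, because otherwise (i.e., when it is positive or negative (semi)definite) $F_Q (q)$ is either 0 or 1 and $f_Q (q) = 0$.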

Consider the spectral decomposition $\mathbf{A} - q \mathbf{B} = \mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^T$, with an orthogonal matrix of eigenvectors P and a diagonal matrix of eigenvalues $\boldsymbol{\Lambda} = \mathrm{diag} \left( \lambda_1 , \dots , \lambda_n \right)$, and let $\mathbf{P}^T \mathbf{x} = \mathbf{y} = \left( y_1 , \dots , y_n \right)^T \sim N_n \left( \boldsymbol{\nu} , \mathbf{I}_n \right)$ with $\boldsymbol{\nu} = \mathbf{P}^T \boldsymbol{\mu} = \left( \nu_1 , \dots , \nu_n \right)^T$. Then,

$$ X_q = \mathbf{x}^T \mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^T \mathbf{x} = \mathbf{y}^T \boldsymbol{\Lambda} \mathbf{y} = \sum_{i=1}^{n} \lambda_i y_i^2 . $$

Obviously, $y_i^2 \sim \chi^2_1 \left( \nu_i^2 \right)$, a noncentral chi-square variable with 1 degree of freedom and noncentrality parameter $\nu_i^2$, and by construction these are independent of one another. Thus, $X_q$ is a weighted sum of independent chi-square variables, and the problem boils down to evaluating the distribution of this quantity.
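The reduction to a weighted sum of squares can be illustrated with a small simulation. This is a base-R sketch under assumed inputs (the mean vector, the sample size, and the seed are arbitrary); it only checks the algebra and is not how the package evaluates the distribution.

```r
set.seed(1)
n <- 3
N <- 1000                # number of simulated vectors (arbitrary)
A <- diag(1:3)
B <- diag(sqrt(1:3))
mu <- c(0.5, 0, -0.5)    # hypothetical mean vector
q <- 1.5

## Spectral decomposition of A - qB
eig <- eigen(A - q * B, symmetric = TRUE)
lambda <- eig$values
P <- eig$vectors

## Rows of X are draws of x ~ N(mu, I_n)
X <- matrix(rnorm(N * n), ncol = n) + rep(mu, each = N)

## Ratio of quadratic forms, row by row
Q <- rowSums((X %*% A) * X) / rowSums((X %*% B) * X)

## X_q computed directly and as a weighted sum of squared y_i
Xq_direct <- rowSums((X %*% (A - q * B)) * X)
Y <- X %*% P                       # rows are y^T = x^T P
Xq_spectral <- drop(Y^2 %*% lambda)

all.equal(Xq_direct, Xq_spectral)  # same quantity up to rounding
all((Q <= q) == (Xq_direct <= 0))  # the two events coincide
```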

Series expression

Explicit formulae for the distribution and density functions of Q have been worked out by Hillier (2001) and Forchini (2002, 2005). These typically involve infinite series of the top-order zonal and invariant polynomials of matrix arguments. The zonal polynomials are certain homogeneous polynomials of eigenvalues of a symmetric matrix which extend powers of scalars into symmetric matrices (e.g., Muirhead, 1982). The invariant polynomials are a further extension of the zonal polynomials to multiple matrix arguments (see Chikuse, 1980, 1987; Davis, 1980). These polynomials are used to integrate out components of rotation from a function of random matrices.

Distribution function

Forchini (2002, 2005) derived explicit expressions of FQ using the top-order zonal and invariant polynomials.

Let Λ from above be arranged and partitioned such that where Λ1(q) and Λ2(q) are $n_+$- and $n_-$-dimensional diagonal matrices of positive and negative eigenvalues of A − qB, respectively. By denoting with the partition of ν corresponding to that of the rows of Λ above, the expression of Forchini (2005) is, after correcting some errors, where $C^{[k_1],[k_2]}_{[k_1 + k_2]} \left( \cdot , \cdot \right)$ are the top-order invariant polynomials of (k1, k2)-th degree (see above for other notations).

In the central case (μ = 0n), the above expression simplifies into (Forchini, 2002) where $C_{[k]} \left( \cdot \right)$ are the top-order zonal polynomials of kth degree.

These expressions can be numerically evaluated by truncating the infinite series at certain higher-order terms (i1 + j1 + i2 + j2 = m, say), and by using the recursive algorithm of Hillier et al. (2009, 2014) to calculate the coefficients d (see also the main vignette, vignette("qfratio")). The present package implements this algorithm in pqfr(..., method = "forchini") (see below).

The distribution function has points of nonanalyticity at the eigenvalues of B−1A (assuming B is nonsingular) (Forchini, 2002). Practically speaking, around these points, the series expression can be slow to converge and evaluation of the hypergeometric function can fail because the argument $\mathrm{tr} \left( {\boldsymbol{\Lambda}_1^*}^{-1} \right)$ becomes very close to 1. Otherwise, the series expression can be evaluated with high accuracy, although the computational cost of the recursive calculations can be substantial in large problems.

Density function

Apparently, the literature has an explicit expression for the density of Q only for the simple condition when B = In and μ = 0n. In this case, the distribution of Q does not depend on the norm of x, so any spherically symmetric distribution of x yields the same distribution of Q (Hillier, 2001).

Let η1 > … > ηs be s distinct eigenvalues of A, and n1, …, ns be the corresponding degrees of multiplicity ($\sum_{i=1}^{s} n_i = n$). Consider the transformed variable $V = \frac{Q - \eta_s}{\eta_1 - Q}$ and parameters $\psi_i = \frac{\eta_i - \eta_s}{\eta_1 - \eta_i}$ for i = 2, …, s − 1, assuming s > 2 (see below for the case when s = 2). The density of V has different functional forms across its domain (from Hillier, 2001, lemmas 3 and 4): where $p_r = \sum_{i=1}^{r+1} n_i$, $\mathbf{D}_{r} = \mathrm{diag} \left( \psi_{2}^{-1} \mathbf{I}_{n_{2}} , \dots , \psi_{r+1}^{-1} \mathbf{I}_{n_{r+1}} \right)$, $\mathbf{D}_{r+1} = \mathrm{diag} \left( \psi_{r+2}^{-1} \mathbf{I}_{n_{r+2}} , \dots , \psi_{s-1}^{-1} \mathbf{I}_{n_{s-1}} \right)$, $\mathbf{D} = \mathrm{diag} \left( \psi_{2}^{-1} \mathbf{I}_{n_{2}} , \dots , \psi_{s-1}^{-1} \mathbf{I}_{n_{s-1}} \right)$, and $c_r (j, k)$ are the coefficients defined as $c_r (j, k)$ can be 0 when $p_r$ or $n - p_r$ is even, so that some terms in the above series disappear. Otherwise (whenever $c_r (j, k) \neq 0$), it is possible to write to simplify calculation. The density is undefined at $v = \psi_i$ ($q = \eta_i$) for any i.

From one of the above expressions, fQ(q) can be obtained as It is seen that, when s = 2 (i.e., there are only two distinct eigenvalues), the above becomes which is the density of the (scaled) beta distribution in the interval [ηs, η1] with the parameters n1/2 and ns/2. This result is expected from the basic relationship between the chi-square and beta distributions, i.e., $\chi^2_{n_1} / \left( \chi^2_{n_1} + \chi^2_{n_s} \right) = b \sim \mathrm{Beta} \left( n_1 / 2 , n_s / 2 \right)$, a standard beta distribution in [0, 1], with $\chi^2_{n_i}$ being independent chi-square variables; $Q = \left( \eta_1 \chi^2_{n_1} + \eta_s \chi^2_{n_s} \right) / \left( \chi^2_{n_1} + \chi^2_{n_s} \right) = \eta_1 b + \eta_s \left( 1 - b \right) = \left( \eta_1 - \eta_s \right) b + \eta_s$.
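The two-eigenvalue case can be verified directly. Applying the change of variables $q = \left( \eta_1 - \eta_s \right) b + \eta_s$ to the standard beta density gives $f_Q (q) = \left( q - \eta_s \right)^{n_1 / 2 - 1} \left( \eta_1 - q \right)^{n_s / 2 - 1} / \left\{ \mathrm{B} \left( n_1 / 2 , n_s / 2 \right) \left( \eta_1 - \eta_s \right)^{n / 2 - 1} \right\}$ for $\eta_s < q < \eta_1$. The sketch below, in base R with hypothetical eigenvalues and multiplicities, compares this expression with stats::dbeta() and checks the distribution function against a small Monte Carlo sample.

```r
## Hypothetical setting: B = I_n, mu = 0, two distinct eigenvalues of A
eta1 <- 3; etas <- 1     # distinct eigenvalues
n1 <- 2; ns <- 3         # their multiplicities
n <- n1 + ns

## Density obtained from the change of variables q = (eta1 - etas) * b + etas
dQ <- function(q) {
  (q - etas)^(n1 / 2 - 1) * (eta1 - q)^(ns / 2 - 1) /
    (beta(n1 / 2, ns / 2) * (eta1 - etas)^(n / 2 - 1))
}
## The same density written with stats::dbeta()
dQ_beta <- function(q) {
  dbeta((q - etas) / (eta1 - etas), n1 / 2, ns / 2) / (eta1 - etas)
}
qs <- c(1.2, 2, 2.8)
all.equal(dQ(qs), dQ_beta(qs))

## Monte Carlo check of the distribution function at q = 2
set.seed(1)
x <- matrix(rnorm(1e5 * n), ncol = n)
eigvals <- rep(c(eta1, etas), c(n1, ns))     # diagonal of A
Q <- drop(x^2 %*% eigvals) / rowSums(x^2)
p_mc <- mean(Q <= 2)
p_exact <- pbeta((2 - etas) / (eta1 - etas), n1 / 2, ns / 2)
c(monte_carlo = p_mc, exact = p_exact)
```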

As for the above series expression of FQ, these expressions can be evaluated by taking a partial sum of the series and using the recursive algorithm for d. dqfr(..., method = "hillier") implements this algorithm (see below). This is reasonably quick and accurate in small problems, but the computational cost can be substantial in large problems.

Numerical inversion

A popular way to numerically evaluate the distribution function of Q is to use the inversion formula of the characteristic function (e.g., Stuart and Ord, 1994, chap. 4).

Distribution function

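From the famous formula of Imhof (1961) on the distribution of Xq,

$$ F_Q (q) = \Pr \left( X_q \leq 0 \right) = \frac{1}{2} - \frac{1}{\pi} \int_0^\infty \frac{ \sin \beta (u) }{ u \gamma (u) } \, du , $$

where

$$ \beta (u) = \frac{1}{2} \sum_{i=1}^{n} \left\{ \arctan \left( \lambda_i u \right) + \frac{ \nu_i^2 \lambda_i u }{ 1 + \lambda_i^2 u^2 } \right\} , \qquad \gamma (u) = \exp \left\{ \frac{1}{2} \sum_{i=1}^{n} \frac{ \nu_i^2 \lambda_i^2 u^2 }{ 1 + \lambda_i^2 u^2 } + \frac{1}{4} \sum_{i=1}^{n} \log \left( 1 + \lambda_i^2 u^2 \right) \right\} . $$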

The above integral can be evaluated by using a regular numerical evaluation algorithm for infinite intervals. Alternatively, it can be evaluated by the trapezoidal integration algorithm of Davies (1973, 1980) which explicitly controls the numerical errors involved. The package CompQuadForm implements these methods in the function imhof() and davies(), respectively (strictly speaking, these are for 1 − FQ as per Imhof’s original result). The present package utilizes those functions via pqfr(..., method = "imhof", use_cpp = FALSE) and pqfr(..., method = "davies"), respectively (see below). For the former method, the present package also has its own C++ implementation used via pqfr(..., method = "imhof", use_cpp = TRUE) (default). The numerical inversion can be evaluated fairly quickly on modern computers, and the accuracy will be sufficient for most practical purposes with slight error in numerical integration.
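As an illustration of the inversion approach, the sketch below evaluates Imhof's formula at 0 with stats::integrate() in base R. The function name imhof_cdf_at0() is hypothetical, the integrand is written from the standard form of Imhof's (1961) result for a weighted sum of noncentral chi-square variables with 1 degree of freedom each, and no claim is made that this mirrors the package's or CompQuadForm's implementation.

```r
## Pr(X_q <= 0) for X_q = sum_i lambda_i * y_i^2, y_i ~ N(nu_i, 1)
## (hypothetical helper; a sketch, not the package's code)
imhof_cdf_at0 <- function(lambda, nu = rep(0, length(lambda)),
                          rel.tol = 1e-8) {
  integrand <- function(u) {
    vapply(u, function(ui) {
      a <- lambda * ui
      beta_u  <- sum(atan(a) + nu^2 * a / (1 + a^2)) / 2
      gamma_u <- exp(sum(nu^2 * a^2 / (1 + a^2)) / 2 + sum(log1p(a^2)) / 4)
      sin(beta_u) / (ui * gamma_u)
    }, numeric(1))
  }
  I <- stats::integrate(integrand, 0, Inf, rel.tol = rel.tol,
                        subdivisions = 1000L)$value
  0.5 - I / pi
}

## Same setting as pqfr(1.5, diag(1:3)): B = I_3, mu = 0
A <- diag(1:3)
q <- 1.5
lambda <- eigen(A - q * diag(3), symmetric = TRUE)$values
p <- imhof_cdf_at0(lambda)
p
```

For this setting the result can be compared with the value of pqfr(1.5, A, method = "imhof") shown in the examples of this document.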

Density function

The density can be evaluated by numerical inversion of the characteristic function using Geary’s formula (Geary, 1944; Stuart and Ord, 1994, sec. 11.10). Broda and Paolella (2009) demonstrated that where, along with β and γ defined above, with $\mathbf{H} = \mathbf{P}^T \mathbf{B} \mathbf{P} = \left( h_{ij} \right)$ and $\mathbf{F} = \mathbf{I}_n + u^2 \boldsymbol{\Lambda}^2$.

The above expression can be evaluated with a regular numerical integration algorithm, and is implemented in dqfr(..., method = "broda") (see below).

Saddlepoint approximation

Saddlepoint approximation (Butler, 2007; Paolella, 2007, chap. 5) provides an alternative way to evaluate (or approximate) FQ and fQ.

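Let $M_{X_q} (s)$ be the moment generating function of $X_q$,

$$ M_{X_q} (s) = \prod_{i=1}^{n} \left( 1 - 2 s \lambda_i \right)^{-1/2} \exp \left( \frac{ s \lambda_i \nu_i^2 }{ 1 - 2 s \lambda_i } \right) . $$

Also, let $K_{X_q} (s) = \log M_{X_q} (s)$ be the corresponding cumulant generating function. These are convergent within the interval $1 / (2 \lambda_n) < s < 1 / (2 \lambda_1)$, where λ1 and λn are the largest and smallest of the eigenvalues (which are positive and negative, respectively; see above).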

For $X_q$, the saddlepoint $\hat{s}$ is defined as the unique root of $K_{X_q}' (s) = 0$ and is found numerically within the above interval.

Distribution function

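A first-order saddlepoint approximation formula for the distribution function FQ is (Butler and Paolella, 2007, 2008):

$$ \hat{F}_Q (q) = \begin{cases} \Phi \left( \hat{w} \right) + \phi \left( \hat{w} \right) \left( \frac{1}{\hat{w}} - \frac{1}{\hat{u}} \right) & \text{if } \mathrm{E} \left( X_q \right) \neq 0 \\ \frac{1}{2} + \frac{ K_{X_q}''' (0) }{ 6 \sqrt{2 \pi} K_{X_q}'' (0)^{3/2} } & \text{if } \mathrm{E} \left( X_q \right) = 0 \end{cases} $$

where Φ(⋅) and ϕ(⋅) are the distribution and density functions, respectively, of the standard normal distribution, and

$$ \hat{w} = \mathrm{sgn} \left( \hat{s} \right) \sqrt{ -2 K_{X_q} \left( \hat{s} \right) } , \qquad \hat{u} = \hat{s} \sqrt{ K_{X_q}'' \left( \hat{s} \right) } . $$

The condition $\mathrm{E} \left( X_q \right) = 0$ is equivalent to $\hat{s} = 0$, because of the elementary property of the cumulant generating function $\mathrm{E} \left( X_q \right) = K_{X_q}' \left( 0 \right)$. Higher-order derivatives of $K_{X_q}$ are (see also Paolella, 2007, chap. 10)

$$ K_{X_q}^{(j)} (s) = 2^{j-1} (j - 1)! \sum_{i=1}^{n} \frac{ \lambda_i^j }{ \left( 1 - 2 s \lambda_i \right)^{j} } \left( 1 + \frac{ j \nu_i^2 }{ 1 - 2 s \lambda_i } \right) . $$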

A more accurate second-order approximation is (Butler and Paolella, 2007) where $\hat{\kappa}_j = K_{X_q}^{(j)} \left( \hat{s} \right) / K_{X_q}'' \left( \hat{s} \right)^{j/2}$.

Evaluation of saddlepoint approximation is fairly quick, with the only potential complexity arising from the numerical root-finding. Empirically, the accuracy of this approximation seems to improve for large problems. This is expected since the distribution of Xq as a weighted sum approaches normality as n increases.

pqfr(..., method = "butler") implements this saddlepoint approximation. The second-order approximation is used by default (order_spa = 2) (see below).
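To make the procedure concrete, the following base-R sketch computes a first-order approximation of the Lugannani-Rice form, $\Phi ( \hat{w} ) + \phi ( \hat{w} ) ( 1 / \hat{w} - 1 / \hat{u} )$ with $\hat{w} = \mathrm{sgn} ( \hat{s} ) \sqrt{ -2 K ( \hat{s} ) }$ and $\hat{u} = \hat{s} \sqrt{ K'' ( \hat{s} ) }$, for a weighted sum of squared normal variables. The function name spa_cdf_at0() and the small offset used to stay inside the convergence interval are assumptions made for this illustration; the special case of a zero saddlepoint is not handled, and the second-order correction is omitted.

```r
## First-order saddlepoint approximation of Pr(X_q <= 0)
## (hypothetical helper; assumes the saddlepoint is nonzero)
spa_cdf_at0 <- function(lambda, nu = rep(0, length(lambda))) {
  ## Cumulant generating function of X_q and its first two derivatives
  K  <- function(s) sum(-log1p(-2 * s * lambda) / 2 +
                        s * lambda * nu^2 / (1 - 2 * s * lambda))
  K1 <- function(s) sum(lambda / (1 - 2 * s * lambda) +
                        lambda * nu^2 / (1 - 2 * s * lambda)^2)
  K2 <- function(s) sum(2 * lambda^2 / (1 - 2 * s * lambda)^2 +
                        4 * lambda^2 * nu^2 / (1 - 2 * s * lambda)^3)
  ## Search strictly inside (1 / (2 min lambda), 1 / (2 max lambda))
  lo <- 1 / (2 * min(lambda))
  hi <- 1 / (2 * max(lambda))
  eps <- 1e-8
  s_hat <- stats::uniroot(K1, c(lo * (1 - eps), hi * (1 - eps)),
                          tol = 1e-12)$root
  w <- sign(s_hat) * sqrt(-2 * K(s_hat))
  u <- s_hat * sqrt(K2(s_hat))
  pnorm(w) + dnorm(w) * (1 / w - 1 / u)
}

## Same setting as pqfr(1.2, diag(1:3), method = "butler", order_spa = 1)
lambda <- (1:3) - 1.2
p_spa <- spa_cdf_at0(lambda)
p_spa
```

The result can be compared with the first-order value reported in the Options examples of this document.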

Density function

A first-order saddlepoint approximation for the density function fQ is (Butler and Paolella, 2007, 2008) where $\hat{s}$ is the same saddlepoint root used above, and using notations defined above and $\boldsymbol{\Xi} = \mathbf{I}_n - 2 s \boldsymbol{\Lambda}$.

A second-order approximation is (Butler and Paolella, 2007) where with (Ξ−1 and Λ commute since these are diagonal matrices).

dqfr(..., method = "butler") implements this saddlepoint approximation in a very similar way to pqfr(..., method = "butler") (see below).

Implementation details

Exported functions

The above expressions for the distribution and density functions are implemented in pqfr() and dqfr(), which are defined as:

pqfr <- function(quantile, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
                 lower.tail = TRUE, log.p = FALSE,
                 method = c("imhof", "davies", "forchini", "butler"),
                 trim_values = TRUE, return_abserr_attr = FALSE, m = 100L,
                 tol_zero = .Machine$double.eps * 100,
                 tol_sing = tol_zero, ...) { ... }
dqfr <- function(quantile, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
                 log = FALSE, method = c("broda", "hillier", "butler"),
                 trim_values = TRUE, normalize_spa = FALSE,
                 return_abserr_attr = FALSE, m = 100L,
                 tol_zero = .Machine$double.eps * 100,
                 tol_sing = tol_zero, ...) { ... }

The basic usage is similar to that of regular probability distribution functions like stats::pnorm(), but with many optional arguments to specify evaluation methods and behaviors at edge cases. These functions are (pseudo-)vectorized with respect to quantile (a vector of q), using sapply(). Log-transformed p-values or densities can be obtained by setting log.p = TRUE or log = TRUE, but these are just ad-hoc transformations of the results, so they are not expected to provide as much numerical accuracy as in regular probability distribution functions.

There is also qqfr() for the corresponding quantile function, which is defined as:

qqfr <- function(probability, A, B, p = 1, mu = rep.int(0, n), Sigma = diag(n),
                 lower.tail = TRUE, log.p = FALSE, trim_values = FALSE,
                 return_abserr_attr = FALSE, stop_on_error = FALSE, m = 100L,
                 tol_zero = .Machine$double.eps * 100,
                 tol_sing = tol_zero, epsabs_q = .Machine$double.eps ^ (1/2),
                 maxiter_q = 5000, ...) { ... }

This function is not based on any explicit inverse function, but does numerical root-finding using stats::uniroot(), internally calling pqfr(); i.e., by searching the root q of pqfr(q, ...) - probability = 0.
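The root-finding idea can be sketched without the package by using a case where the distribution function is known in closed form. The example below assumes the two-eigenvalue central case with B = In described in the theory section (a scaled beta distribution); the eigenvalues, multiplicities, and function names are hypothetical and chosen for illustration only.

```r
## Hypothetical setting: two distinct eigenvalues of A, B = I_n, mu = 0
eta1 <- 3; etas <- 1     # distinct eigenvalues
n1 <- 2; ns <- 3         # their multiplicities

## Closed-form distribution function (scaled beta) standing in for pqfr()
pQ <- function(q) pbeta((q - etas) / (eta1 - etas), n1 / 2, ns / 2)

## Quantile by searching the root q of pQ(q) - probability = 0
qQ_root <- function(probability, tol = 1e-10) {
  stats::uniroot(function(q) pQ(q) - probability,
                 lower = etas, upper = eta1, tol = tol)$root
}

q_num   <- qQ_root(0.95)
q_exact <- etas + (eta1 - etas) * qbeta(0.95, n1 / 2, ns / 2)
c(root_finding = q_num, closed_form = q_exact)
```

In general no closed form is available, so the accuracy of the quantile depends on both the root-finding tolerance and the accuracy of the distribution function being inverted.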

Internally, these functions first check basic argument structures, and, if Sigma is specified, transform the arguments A, B, and mu and call themselves recursively with new arguments.

Choosing a method

The evaluation method is specified by the argument method in these functions (by default, both functions choose a numerical inversion method). According to the choice, the actual calculations are done in one of the internal functions described below. Direct use of the internal functions is not recommended. These internal functions only accept a length-one quantile. To reduce computational time, they do not check argument structures or accommodate Sigma.

## Choice from alternative methods
A <- diag(1:3)
pqfr(1.5, A, method = "imhof")    # default
#> [1] 0.1978686
pqfr(1.5, A, method = "davies")   # similar
#> [1] 0.1979048
pqfr(1.5, A, method = "forchini") # series
#> [1] 0.1978686
pqfr(1.5, A, method = "butler")   # spa
#> [1] 0.1897189

dqfr(1.5, A, method = "broda")    # default
#> [1] 0.4506431
dqfr(1.5, A, method = "hillier")  # series
#> [1] 0.4506431
dqfr(1.5, A, method = "butler")   # spa
#> [1] 0.4523631

## Not recommended; for diagnostic use only
qfratio:::pqfr_imhof(1.5, A)
#> $p
#> [1] 0.1978686
#> 
#> $abserr
#> [1] 2.031334e-09
qfratio:::pqfr_A1B1(1.5, A, m = 9, check_convergence = FALSE)
#> $p
#> [1] 0.1978641
#> 
#> $terms
#>  [1] 1.495393e-01 3.427348e-02 9.516357e-03 2.974620e-03 1.002253e-03
#>  [6] 3.544878e-04 1.295434e-04 4.844575e-05 1.842967e-05 7.103869e-06


## This is okay
x <- c(1.5, 2.5, 3.5)
pqfr(x, A)
#> [1] 0.1978686 0.8021314 1.0000000

## This is not
qfratio:::pqfr_imhof(x, A)
#> Error in qfratio:::pqfr_imhof(x, A): In pqfr_imhof, quantile must be length-one

Use with ks.test()

In principle, pqfr() is compatible with stats::ks.test(), but care must be exercised because the evaluation result may involve non-trivial error. It is recommended to inspect error bounds beforehand, using pqfr(..., method = "imhof", return_abserr_attr = TRUE). In addition, the argument B is pre-occupied by the same-named argument in ks.test(), so it cannot be passed via ...; this means that a typical syntax with non-default B should be something like ks.test(x, function(q) pqfr(q, A, B, ...)) rather than ks.test(x, pqfr, A = A, B = B, ...). For example,

## Small Monte Carlo sample
A <- diag(1:3)
B <- diag(sqrt(1:3))
x <- rqfr(10, A, B)

## Calculate p-values
pseq <- pqfr(x, A, B, return_abserr_attr = TRUE)

## Maximum error when evaluated at x;
## looks small enough
max(attr(pseq, "abserr"))
#> [1] 8.318564e-07

## Correct syntax, expected outcome
## \(q) syntax could also be used in recent versions of R
ks.test(x, function(q) pqfr(q, A, B))
#> 
#>  Exact one-sample Kolmogorov-Smirnov test
#> 
#> data:  x
#> D = 0.29324, p-value = 0.2946
#> alternative hypothesis: two-sided

rather than

## Incorrect; no error/warning because
## B is passed to ks.test rather than to pqfr
ks.test(x, pqfr, A = A, B = B)
#> 
#>  Exact one-sample Kolmogorov-Smirnov test
#> 
#> data:  x
#> D = 0.78335, p-value = 5.182e-07
#> alternative hypothesis: two-sided

Series expressions

## Used in pqfr(..., method = "forchini")
pqfr_A1B1 <- function(quantile, A, B, m = 100L, mu = rep.int(0, n),
                      check_convergence = c("relative", "strict_relative",
                                            "absolute", "none"),
                      stop_on_error = FALSE, use_cpp = TRUE,
                      cpp_method = c("double", "long_double", "coef_wise"),
                      nthreads = 1,
                      tol_conv = .Machine$double.eps ^ (1/4),
                      tol_zero = .Machine$double.eps * 100,
                      tol_sing = tol_zero,
                      thr_margin = 100) { ... }
## Used in dqfr(..., method = "hillier")
dqfr_A1I1 <- function(quantile, LA, m = 100L,
                      check_convergence = c("relative", "strict_relative",
                                            "absolute", "none"),
                      use_cpp = TRUE,
                      tol_conv = .Machine$double.eps ^ (1/4),
                      thr_margin = 100) { ... }

These functions evaluate the above series expressions as partial sums of the infinite series, using the recursive algorithm to calculate d (d1_i() or d2_ij_m()), as in the moment-related functions of this package (see the vignette for moments, vignette("qfratio")). Most of the arguments are shared with those functions.

A <- diag(1:3)
pqfr(1.5, A, method = "forchini")
#> [1] 0.1978686
dqfr(1.5, A, method = "hillier")
#> [1] 0.4506431

B <- diag(sqrt(1:3))
pqfr(1.5, A, B, method = "forchini")
#> [1] 0.6376791
## dqfr method does not accommodate B, mu, or Sigma
dqfr(1.5, A, B, method = "hillier")
#> Error in dqfr(1.5, A, B, method = "hillier"): dqfr() does not accommodate B, mu, or Sigma with method = "hillier"

As stated above, the density is undefined, and the distribution function has points of nonanalyticity, at the eigenvalues of B−1A (assuming nonsingular B; Hillier 2001; Forchini 2002). Around these points, convergence of the series expressions tends to be very slow, and the evaluation of hypergeometric function involved in the distribution function may fail. In this case, avoid using the series expression methods.

A <- diag(1:3)

## p-value just below 2, an eigenvalue of A
## Typically throws two warnings:
##   Maximum iteration in hypergeometric function
##   and non-convergence of series
pqfr(1.9999, A, method = "forchini")
#> Warning in p_A1B1_Ed(quantile, A, B, mu, m, stop_on_error, thr_margin, nthreads, : problem in gsl_sf_hyperg_2F1_e():
#>   max iteration reached
#> Warning in pqfr_A1B1(q, A, B, m = m, mu = mu, tol_zero = tol_zero, ...): Last term is >1.2e-04 times as large as the series,
#>   suggesting non-convergence. Consider using larger m
#> [1] 0.1052446

## More realistic value; expected from symmetry
pqfr(1.9999, A, method = "imhof")
#> [1] 0.4998044

Numerical inversion

## Used in pqfr(..., method = "imhof") (default)
pqfr_imhof <- function(quantile, A, B, mu = rep.int(0, n),
                       autoscale_args = 1, stop_on_error = TRUE, use_cpp = TRUE,
                       tol_zero = .Machine$double.eps * 100,
                       epsabs = epsrel, epsrel = 1e-6, limit = 1e4) { ... }
## Used in pqfr(..., method = "davies")
pqfr_davies <- function(quantile, A, B, mu = rep.int(0, n),
                        autoscale_args = 1,
                        tol_zero = .Machine$double.eps * 100, ...) { ... }
## Used in dqfr(..., method = "broda") (default)
dqfr_broda <- function(quantile, A, B, mu = rep.int(0, n),
                       autoscale_args = 1, stop_on_error = TRUE,
                       use_cpp = TRUE, tol_zero = .Machine$double.eps * 100,
                       epsabs = epsrel, epsrel = 1e-6, limit = 1e4) { ... }

pqfr_imhof(..., use_cpp = TRUE) and dqfr_broda(..., use_cpp = TRUE) conduct numerical integration by the C function gsl_integration_qagi(..., epsabs, epsrel, limit) from GSL. The arguments epsabs, epsrel, and limit determine the permissible bounds of absolute and relative errors, and the maximum number of integration intervals, respectively. dqfr_broda(..., use_cpp = FALSE) uses the R function stats::integrate(..., rel.tol = epsrel, abs.tol = epsabs, stop.on.error = stop_on_error), instead, and limit is ignored. pqfr_imhof(..., use_cpp = FALSE) and pqfr_davies() calculate appropriate parameters from the arguments and pass them to imhof() and davies() from the CompQuadForm package.

Specifying integration error

The above integration functions try to find an absolute error bound eI that is bounded by the user-specified tolerance for absolute ϵabs and relative ϵrel errors: eI ≤ ϵabs + |I|ϵrel, where I is the result of integration.

Internally, ϵabs is calculated from the user-specified arguments epsabs and epsrel to appropriately constrain the density or distribution function (whereas ϵrel is always specified by epsrel). In dqfr_broda(), pi * epsabs is used as ϵabs, and the resultant error bound abserr is subsequently divided by pi, as is the integration result itself to yield the density: fQ = I/π (see above).

The situation is more complicated for pqfr_imhof(), because the relative error in I cannot in general be directly transformed to that of the distribution function, which is FQ = 1/2 − I/π (see above). In this function, pi * (epsabs + epsrel / 2) is passed as ϵabs, and the resultant error bound abserr is divided by pi. This procedure ensures that an equivalent of the above inequality holds for FQ, provided I ≤ 0 (FQ ≥ 1/2) or ϵrel = 0. Otherwise, an error bound calculated in the same way can only be conservative; pqfr_imhof() returns this value, but it can violate the user-specified relative tolerance epsrel.
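The reasoning behind this choice can be spelled out as a short derivation. It is a sketch under the assumption that the absolute tolerance handed to the integrator is $\pi \left( \epsilon_a + \epsilon_r / 2 \right)$ and the relative tolerance is $\epsilon_r$, where $\epsilon_a$ and $\epsilon_r$ stand for the user-specified epsabs and epsrel, and $e_F = e_I / \pi$ is the error bound on $F_Q$.

```latex
\begin{align*}
  e_I &\leq \pi \left( \epsilon_a + \frac{\epsilon_r}{2} \right)
          + \left| I \right| \epsilon_r
  \quad \Longrightarrow \quad
  e_F = \frac{e_I}{\pi}
      \leq \epsilon_a + \frac{\epsilon_r}{2}
          + \frac{\left| I \right|}{\pi} \epsilon_r , \\
  I \leq 0 : \quad
  \frac{\left| I \right|}{\pi} &= F_Q - \frac{1}{2}
  \quad \Longrightarrow \quad
  e_F \leq \epsilon_a + F_Q \, \epsilon_r , \\
  I > 0 : \quad
  \frac{\left| I \right|}{\pi} &= \frac{1}{2} - F_Q
  \quad \Longrightarrow \quad
  e_F \leq \epsilon_a + \left( 1 - F_Q \right) \epsilon_r .
\end{align*}
```

In the last case $1 - F_Q > F_Q$, so the bound remains valid but is looser than $\epsilon_a + F_Q \epsilon_r$, which is why the relative tolerance can be exceeded when $F_Q < 1/2$.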

A <- diag(1:4)

## This error bound satisfies "abserr < value * epsrel"
pqfr(3.9, A, method = "imhof", return_abserr_attr = TRUE,
     epsabs = 0, epsrel = 1e-6)
#> [1] 0.9944167
#> attr(,"abserr")
#> [1] 3.325725e-08

## This one violates "abserr < value * epsrel",
## although abserr is a valid error bound
pqfr(1.2, A, method = "imhof", return_abserr_attr = TRUE,
     epsabs = 0, epsrel = 1e-6)
#> [1] 0.01611023
#> attr(,"abserr")
#> [1] 4.165996e-07

autoscale_args

Numerical integration involved in these functions typically fails when the magnitude of the eigenvalues is too small or too large, in which case the integrand can decrease too slowly (i.e., it looks divergent) or too quickly (i.e., it looks like a constant 0) with respect to the integration variable (u above). To avoid such failures, these functions internally scale the eigenvalues by default, so that max λi − min λi is equal to the argument autoscale_args (default 1); remember that min λi is negative, so this quantity is the sum of the absolute values.

A <- diag(1:3)
B <- diag(sqrt(1:3))

## Without autoscale_args
## We know these are equal
pqfr(1.5, A, B, autoscale_args = FALSE)
#> [1] 0.6376791
pqfr(1.5, A * 1e-10, B * 1e-10, autoscale_args = FALSE)
#> [1] 0.5
## The latter failed because of numerically small eigenvalues

## With autoscale_args = 1 (default)
pqfr(1.5, A * 1e-10, B * 1e-10)
#> [1] 0.6376791
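The scaling itself is simple to sketch. The helper autoscale() below is hypothetical (it is not the package's internal code) and only illustrates why the rescaling is harmless: multiplying all eigenvalues by a positive constant does not change the sign of the weighted sum, and hence does not change Pr(Xq ≤ 0).

```r
## Hypothetical helper: rescale eigenvalues so that max - min equals target
autoscale <- function(lambda, target = 1) {
  lambda * target / (max(lambda) - min(lambda))
}

lambda <- c(-0.5, 0.5, 1.5) * 1e-10   # numerically tiny eigenvalues
lambda_scaled <- autoscale(lambda)
diff(range(lambda_scaled))            # equals the target

## The event X_q <= 0 is unchanged by the scaling
set.seed(1)
y2 <- matrix(rnorm(3000)^2, ncol = 3) # squared standard normal variables
same_event <- all((drop(y2 %*% lambda) <= 0) ==
                  (drop(y2 %*% lambda_scaled) <= 0))
same_event
```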

trim_values

Numerical integration can yield spurious results that are outside the mathematically permissible supports: [0, ∞) and [0, 1] for the density and distribution functions, respectively. By default (trim_values = TRUE), the external functions dqfr() and pqfr() trim those values into the permissible range by using tol_zero as a margin; e.g., negative p-values are replaced by ~2.2e-14 (default tol_zero). A warning is thrown if this happens, because it usually means that numerically accurate evaluation was impossible, at least with the given parameters. Set trim_values = FALSE to skip this trimming and the warning, although pqfr_imhof() and pqfr_davies() can still throw a warning from CompQuadForm functions. Note that, on the other hand, all these functions try to return exact 0 or 1 when q is outside the possible range of Q (as numerically determined).

## Result without trimming;
## (typically) negative density, which is absurd
## In this case, error interval typically spans across 0
dqfr(1.2, diag(1:30), return_abserr_attr = TRUE,
     trim_values = FALSE)
#> [1] -4.859181e-17
#> attr(,"abserr")
#> [1] 8.043729e-08

## Result with trimming (default)
dqfr(1.2, diag(1:30), return_abserr_attr = TRUE)
#> Warning in dqfr(1.2, diag(1:30), return_abserr_attr = TRUE): values < 0 trimmed
#> up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 8.043726e-08
## Note that the actual value is only bounded by
## 0 and abserr

Saddlepoint approximation

## Used in pqfr(..., method = "butler")
pqfr_butler <- function(quantile, A, B, mu = rep.int(0, n),
                        order_spa = 2, stop_on_error = FALSE, use_cpp = TRUE,
                        tol_zero = .Machine$double.eps * 100,
                        epsabs = .Machine$double.eps ^ (1/2), epsrel = 0,
                        maxiter = 5000) { ... }
## Used in dqfr(..., method = "butler")
dqfr_butler <- function(quantile, A, B, mu = rep.int(0, n),
                        order_spa = 2, stop_on_error = FALSE, use_cpp = TRUE,
                        tol_zero = .Machine$double.eps * 100,
                        epsabs = .Machine$double.eps ^ (1/2), epsrel = 0,
                        maxiter = 5000) { ... }

These functions evaluate the saddlepoint approximations described above. They conduct numerical root-finding for the saddlepoint by the Brent method (C function gsl_root_fsolver_brent from GSL), with the stopping rule specified by gsl_root_test_delta(..., epsabs, epsrel) and the maximum number of iterations by maxiter. When use_cpp = FALSE, the R function stats::uniroot(..., check.conv = stop_on_error, tol = epsabs, maxiter = maxiter) is used instead, and epsrel is ignored. The Newton–Raphson method was also explored in the development stage, but that method sometimes failed because the derivative can be numerically close to 0.

Options

The saddlepoint approximation density does not integrate to unity, but can be normalized by setting normalize_spa = TRUE in dqfr() (note that this is done in the external function). The normalized density can be more accurate (although it is usually a matter of empiricism). However, this is usually slower than the numerical inversion method for a small number of quantiles.

The second-order approximation is used by default (order_spa = 2) (internally, any value > 1 calls this option). The first-order approximation can be used by setting order_spa = 1, but this is usually less accurate and only slightly faster than the second-order approximation.

A <- diag(1:3)

## Default for spa distribution function
pqfr(1.2, A, method = "butler", order_spa = 2)
#> [1] 0.07183068

## First-order spa
pqfr(1.2, A, method = "butler", order_spa = 1)
#> [1] 0.0790331

## More accurate numerical inversion
pqfr(1.2, A)
#> [1] 0.07359703


## Default for density
dqfr(1.2, A, method = "butler",
     order_spa = 2, normalize_spa = FALSE)
#> [1] 0.3716931

## First-order
dqfr(1.2, A, method = "butler",
     order_spa = 1, normalize_spa = FALSE)
#> [1] 0.4577787

## Normalized density, second-order
dqfr(1.2, A, method = "butler",
     order_spa = 2, normalize_spa = TRUE)
#> [1] 0.3913688

## Normalized density, first-order
dqfr(1.2, A, method = "butler",
     order_spa = 1, normalize_spa = TRUE)
#> [1] 0.4412349

## More accurate numerical inversion
dqfr(1.2, A)
#> [1] 0.3837318

Paolella (2007, program listing 10.4) noted that the second-order approximation for the distribution function can be “problematic”, which presumably means that the evaluation result can be unstable. In development of this package, some instability in the second-order approximation was encountered, but experiments suggest that this was due to sensitivity of the result to the numerically found root $\hat{s}$. This instability is rarely encountered with the present default setting, but the user may want to adjust root-finding-related parameters when any doubt exists.

Error bound

pqfr() and dqfr()

Return values from pqfr_imhof() and dqfr_broda() have an error bound abserr for numerical integration, along with the evaluation result itself. Technically, the error bound from the integration algorithm is divided by pi before being returned, as is the evaluation result itself. This can be passed to the external functions and returned as an attribute by setting return_abserr_attr = TRUE (as already used in the above examples):

A <- diag(1:4)

pqfr(1.5, A, return_abserr_attr = TRUE)
#> [1] 0.06819534
#> attr(,"abserr")
#> [1] 9.576418e-08

dqfr(1.5, A, return_abserr_attr = TRUE)
#> [1] 0.22202
#> attr(,"abserr")
#> [1] 5.738788e-08

This error bound tries to accommodate the effect of trim_values. If the integration result is outside the permissible support (e.g., negative density), the possible error bound is only on the direction toward the support (assuming things are calculated accurately). The returned abserr is truncated accordingly, unless trimming is beyond the original abserr (in which case it is replaced by tol_zero). See this in examples:

## Without trimming, result is (typically) negative
## But note that value + abserr is positive
dqfr(1.2, diag(1:35), return_abserr_attr = TRUE,
     epsabs = 1e-10, trim_values = FALSE)
#> [1] -6.626156e-18
#> attr(,"abserr")
#> [1] 2.378925e-12

## With trimming, value is replaced by tol_zero
## Note slightly shortened abserr
dqfr(1.2, diag(1:35), return_abserr_attr = TRUE,
     epsabs = 1e-10)
#> Warning in dqfr(1.2, diag(1:35), return_abserr_attr = TRUE, epsabs = 1e-10):
#> values < 0 trimmed up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 2.356714e-12


## When untrimmed value + abserr < tol_zero
dqfr(1.1, diag(1:35), return_abserr_attr = TRUE,
     epsabs = 1e-15, trim_values = FALSE)
#> [1] -3.313078e-18
#> attr(,"abserr")
#> [1] 7.760891e-16
## True value is somewhere between 0 and value + abserr
## (assuming these are reliable)

## When trimmed, abserr reflects tol_zero
## because the true value is between 0 and tol_zero
dqfr(1.1, diag(1:35), return_abserr_attr = TRUE,
     epsabs = 1e-15)
#> Warning in dqfr(1.1, diag(1:35), return_abserr_attr = TRUE, epsabs = 1e-15):
#> values < 0 trimmed up to tol_zero
#> [1] 2.220446e-14
#> attr(,"abserr")
#> [1] 2.220446e-14
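
The truncation of abserr seen in these outputs can be mimicked with a few lines of base R. This is only a sketch inferred from the printed numbers, not the package's internal code: the helper trim_density() is hypothetical, and it assumes that only negative values are trimmed and that tol_zero takes the value .Machine$double.eps * 100.

```r
## Hypothetical helper inferred from the outputs above; not qfratio's code
trim_density <- function(value, abserr,
                         tol_zero = .Machine$double.eps * 100) {
    ## Nonnegative values are left untouched (assumption)
    if (value >= 0) {
        return(c(value = value, abserr = abserr))
    }
    ## The true value is assumed to lie between 0 and value + abserr
    upper <- value + abserr
    ## Shorten abserr by the amount of trimming, or fall back on tol_zero
    ## when the trimmed value already exceeds the original upper bound
    new_abserr <- if (upper > tol_zero) upper - tol_zero else tol_zero
    c(value = tol_zero, abserr = new_abserr)
}

## Inputs taken from the untrimmed results above
trim_density(-6.626156e-18, 2.378925e-12)
trim_density(-3.313078e-18, 7.760891e-16)
```

With these inputs, the first call should give an abserr of about 2.3567e-12 and the second one tol_zero itself, in line with the trimmed results shown above.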

When log/log.p = TRUE, abserr is transformed so that it is a conservative absolute error bound on the log scale. That is, denote the original value and its error bound by $\hat{x}$ and $\delta \hat{x}$, respectively, and the log-transformed value and its error bound by $\log \hat{x}$ and $\delta (\log \hat{x})$. The latter error bound is set so that $\log \hat{x} - \delta (\log \hat{x}) = \log \left( \hat{x} - \delta \hat{x} \right)$, i.e., $\delta (\log \hat{x}) = - \log \left( 1 - \frac{\delta \hat{x}}{\hat{x}} \right)$. Note that the upper error bound $\log \left( 1 + \frac{\delta \hat{x}}{\hat{x}} \right)$ is narrower than this, so the lower one is conservative in both directions. The exception is when $\delta \hat{x} > \hat{x}$ (i.e., $\hat{x} - \delta \hat{x} < 0$), in which case the bound should be taken as $\delta (\log \hat{x}) = \infty$. In summary, the new error bound is calculated as ifelse(abserr > ans, Inf, -log1p(-abserr/ans)).
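
As a small check of this transformation, the following base R sketch wraps the expression quoted above in a hypothetical helper, log_abserr() (not a function of the package), and applies it to the density value and error bound printed earlier in this section.

```r
## Hypothetical helper (not a qfratio function): convert an absolute error
## bound on the original scale into a conservative one on the log scale
log_abserr <- function(ans, abserr) {
    ifelse(abserr > ans, Inf, -log1p(-abserr / ans))
}

ans <- 0.22202          # value from the dqfr() example above
abserr <- 5.738788e-08  # its error bound

lower <- log_abserr(ans, abserr)  # distance from log(ans) to log(ans - abserr)
upper <- log1p(abserr / ans)      # distance from log(ans) to log(ans + abserr)

## The lower distance reproduces log(ans - abserr)
all.equal(log(ans) - lower, log(ans - abserr))
## The upper distance is narrower, so using the lower one is conservative
upper < lower

## When the error bound exceeds the value, the bound is infinite
log_abserr(1e-10, 1e-8)
```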

qqfr()

The option return_abserr_attr = TRUE is available in qqfr() as well:

A <- diag(1:4)

qqfr(0.95, A, return_abserr_attr = TRUE)
#> [1] 3.587557
#> attr(,"abserr")
#> [1] 6.648746e-06

In qqfr(), numerical errors arise from the root-finding with stats::uniroot() as well as from propagation of the error in pqfr(). When return_abserr_attr = TRUE, it tries to evaluate a conservative error bound as follows:

  1. Store the estimated error in root-finding uniroot()$estim.prec as δqrf
  2. Store that in pqfr() at the root as δF
  3. If δF ≠ 0, calculate the density f and its error bound δf at the root using dqfr(..., method = "broda"), so that the conservative slope of the tangent there is b = max (f − δf, 0). The error δqp in the root arising from pqfr() is at most δF / b. If δF = 0, δqp = 0 regardless of b.
  4. The total error in the quantile is δqrf + δqp

If log.p = TRUE, the root-finding is done on log F, so the slope used in step 3 is replaced by b = max (f − δf, 0)/F.
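
The four steps above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the code of qqfr(): pnorm() and dnorm() stand in for pqfr() and dqfr(), the error bounds delta_F and delta_f are made-up constants, and the function name quantile_with_abserr() is hypothetical. Only stats::uniroot() and its estim.prec component are taken from the description above.

```r
## Hypothetical stand-ins: pnorm()/dnorm() play the roles of pqfr()/dqfr(),
## and delta_F, delta_f are arbitrary error bounds chosen for illustration
quantile_with_abserr <- function(prob, cdf = pnorm, pdf = dnorm,
                                 delta_F = 1e-8, delta_f = 1e-8,
                                 interval = c(-10, 10)) {
    ## Step 1: root-finding and its estimated precision
    root <- stats::uniroot(function(q) cdf(q) - prob, interval,
                           tol = .Machine$double.eps^0.5)
    delta_q_rf <- root$estim.prec
    ## Steps 2-3: propagate the error in the distribution function
    ## through the conservative (shallowest plausible) slope
    if (delta_F == 0) {
        delta_q_p <- 0
    } else {
        b <- max(pdf(root$root) - delta_f, 0)
        delta_q_p <- delta_F / b  # Inf when b == 0
    }
    ## Step 4: total error bound, attached as an attribute
    structure(root$root, abserr = delta_q_rf + delta_q_p)
}

quantile_with_abserr(0.95)
```

Dividing by the smaller slope f − δf instead of f makes the propagated error larger, which is what makes the bound conservative.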

For probability = 0 or 1, the quantile corresponds to the minimum or maximum of the ratio, and the above error bound does not apply. At present, an arbitrary value of .Machine$double.eps * 100 (~2.2e-14) is returned as an error bound for a finite minimum/maximum, although the actual error in calculation can be larger.

qqfr(0, A, return_abserr_attr = TRUE)
#> [1] 1
#> attr(,"abserr")
#> [1] 2.220446e-14

Distribution of powers

For completeness, pqfr() and dqfr() can be used to evaluate the distribution of powers of ratios of quadratic forms, $Q^p$, with the exponent specified by the argument p (default 1). Note that, unlike in the moment-related functions of this package, the numerator and denominator must have the same exponent. When p != 1, these functions return appropriate results, typically by transforming those from p == 1 through recursive calls.

For the rest of this section, consider the distribution and density functions of $R = Q^p$ at $r = q^p$. The Jacobian for the density is $\left| \frac{\mathrm{d} q}{\mathrm{d} r} \right| = \frac{1}{p} \left| r \right| ^ {\frac{1}{p} - 1}$.

When A is nonnegative definite or p is an odd integer

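In this case, the relationship between Q and R is one-to-one, so that

$$F_R(r) = F_Q \left( r^{\frac{1}{p}} \right) , \quad f_R(r) = f_Q \left( r^{\frac{1}{p}} \right) \frac{1}{p} \left| r \right| ^ {\frac{1}{p} - 1} ,$$

where $r^{\frac{1}{p}}$ denotes the real $p$th root when $r < 0$. Thus, the result can be obtained by a single recursive call of pqfr(..., p = 1) or dqfr(..., p = 1) with the transformed quantile.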
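
The one-to-one transformation can be checked in base R with a stand-in distribution. In this sketch, Q is assumed to follow the unit exponential distribution (a nonnegative variable chosen only for illustration, not a ratio of quadratic forms), so that pexp() and dexp() take the places of pqfr(..., p = 1) and dqfr(..., p = 1). The square of such a variable follows a Weibull distribution with shape 1/2, which gives an independent reference.

```r
## Stand-in for a nonnegative Q: unit exponential distribution
## (assumption for illustration; pexp()/dexp() replace pqfr()/dqfr())
p <- 2
r <- c(0.5, 1, 2, 4)

q <- r^(1 / p)                                 # transformed quantile
F_R <- pexp(q)                                 # distribution function of R
f_R <- dexp(q) * (1 / p) * abs(r)^(1 / p - 1)  # density times Jacobian

## Q^2 with Q ~ Exp(1) is Weibull with shape 1/2 and unit scale
all.equal(F_R, pweibull(r, shape = 1 / p))
all.equal(f_R, dweibull(r, shape = 1 / p))
```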

When A is indefinite and p is an even integer

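In this case, R is an even function of Q, so that, for r > 0,

$$F_R(r) = F_Q \left( r^{\frac{1}{p}} \right) - F_Q \left( - r^{\frac{1}{p}} \right) , \quad f_R(r) = \left[ f_Q \left( r^{\frac{1}{p}} \right) + f_Q \left( - r^{\frac{1}{p}} \right) \right] \frac{1}{p} r ^ {\frac{1}{p} - 1} .$$

Thus, for r > 0, the result is obtained from two recursive calls of pqfr(..., p = 1) or dqfr(..., p = 1) with the transformed quantiles.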
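
The two-call construction can likewise be checked in base R with a stand-in distribution. Here Q is assumed to be standard normal (a variable taking both signs, chosen only for illustration), so that pnorm() and dnorm() take the places of pqfr(..., p = 1) and dqfr(..., p = 1). Its square follows the chi-square distribution with one degree of freedom, which serves as a reference.

```r
## Stand-in for a Q taking both signs: standard normal distribution
## (assumption for illustration; pnorm()/dnorm() replace pqfr()/dqfr())
p <- 2
r <- c(0.1, 0.5, 1, 2.5)

q <- r^(1 / p)                # positive root; the other one is -q
F_R <- pnorm(q) - pnorm(-q)   # two calls of the distribution function
f_R <- (dnorm(q) + dnorm(-q)) * (1 / p) * r^(1 / p - 1)

## Q^2 with Q ~ N(0, 1) is chi-square with 1 degree of freedom
all.equal(F_R, pchisq(r, df = 1))
all.equal(f_R, dchisq(r, df = 1))
```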

When A is indefinite and p is non-integer

In this case, R can be undefined, so pqfr() and dqfr() return an error, "A must be nonnegative definite when p is non-integer", regardless of the value of quantile.

Graphical examples

First we compare evaluation methods for the distribution function:

A <- diag(1:4)
qseq <- seq.int(0.8, 4.2, length.out = 100)

## Generate p-value sequences
## Warning is expected
pseq_inv <- pqfr(qseq, A, method = "imhof",
                 return_abserr_attr = TRUE)
pseq_ser <- pqfr(qseq, A, method = "forchini",
                 check_convergence = FALSE)
#> Warning in p_A1B1_Ed(quantile, A, B, mu, m, stop_on_error, thr_margin, nthreads, : problem in gsl_sf_hyperg_2F1_e():
#>   evaluation failed due to singularity
#>   max iteration reached
pseq_spa <- pqfr(qseq, A, method = "butler")

## Maximum error in numerical inversion;
## looks small enough
max(attr(pseq_inv, "abserr"))
#> [1] 1.269026e-06

## Graphical comparison
par(mar = c(4, 4, 0.1, 0.1))
plot(qseq, type = "n", xlim = c(1, 4), ylim = c(0, 1),
     xlab = "q", ylab = "F(q)")
lines(qseq, pseq_inv, col = "gray", lty = 1)
lines(qseq, pseq_ser, col = "tomato", lty = 2)
lines(qseq, pseq_spa, col = "slateblue", lty = 3)
legend("topleft", legend = c("inversion", "series", "saddlepoint"),
       col = c("gray", "tomato", "slateblue"), lty = 1:3, cex = 0.8)


## Logical vector to exclude q around eigenvalues of A
avoid_evals <- ((qseq %% 1) > 0.05) & ((qseq %% 1) < 0.95)

## Numerical comparison
all.equal(pseq_inv[avoid_evals], pseq_ser[avoid_evals],
          check.attributes = FALSE)
#> [1] "Mean relative difference: 0.0007499845"
all.equal(pseq_inv[avoid_evals], pseq_spa[avoid_evals],
          check.attributes = FALSE)
#> [1] "Mean relative difference: 0.009893572"

Around the eigenvalues of A, the series expression is slow to converge; this could partly be mitigated by using a larger m (default 100), but that will usually be time-consuming, and the evaluation of the hypergeometric function may fail regardless (for which a warning is already thrown above). Apart from these points, the series and numerical inversion methods yield very similar values. The saddlepoint approximation yields slightly less accurate results, but is usually the fastest among these methods.

Next, we compare methods for the probability density:

## Generate density sequences
dseq_inv <- dqfr(qseq, A, method = "broda",
                 return_abserr_attr = TRUE)
dseq_ser <- dqfr(qseq, A, method = "hillier",
                 check_convergence = FALSE)
dseq_spa <- dqfr(qseq, A, method = "butler")

## Maximum error in numerical inversion;
## looks small enough
max(attr(dseq_inv, "abserr"))
#> [1] 8.574178e-07

## Graphical comparison
par(mar = c(4, 4, 0.1, 0.1))
plot(qseq, type = "n", xlim = c(1, 4), ylim = c(0, 0.8),
     xlab = "q", ylab = "f(q)")
lines(qseq, dseq_inv, col = "gray", lty = 1)
lines(qseq, dseq_ser, col = "tomato", lty = 2)
lines(qseq, dseq_spa, col = "slateblue", lty = 3)
legend("topleft", legend = c("inversion", "series", "saddlepoint"),
       col = c("gray", "tomato", "slateblue"), lty = 1:3, cex = 0.8)


## Numerical comparison
all.equal(dseq_inv, dseq_ser, check.attributes = FALSE)
#> [1] "Mean relative difference: 8.863703e-07"
all.equal(dseq_inv, dseq_spa, check.attributes = FALSE)
#> [1] "Mean relative difference: 0.05382156"

## Do densities sum up to 1?
sum(dseq_inv * diff(qseq)[1])
#> [1] 1.001194
sum(dseq_ser * diff(qseq)[1])
#> [1] 1.001193
sum(dseq_spa * diff(qseq)[1])
#> [1] 0.9613508

The series expression looks successful across the range. The saddlepoint approximation usually fails to capture an irregular profile such as the one seen in the plot above. That will be less of a concern as the dimensionality increases, in which case the distribution approaches normality.

The last three lines conduct a rough check on whether the densities integrate/sum up to unity. The results for the inversion and series methods are expected to approach 1 as a finer sequence is used. The saddlepoint approximation density could be normalized at the cost of slightly more computational time, although the normalization may or may not yield more accurate results at a particular quantile:

## Normalized saddlepoint approximation density
dseq_spa_normalized <- dqfr(qseq, A, method = "butler",
                            normalize_spa = TRUE)
all.equal(dseq_inv, dseq_spa_normalized,
          check.attributes = FALSE)
#> [1] "Mean relative difference: 0.05248822"
sum(dseq_spa_normalized * diff(qseq)[1])
#> [1] 1.000244
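
The earlier remark that such crude sums approach 1 on a finer sequence can be illustrated with any density of bounded support. The sketch below uses the Beta(2, 3) density from base R as a stand-in (an assumption for illustration; it is unrelated to the ratio of quadratic forms), and the helper riemann_sum() is hypothetical.

```r
## Stand-in density with bounded support: Beta(2, 3) on [0, 1]
riemann_sum <- function(n) {
    x <- seq.int(0, 1, length.out = n)
    ## Same crude rule as above: density values times the grid spacing
    sum(dbeta(x, 2, 3) * diff(x)[1])
}

## Sums for increasingly fine grids
sapply(c(11, 101, 1001), riemann_sum)
```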

References

Broda, S. and Paolella, M. S. (2009) Evaluating the density of ratios of noncentral quadratic forms in normal variables. Computational Statistics and Data Analysis, 53, 1264–1270. doi:10.1016/j.csda.2008.10.035.
Butler, R. W. (2007) Saddlepoint Approximations with Applications. Cambridge, UK: Cambridge University Press. doi:10.1017/CBO9780511619083.
Butler, R. W. and Paolella, M. S. (2007) Uniform saddlepoint approximations for ratios of quadratic forms. Technical Reports, Department of Statistical Science, Southern Methodist University, no. 351. [Available on arXiv as a preprint.] doi:10.48550/arXiv.0803.2132.
Butler, R. W. and Paolella, M. S. (2008) Uniform saddlepoint approximations for ratios of quadratic forms. Bernoulli, 14, 140–154. doi:10.3150/07-BEJ6169.
Chikuse, Y. (1980) Invariant polynomials with matrix arguments and their applications. In: Gupta, R. P., (ed.), Multivariate Statistical Analysis. Amsterdam: North-Holland. pp. 53–68.
Chikuse, Y. (1987) Methods for constructing top order invariant polynomials. Econometric Theory, 3, 195–207. doi:10.1017/S026646660001029X.
Davies, R. B. (1973) Numerical inversion of a characteristic function. Biometrika, 60, 415–417. doi:10.1093/biomet/60.2.415.
Davies, R. B. (1980) Algorithm AS 155: The distribution of a linear combination of χ2 random variables. Journal of the Royal Statistical Society, Series C: Applied Statistics, 29, 323–333. doi:10.2307/2346911.
Davis, A. W. (1980) Invariant polynomials with two matrix arguments, extending the zonal polynomials. In: Krishnaiah, P. R., (ed.), Multivariate Analysis—V. Amsterdam: North-Holland. pp. 287–299.
Forchini, G. (2002) The exact cumulative distribution function of a ratio of quadratic forms in normal variables, with application to the AR(1) model. Econometric Theory, 18, 823–852. doi:10.1017/s0266466602184015.
Forchini, G. (2005) The distribution of a ratio of quadratic forms in noncentral normal variables. Communications in Statistics—Theory and Methods, 34, 999–1008. doi:10.1081/STA-200056855.
Geary, R. C. (1944) Extension of a theorem by Harald Cramér on the frequency distribution of the quotient of two variables. Journal of the Royal Statistical Society, 107, 56–57. doi:10.1111/j.2397-2335.1944.tb01588.x.
Hillier, G. (2001) The density of a quadratic form in a vector uniformly distributed on the n-sphere. Econometric Theory, 17, 1–28. doi:10.1017/S026646660117101X.
Hillier, G., Kan, R. and Wang, X. (2009) Computationally efficient recursions for top-order invariant polynomials with applications. Econometric Theory, 25, 211–242. doi:10.1017/S0266466608090075.
Hillier, G., Kan, R. and Wang, X. (2014) Generating functions and short recursions, with applications to the moments of quadratic forms in noncentral normal vectors. Econometric Theory, 30, 436–473. doi:10.1017/S0266466613000364.
Imhof, J. P. (1961) Computing the distribution of quadratic forms in normal variables. Biometrika, 48, 419–426. doi:10.2307/2332763.
Mathai, A. M. and Provost, S. B. (1992) Quadratic Forms in Random Variables: Theory and Applications. New York, New York: Marcel Dekker.
Muirhead, R. J. (1982) Aspects of Multivariate Statistical Theory. Hoboken, New Jersey: John Wiley & Sons. doi:10.1002/9780470316559.
Paolella, M. S. (2007) Intermediate Probability: A Computational Approach. Chichester, UK: John Wiley & Sons. doi:10.1002/9780470035061.
Stuart, A. and Ord, J. K. (1994) Kendall’s Advanced Theory of Statistics, Vol. 1: Distribution Theory, 6th ed. London: Hodder Education. [Reprinted by John Wiley & Sons.]
Watanabe, J. (2023) Exact expressions and numerical evaluation of average evolvability measures for characterizing and comparing G matrices. Journal of Mathematical Biology, 86, 95. doi:10.1007/s00285-023-01930-8.