Title: | Distribution Utilities |
---|---|
Description: | Utilities are provided which are of use in the packages I have developed for dealing with distributions. Currently these packages are GeneralizedHyperbolic, VarianceGamma, and SkewHyperbolic and NormalLaplace. Each of these packages requires DistributionUtils. Functionality includes sample skewness and kurtosis, log-histogram, tail plots, moments by integration, changing the point about which a moment is calculated, functions for testing distributions using inversion tests and the Massart inequality. Also includes an implementation of the incomplete Bessel K function. |
Authors: | David Scott <[email protected]> |
Maintainer: | David Scott <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.6-1 |
Built: | 2024-11-01 11:45:28 UTC |
Source: | CRAN |
Functionality includes sample skewness and kurtosis, log-histogram, tail plots, moments by integration, changing the point about which a moment is calculated, functions for testing distributions using inversion tests and the Massart inequality. Also includes an implementation of the incomplete Bessel K function.
Contains functions which are useful for packages implementing distributions. Designed to work with my packages GeneralizedHyperbolic, VarianceGamma, SkewHyperbolic and NormalLaplace.
David Scott <[email protected]>
Maintainer: David Scott <[email protected]>
Scott, David J. and Würtz, Diethelm and Dong, Christine (2009) Software for Distributions in R. UseR: The R User Conference 2009 https://www.r-project.org/conferences/useR-2009/slides/Scott+Wuertz+Dong.pdf
GeneralizedHyperbolicDistribution
Calculates the ratio of Bessel K functions of different orders, but the same value of the argument.
besselRatio(x, nu, orderDiff, useExpScaled = 700)
besselRatio(x, nu, orderDiff, useExpScaled = 700)
x |
Numeric, |
nu |
Numeric. The order of the Bessel function in the denominator. |
orderDiff |
Numeric. The order of the numerator Bessel function minus the order of the denominator Bessel function. |
useExpScaled |
Numeric, |
Uses the function besselK
to calculate the ratio of two
modified Bessel function of the third kind whose orders are
different. The calculation of Bessel functions will underflow if the
value of is greater than around 740. To avoid underflow the
exponentially-scaled Bessel functions can be returned by
besselK
. The ratio is actually unaffected by exponential
scaling since the scaling cancels across numerator and denominator.
The Bessel function ratio is useful in calculating moments of the generalized inverse Gaussian distribution, and hence also for the moments of the hyperbolic and generalized hyperbolic distributions.
The ratio
of two modified Bessel functions of the third kind whose orders differ
by .
David Scott [email protected]
nus <- c(0:5, 10, 20) x <- seq(1, 4, length.out = 11) k <- 3 raw <- matrix(nrow = length(nus), ncol = length(x)) scaled <- matrix(nrow = length(nus), ncol = length(x)) compare <- matrix(nrow = length(nus), ncol = length(x)) for (i in 1:length(nus)){ for (j in 1:length(x)) { raw[i,j] <- besselRatio(x[j], nus[i], orderDiff = k) scaled[i,j] <- besselRatio(x[j], nus[i], orderDiff = k, useExpScaled = 1) compare[i,j] <- raw[i,j]/scaled[i,j] } } raw scaled compare
nus <- c(0:5, 10, 20) x <- seq(1, 4, length.out = 11) k <- 3 raw <- matrix(nrow = length(nus), ncol = length(x)) scaled <- matrix(nrow = length(nus), ncol = length(x)) compare <- matrix(nrow = length(nus), ncol = length(x)) for (i in 1:length(nus)){ for (j in 1:length(x)) { raw[i,j] <- besselRatio(x[j], nus[i], orderDiff = k) scaled[i,j] <- besselRatio(x[j], nus[i], orderDiff = k, useExpScaled = 1) compare[i,j] <- raw[i,j]/scaled[i,j] } } raw scaled compare
Given the parameters of a unimodal distribution and the root of the density function name, this function determines the range outside of which the density function is negligible, to a specified tolerance.
distCalcRange(densFn, param = NULL, tol = 10^(-5), ...)
distCalcRange(densFn, param = NULL, tol = 10^(-5), ...)
densFn |
Character. The name of the density function for which range calculation is required. |
tol |
Tolerance. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
... |
Passes arguments to |
The name of the unimodal density function must be supplied as the
characters of the root for that density (e.g. norm
,
ghyp
). The particular unimodal distribution being considered is
specified by the values of the parameters or of the
param
vector.
The function gives a range, outside of which
the density is less than the given tolerance. It is used in
determining break points for the separate sections over which
numerical integration is used to determine the distribution
function. The points are found by using uniroot
on the
density function.
A two-component vector giving the lower and upper ends of the range.
David Scott [email protected], Joyce Li [email protected]
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) normRange tRange <- distCalcRange("t", tol = 10^(-5), df = 4) tRange
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) normRange tRange <- distCalcRange("t", tol = 10^(-5), df = 4) tRange
This function implements a test of the random number generator and distribution function based on an inequality due to Massart (1990).
distIneqMassart(densFn = "norm", n = 10000, probBound = 0.001, ...)
distIneqMassart(densFn = "norm", n = 10000, probBound = 0.001, ...)
densFn |
Character. The root name of the distribution to be tested. |
n |
Numeric. The size of the sample to be used. |
probBound |
Numeric. The value of the bound on the right hand side of the Massart inequality. See Details. |
... |
Additional arguments to allow specification of the parameters of the distribution. |
Massart (1990) gave a version of the Dvoretsky-Kiefer-Wolfowitz inequality with the best possible constant:
where is the empirical distribution function for
a sample of
independent and identically distributed random
variables with distribution function
. This inequality is true
for all distribution functions, for all
and
.
This test is used in base R to check the standard distribution
functions. The code may be found in the file p-r-random.tests.R
in the tests
directory.
sup |
Numeric. The supremum of the absolute difference between the empirical distribution and the true distribution function. |
probBound |
Numeric. The value of the bound on the right hand side of the Massart inequality. |
t |
Numeric. The lower bound which the supremum of the absolute difference between the empirical distribution and the true distribution function must exceed. |
pVal |
Numeric. The probability that the absolute difference
between the empirical distribution and the true distribution function
exceeds |
check |
Logical. Indicates whether the inequality is satisfied or not. |
David Scott [email protected], Christine Yang Dong [email protected]
Massart P. (1990) The tight constant in the Dvoretsky-Kiefer-Wolfovitz inequality. Ann. Probab., 18, 1269–1283.
## Normal distribution is the default distIneqMassart() ## Specify parameter values distIneqMassart(mean = 1, sd = 2) ## Gamma distribution has no default value for shape distIneqMassart("gamma", shape = 1)
## Normal distribution is the default distIneqMassart() ## Specify parameter values distIneqMassart(mean = 1, sd = 2) ## Gamma distribution has no default value for shape distIneqMassart("gamma", shape = 1)
Creates a Massart inequality plot for testing the empirical distribution and distribution function based on an inequality due to Massart (1990).
distIneqMassartPlot(densFn = "norm", param = NULL, nSamp = 50, n = 100, ...)
distIneqMassartPlot(densFn = "norm", param = NULL, nSamp = 50, n = 100, ...)
densFn |
Character. The root name of the distribution to be tested. |
n |
Numeric. The size of the sample to be used. |
nSamp |
Numeric. The number of samples used to approximate the LHS probability of the inequality. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
... |
Passes the parameters of the distribution other than
specified by |
Massart (1990) gave a version of the Dvoretsky-Kiefer-Wolfowitz inequality with the best possible constant:
where is the empirical distribution function for
a sample of
independent and identically distributed random
variables with distribution function
. This inequality is true
for all distribution functions, for all
and
.
The red curve in the plot shows the LHS probabilities and the black curve gives the RHS bound. The red curve should lie below the black curve in order that the empirical distribution represents a sample from the theoretical distribution.
Returns NULL
invisibly.
David Scott [email protected], Xinxing Li [email protected]
Massart P. (1990) The tight constant in the Dvoretsky-Kiefer-Wolfovitz inequality. Ann. Probab., 18, 1269–1283.
## Not run: ### Not run because of timing requirements of CRAN ### The Massart Inequality plot for standard Normal Distribution distIneqMassartPlot() ### The Massart Inequality plot for Gamma Distribution distIneqMassartPlot("gamma", shape = 1) ## End(Not run)
## Not run: ### Not run because of timing requirements of CRAN ### The Massart Inequality plot for standard Normal Distribution distIneqMassartPlot() ### The Massart Inequality plot for Gamma Distribution distIneqMassartPlot("gamma", shape = 1) ## End(Not run)
Function to calculate the mode of a unimodal distribution which is specified by the root of the density function name and the corresponding parameters.
distMode(densFn, param = NULL, ...)
distMode(densFn, param = NULL, ...)
densFn |
Character. The name of the density function for which the mode is required. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
... |
Passes arguments to |
The name of the unimodal density function must be supplied as the
characters of the root for that density (e.g. norm
,
ghyp
). The particular unimodal distribution being considered is
specified by the value of the argument param
, or for base R
distributions by specification in the ... arguments.
The mode is found by a numerical optimization using
optimize
.
David Scott [email protected], Joyce Li [email protected]
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) curve(dnorm(x, mean = 4, sd = 1), normRange[1], normRange[2]) abline(v = distMode("norm", mean = 4, sd = 1), col = "blue")
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) curve(dnorm(x, mean = 4, sd = 1), normRange[1], normRange[2]) abline(v = distMode("norm", mean = 4, sd = 1), col = "blue")
Given the parameters of a unimodal distribution and the root of the density function name, this function determines the step size when calculating the range of the specified unimodal distribution. The parameterization used is the one for the corresponding density function calculation.
distStepSize(densFn, dist, param = NULL, side = c("right","left"), ...)
distStepSize(densFn, dist, param = NULL, side = c("right","left"), ...)
densFn |
Character. The name of the density function for which the step size needs to be calculated. |
dist |
Numeric. Current distance value, for skew hyperbolic distribution only |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
side |
Character. |
... |
Passes arguments in particular the parameters of the distribution to random sample generation function. |
This function is used for stepping to the right or the left to obtain
an enclosing interval so uniroot
can be used to search. The
step size for the right tail is the absolute difference between the
median and upper quantile and for the left tail is the absolute
difference between the median and lower quantile. The skew hyperbolic
distribution however needs a special step size. When the tail is
declining exponentially the step is just a linear function of the
current distance from the mode. If the tail is declining only as a
power of , an exponential step is used.
distStepSize
is for internal use and is not expected to be
called by users. It is documented here for completeness.
The size of the step.
David Scott [email protected], Joyce Li [email protected]
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) normRange tRange <- distCalcRange("t", tol = 10^(-5), df = 4) tRange
normRange <- distCalcRange("norm", tol = 10^(-7), mean = 4, sd = 1) normRange tRange <- distCalcRange("t", tol = 10^(-5), df = 4) tRange
Calculates the incomplete Bessel K function using the algorithm and code provided by Slavinsky and Safouhi (2009).
incompleteBesselK (x, y, nu, tol = .Machine$double.eps^0.85, nmax = 120) incompleteBesselKR(x, y, nu, tol = .Machine$double.eps^0.85, nmax = 120) SSFcoef(nmax, nu) combinatorial(nu) GDENOM(n, x, y, nu, An, nmax, Cnp) GNUM(n, x, y, nu, Am, An, nmax, Cnp, GM, GN)
incompleteBesselK (x, y, nu, tol = .Machine$double.eps^0.85, nmax = 120) incompleteBesselKR(x, y, nu, tol = .Machine$double.eps^0.85, nmax = 120) SSFcoef(nmax, nu) combinatorial(nu) GDENOM(n, x, y, nu, An, nmax, Cnp) GNUM(n, x, y, nu, Am, An, nmax, Cnp, GM, GN)
x |
Numeric. Value of the first argument of the incomplete Bessel K function. |
y |
Numeric. Value of the second argument of the incomplete Bessel K function. |
nu |
Numeric. The order of the incomplete Bessel K function. |
tol |
Numeric. The tolerance for the difference between successive approximations of the incomplete Bessel K function. |
nmax |
Integer. The maximum order allowed for the approximation of the incomplete Bessel K function. |
n |
Integer. Current order of the approximation. Not required to be specified by users. |
An |
Matrix of coefficients. Not required to be specified by users. |
Am |
Matrix of coefficients. Not required to be specified by users. |
Cnp |
Vector of elements of Pascal's triangle. Not required to be specified by users. |
GN |
Vector of denominators used for approximation. Not required to be specified by users. |
GM |
Vector of numerators used for approximation. Not required to be specified by users. |
The function incompleteBesselK
implements the algorithm
proposed by Slavinsky and Safouhi (2010) and uses code provided by
them.
The incomplete Bessel K function is defined by
see Slavinsky and Safouhi (2010), or Harris (2008).
incompleteBesselK
calls a Fortran routine to carry out the
calculations. incompleteBesselKR()
is a pure R version of the
routine for computing the incomplete Bessel K function.
The functions SSFcoef
, combinatorial
, GDENOM
, and
GNUM
are “subroutines”, i.e., auxiliary functions used in
incompleteBesselKR()
. They are not expected to be called by the
user.
The approximation to the incomplete Bessel K function returned by
incompleteBesselK
is highly accurate. The default value of
tol
is about 10^(-14) on a 32-bit computer. It appears that
even higher accuracy is possible when x > y
. Then the tolerance
can be taken as .Machine$double.eps
and the number of correct
figures essentially coincides with the number of figures representable
in the machine being used.
incompleteBesselKR
is very slow compared to the Fortran version
and is only included for those who wish to see the algorithm in R
rather than Fortran.
incompleteBesselK
and incompleteBesselKR
both return an
approximation to the incomplete Bessel K function as defined above.
The problem of calculation of the incomplete Bessel K function is
equivalent to the problem of calculation of the cumulative
distribution function of the generalized inverse Gaussian
distribution. See Generalized
Inverse Gaussian.
David Scott [email protected], Thomas Tran, Richard Slevinsky, Hassan Safouhi.
Harris, Frank E. (2008) Incomplete Bessel, generalized incomplete gamma, or leaky aquifer functions. J. Comp. Appl. Math. 215, 260–269.
Slevinsky, Richard M., and Safouhi, Hassan (2009) New formulae for higher order derivatives and applications. J. Comp. Appl. Math. 233, 405–419.
Slevinsky, Richard M., and Safouhi, Hassan (2010) A recursive algorithm for the G transformation and accurate computation of incomplete Bessel functions. Applied Numerical Mathematics 60(12), 1411–1417.
### Harris (2008) gives accurate values (16 figures) for ### x = 0.01, y = 4, and nu = 0:9 ### nu = 0, Harris value is 2.22531 07612 66469 options(digits = 16) incompleteBesselK(0.01, 4, 0) ### nu = 9, Harris value is 0.00324 67980 03149 incompleteBesselK(0.01, 4, 9) ### Other values given in Harris (2008) ### x = 4.95, y = 5.00, nu = 2 incompleteBesselK(4.95, 5, 2) ## 0.00001 22499 87981 ### x = 10, y = 2, nu = 6 ### Slevinsky and Safouhi (2010) suggest Harris (2008) value ### is incorrect, give value 0.00000 04150 01064 21228 incompleteBesselK(10, 2, 6) ### x = 3.1, y = 2.6, nu = 5 incompleteBesselK(3.1, 2.6, 5) ## 0.00052 85043 25244 ### Check values when x > y using numeric integration (numIBF <- sapply(0:9, incompleteBesselK, x = 4, y = 0.01)) besselFn <- function(t, x, y, nu) { (t^(-nu - 1))*exp(-x*t - y/t) } (intIBF <- sapply(0:9, integrate, f = besselFn, lower = 1, upper = Inf, x = 4, y = 0.01)) intIBF <- as.numeric(intIBF[1, ]) numIBF - intIBF max(abs(numIBF - intIBF)) ## 1.256649992398273e-11 options(digits = 7)
### Harris (2008) gives accurate values (16 figures) for ### x = 0.01, y = 4, and nu = 0:9 ### nu = 0, Harris value is 2.22531 07612 66469 options(digits = 16) incompleteBesselK(0.01, 4, 0) ### nu = 9, Harris value is 0.00324 67980 03149 incompleteBesselK(0.01, 4, 9) ### Other values given in Harris (2008) ### x = 4.95, y = 5.00, nu = 2 incompleteBesselK(4.95, 5, 2) ## 0.00001 22499 87981 ### x = 10, y = 2, nu = 6 ### Slevinsky and Safouhi (2010) suggest Harris (2008) value ### is incorrect, give value 0.00000 04150 01064 21228 incompleteBesselK(10, 2, 6) ### x = 3.1, y = 2.6, nu = 5 incompleteBesselK(3.1, 2.6, 5) ## 0.00052 85043 25244 ### Check values when x > y using numeric integration (numIBF <- sapply(0:9, incompleteBesselK, x = 4, y = 0.01)) besselFn <- function(t, x, y, nu) { (t^(-nu - 1))*exp(-x*t - y/t) } (intIBF <- sapply(0:9, integrate, f = besselFn, lower = 1, upper = Inf, x = 4, y = 0.01)) intIBF <- as.numeric(intIBF[1, ]) numIBF - intIBF max(abs(numIBF - intIBF)) ## 1.256649992398273e-11 options(digits = 7)
Given a density function specified by the root of the density function name, returns the integral over a specified range, usually the whole real line. Used for checking that the integral over the whole real line is 1.
integrateDens(densFn = "norm", lower = -Inf, upper = Inf, subdivisions = 100, ...)
integrateDens(densFn = "norm", lower = -Inf, upper = Inf, subdivisions = 100, ...)
densFn |
Character. The name of the density function to be integrated. |
lower |
Numeric. The lower limit of the integration.
Defaulty is |
upper |
Numeric. The upper limit of the integration.
Defaulty is |
subdivisions |
Numeric. The number of subdivisions to be passed
to |
... |
Additional arguments to be passed to |
The name of the density function to be integrated must be supplied as
the characters of the root for that density (e.g. norm
,
gamma
). The density function specified is integrated
numerically over the range specified via a call to
integrate
. The parameters of the distribution can be
specified, otherwise the default parameters will be used.
A list of class integrate
with components:
value |
The final estimate of the integral. |
abs.error |
Estimate of the modulus of the absolute error. |
subdivisions |
The number of subintervals produced in the subdivision process. |
message |
|
call |
The matched call to the |
David Scott [email protected]
integrateDens("norm", mean = 1, sd = 1) integrateDens("t", df = 4) integrateDens("exp", rate = 2) integrateDens("weibull", shape = 1)
integrateDens("norm", mean = 1, sd = 1) integrateDens("t", df = 4) integrateDens("exp", rate = 2) integrateDens("weibull", shape = 1)
Functions to check performance of distribution and quantile functions. Applying the distribution function followed by the quantile function to a set of numbers should reproduce the original set of numbers. Likewise applying the quantile function followed by the distribution function to numbers in the range (0,1) should produce the original numbers.
inversionTestpq(densFn = "norm", n = 10, intTol = .Machine$double.eps^0.25, uniTol = intTol, x = NULL, method = "spline", ...) inversionTestqp(densFn = "norm", qs = c(0.001, 0.01, 0.025, 0.05, 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 0.95, 0.975, 0.99, 0.999), uniTol = .Machine$double.eps^0.25, intTol = uniTol, method = "spline", ...)
inversionTestpq(densFn = "norm", n = 10, intTol = .Machine$double.eps^0.25, uniTol = intTol, x = NULL, method = "spline", ...) inversionTestqp(densFn = "norm", qs = c(0.001, 0.01, 0.025, 0.05, 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 0.95, 0.975, 0.99, 0.999), uniTol = .Machine$double.eps^0.25, intTol = uniTol, method = "spline", ...)
densFn |
Character. The root name of the distribution to be tested. |
qs |
Numeric. Set of quantiles to which quantile function then distribution function will be applied. See Details. |
n |
Numeric. Number of values to be sampled from the distribution. See Details. |
x |
Numeric. Values at which the distribution function is to be
evaluated. If |
intTol |
Value of |
uniTol |
Value of |
method |
Character. If |
... |
Additional arguments to allow specification of the parameters of the distribution. |
inversionTestpq
takes a sample from the specified distribution
of size then applies the distribution function, followed by
the quantile function.
inversionTestqp
applies the quantile function, followed by
the distribution function to the set of quantiles specified by
qs
.
In both cases the starting and ending values should be the same.
These tests are used in base R to check the standard distribution
functions. The code may be found in the file d-p-q-r.tests.R
in
the tests
directory.
inversionTestpq
returns a list with components:
qpx |
Numeric. The result of applying the distribution function
(‘p’ function) then the quantile function (‘q’ function) to
the randomly generated set of |
x |
Numeric. The set of |
diffs |
Numeric. The differences |
n |
Numeric. Number of values sampled from the distribution. |
inversionTestqp
returns a list with components:
pqqs |
Numeric. The result of applying the quantile function
(‘q’ function) then the distribution function (‘p’ function) to
the quantiles |
qs |
Numeric. The set of quantiles. |
diffs |
Numeric. The differences |
David Scott [email protected], Christine Yang Dong [email protected]
## Default distribution is normal inversionTestpq() inversionTestqp() ## Supply parameters inversionTestpq(mean = 1, sd = 2) inversionTestqp(mean = 1, sd = 2) ## Gamma distribution, must specify shape inversionTestpq("gamma", shape = 1) inversionTestqp("gamma", shape = 1)
## Default distribution is normal inversionTestpq() inversionTestqp() ## Supply parameters inversionTestpq(mean = 1, sd = 2) inversionTestqp(mean = 1, sd = 2) ## Gamma distribution, must specify shape inversionTestpq("gamma", shape = 1) inversionTestqp("gamma", shape = 1)
Checks whether an object is numeric and if so, are all the elements whole numbers, to a given tolerance.
is.wholenumber(x, tolerance = .Machine$double.eps^0.5)
is.wholenumber(x, tolerance = .Machine$double.eps^0.5)
x |
The object to be tested. |
tolerance |
Numeric |
The object x
is first tested to see if it is numeric. If not
the function returns 'FALSE'
. Then if all the elements of
x
are whole numbers to within the tolerance given by
tolerance
the function returns 'TRUE'
. If not it returns
'FALSE'
.
Either 'TRUE'
or 'FALSE'
depending on the result of the
test.
David Scott [email protected].
Based on a post by Tony Plate <[email protected]> on R-help.
is.wholenumber(-3:5) # TRUE is.wholenumber(c(0,0.1,1.3,5)) # FALSE is.wholenumber(-3:5 + .Machine$double.eps) # TRUE is.wholenumber(-3:5 + .Machine$double.eps^0.5) # FALSE is.wholenumber(c(2L,3L)) # TRUE is.wholenumber(c("2L","3L")) # FALSE is.wholenumber(0i ^ (-3:3)) # FALSE is.wholenumber(matrix(1:6, nrow = 3)) # TRUE is.wholenumber(list(-1:3,2:6)) # FALSE is.numeric(list(-1:3,2:6)) # FALSE is.wholenumber(unlist(list(-1:3,2:6))) # TRUE
is.wholenumber(-3:5) # TRUE is.wholenumber(c(0,0.1,1.3,5)) # FALSE is.wholenumber(-3:5 + .Machine$double.eps) # TRUE is.wholenumber(-3:5 + .Machine$double.eps^0.5) # FALSE is.wholenumber(c(2L,3L)) # TRUE is.wholenumber(c("2L","3L")) # FALSE is.wholenumber(0i ^ (-3:3)) # FALSE is.wholenumber(matrix(1:6, nrow = 3)) # TRUE is.wholenumber(list(-1:3,2:6)) # FALSE is.numeric(list(-1:3,2:6)) # FALSE is.wholenumber(unlist(list(-1:3,2:6))) # TRUE
Plots a log-histogram, as in for example Feiller, Flenley and Olbricht (1992).
The intended use of the log-histogram is to examine the fit of a particular density to a set of data, as an alternative to a histogram with a density curve. For this reason, only the log-density histogram is implemented, and it is not possible to obtain a log-frequency histogram.
The log-histogram can be plotted with histogram-like dashed vertical bars, or as points marking the tops of the log-histogram bars, or with both bars and points.
logHist(x, breaks = "Sturges", include.lowest = TRUE, right = TRUE, main = paste("Log-Histogram of", xName), xlim = range(breaks), ylim = NULL, xlab = xName, ylab = "Log-density", nclass = NULL, htype = "b", ...)
logHist(x, breaks = "Sturges", include.lowest = TRUE, right = TRUE, main = paste("Log-Histogram of", xName), xlim = range(breaks), ylim = NULL, xlab = xName, ylab = "Log-density", nclass = NULL, htype = "b", ...)
x |
A vector of values for which the log-histogram is desired. |
breaks |
One of:
In the last three cases the number is a suggestion only. |
include.lowest |
Logical. If |
right |
Logical. If |
main , xlab , ylab
|
These arguments to |
xlim |
Sensible default for the range of x values. |
ylim |
Calculated by |
nclass |
Numeric (integer). For compatibility with |
htype |
Type of histogram. Possible types are:
|
... |
Further graphical parameters for calls
to |
Uses hist.default
to determine the cells or classes and
calculate counts.
To calculate ylim
the following procedure is used. The upper
end of the range is given by the maximum value of the log-density,
plus 25% of the absolute value of the maximum. The lower end of the
range is given by the smallest (finite) value of the log-density, less
25% of the difference between the largest and smallest (finite) values
of the log-density.
A log-histogram in the form used by Feiller, Flenley and Olbricht (1992) is plotted. See also Barndorff-Nielsen (1977) for use of log-histograms.
Returns a list with components:
breaks |
The |
counts |
|
logDensity |
Log of If |
mids |
The |
xName |
A character string with the actual |
heights |
The location of the tops of the vertical segments used in drawing the log-histogram. |
ylim |
The value of |
David Scott [email protected], Richard Trendall, Thomas Tran
Barndorff-Nielsen, O. (1977) Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. Lond., A353, 401–419.
Barndorff-Nielsen, O. and Blæsild, P (1983). Hyperbolic distributions. In Encyclopedia of Statistical Sciences, eds., Johnson, N. L., Kotz, S. and Read, C. B., Vol. 3, pp. 700–707. New York: Wiley.
Fieller, N. J., Flenley, E. C. and Olbricht, W. (1992) Statistics of particle size data. Appl. Statist., 41, 127–146.
x <- rnorm(200) hist(x) ### default logHist(x) ### log histogram only logHist(x, htype = "h") ### points only, some options logHist(x, htype = "p", pch = 20, cex = 2, col = "steelblue")
x <- rnorm(200) hist(x) ### default logHist(x) ### log histogram only logHist(x, htype = "h") ### points only, some options logHist(x, htype = "p", pch = 20, cex = 2, col = "steelblue")
Using the moments up to a given order about one location, this function either returns the moments up to that given order about a new location as a vector or it returns a moment of a specific order defined by users (order <= maximum order of the given moments) about a new location as a single number. A generalization of using raw moments to obtain a central moment or using central moments to obtain a raw moment.
momChangeAbout(order = "all", oldMom, oldAbout, newAbout)
momChangeAbout(order = "all", oldMom, oldAbout, newAbout)
order |
One of:
|
oldMom |
Numeric. Moments of orders 1, 2, ..., about the point
|
oldAbout |
Numeric. The point about which the moments |
newAbout |
Numeric. The point about which the desired moment or moments are to be obtained. |
Suppose denotes the
-th moment of a random
variable
about a point
, and
denotes the
-th moment about
. Then
may be determined from the moments
according to the formula
This is the formula implemented by the function
momChangeAbout
. It is a generalization of the well-known
formulae used to change raw moments to central moments or to change
central moments to raw moments. See for example Kendall and Stuart
(1989), Chapter 3.
The moment of order order
about the location newAbout
when
order
is specified.
The vector of moments about the location newAbout
from first
order up to the maximum order of the oldMom
when order
takes the value "all"
or is not specified.
David Scott [email protected], Christine Yang Dong [email protected]
Kendall, M. G. and Stuart, A. (1969). The Advanced Theory of Statistics, Volume 1, 3rd Edition. London: Charles Griffin & Company.
### Gamma distribution k <- 4 shape <- 2 old <- 0 new <- 1 sampSize <- 1000000 ### Calculate 1st to 4th raw moments m <- numeric(k) for (i in 1:k){ m[i] <- gamma(shape + i)/gamma(shape) } m ### Calculate 4th moment about new momChangeAbout(k, m, old, new) ### Calculate 3rd about new momChangeAbout(3, m, old, new) ### Calculate 1st to 4th moments about new momChangeAbout(oldMom = m, oldAbout = old, newAbout = new) momChangeAbout(order = "all", m, old, new) ### Approximate kth moment about new using sampling x <- rgamma(sampSize, shape) mean((x - new)^k)
### Gamma distribution k <- 4 shape <- 2 old <- 0 new <- 1 sampSize <- 1000000 ### Calculate 1st to 4th raw moments m <- numeric(k) for (i in 1:k){ m[i] <- gamma(shape + i)/gamma(shape) } m ### Calculate 4th moment about new momChangeAbout(k, m, old, new) ### Calculate 3rd about new momChangeAbout(3, m, old, new) ### Calculate 1st to 4th moments about new momChangeAbout(oldMom = m, oldAbout = old, newAbout = new) momChangeAbout(order = "all", m, old, new) ### Approximate kth moment about new using sampling x <- rgamma(sampSize, shape) mean((x - new)^k)
Calculates moments and absolute moments about a given location for any given distribution.
momIntegrated(densFn = "ghyp", param = NULL, order, about = 0, absolute = FALSE, ...)
momIntegrated(densFn = "ghyp", param = NULL, order, about = 0, absolute = FALSE, ...)
densFn |
Character. The name of the density function whose moments are to be calculated. See Details. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
order |
Numeric. The order of the moment or absolute moment to be calculated. |
about |
Numeric. The point about which the moment is to be calculated. |
absolute |
Logical. Whether absolute moments or ordinary moments
are to be calculated. Default is |
... |
Passes arguments to |
Denote the density function by . Then if
order
and
about
,
momIntegrated
calculates
when absolute = FALSE
and
when absolute = TRUE
.
The name of the density function must be supplied as the characters of
the root for that density (e.g. norm
, ghyp
).
When densFn="ghyp"
, densFn="hyperb"
,
densFn="gig"
or densFn = "vg"
, the relevant package must
be loaded or an error will result.
When densFn="invgamma"
or "inverse gamma"
the
density used is the density of the inverse gamma distribution given by
for ,
and
. The parameter vector
param = c(shape, rate)
where shape
and
rate
. The default value for
param
is c(-1, 1)
.
The value of the integral as specified in Details.
David Scott [email protected], Christine Yang Dong [email protected], Xinxing Li [email protected]
dghyp
,
dhyperb
, dgamma
,
dgig
,
VarianceGamma
require(GeneralizedHyperbolic) ### Calculate the mean of a generalized hyperbolic distribution ### Compare the use of integration and the formula for the mean m1 <- momIntegrated("ghyp", param = c(0, 1, 3, 1, 1 / 2), order = 1, about = 0) m1 ghypMean(param = c(0, 1, 3, 1, 1 / 2)) ### The first moment about the mean should be zero momIntegrated("ghyp", order = 1, param = c(0, 1, 3, 1, 1 / 2), about = m1) ### The variance can be calculated from the raw moments m2 <- momIntegrated("ghyp", order = 2, param = c(0, 1, 3, 1, 1 / 2), about = 0) m2 m2 - m1^2 ### Compare with direct calculation using integration momIntegrated("ghyp", order = 2, param = c(0, 1, 3, 1, 1 / 2), about = m1) momIntegrated("ghyp", param = c(0, 1, 3, 1, 1 / 2), order = 2, about = m1) ### Compare with use of the formula for the variance ghypVar(param = c(0, 1, 3, 1, 1 / 2))
require(GeneralizedHyperbolic) ### Calculate the mean of a generalized hyperbolic distribution ### Compare the use of integration and the formula for the mean m1 <- momIntegrated("ghyp", param = c(0, 1, 3, 1, 1 / 2), order = 1, about = 0) m1 ghypMean(param = c(0, 1, 3, 1, 1 / 2)) ### The first moment about the mean should be zero momIntegrated("ghyp", order = 1, param = c(0, 1, 3, 1, 1 / 2), about = m1) ### The variance can be calculated from the raw moments m2 <- momIntegrated("ghyp", order = 2, param = c(0, 1, 3, 1, 1 / 2), about = 0) m2 m2 - m1^2 ### Compare with direct calculation using integration momIntegrated("ghyp", order = 2, param = c(0, 1, 3, 1, 1 / 2), about = m1) momIntegrated("ghyp", param = c(0, 1, 3, 1, 1 / 2), order = 2, about = m1) ### Compare with use of the formula for the variance ghypVar(param = c(0, 1, 3, 1, 1 / 2))
Calculates the approximate standard error of the sample variance, sample central third moment and sample central fourth moment.
momSE(order = 4, n, mom)
momSE(order = 4, n, mom)
order |
Integer: either 2, 3, or 4. |
n |
Integer: the sample size. |
mom |
Numeric: The central moments of order 1 to |
Implements the approximate standard error given in Kendall and Stuart (1969), p.243.
The approximate standard error of the sample moment specified.
David Scott [email protected]
Kendall, M. G. and Stuart, A. (1969). The Advanced Theory of Statistics, Volume 1, 3rd Edition. London: Charles Griffin & Company.
### Moments of the normal distribution, mean 1, variance 4 mu <- 1 sigma <- 2 mom <- c(0,sigma^2,0,3*sigma^4,0,15*sigma^6,0,105*sigma^8) ### standard error of sample variance momSE(2, 100, mom[1:4]) ### should be sqrt(2*sigma^4)/10 ### standard error of sample central third moment momSE(3, 100, mom[1:6]) ### should be sqrt(6*sigma^6)/10 ### standard error of sample central fourth moment momSE(4, 100, mom) ### should be sqrt(96*sigma^8)/10
### Moments of the normal distribution, mean 1, variance 4 mu <- 1 sigma <- 2 mom <- c(0,sigma^2,0,3*sigma^4,0,15*sigma^6,0,105*sigma^8) ### standard error of sample variance momSE(2, 100, mom[1:4]) ### should be sqrt(2*sigma^4)/10 ### standard error of sample central third moment momSE(3, 100, mom[1:6]) ### should be sqrt(6*sigma^6)/10 ### standard error of sample central fourth moment momSE(4, 100, mom) ### should be sqrt(96*sigma^8)/10
This function implements a goodness-of-fit test using Moran's log spacings statistic.
moranTest(x, densFn, param = NULL, ...)
moranTest(x, densFn, param = NULL, ...)
densFn |
Character. The root name of the distribution to be tested. |
x |
Numeric. Vector of data to be tested. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
... |
Additional arguments to allow specification of the
parameters of the distribution other than specified by |
Moran(1951) gave a statistic for testing the goodness-of-fit of a
random sample of -values to a continuous univariate
distribution with cumulative distribution function
, where
is a vector of known
parameters. This function implements the Cheng and Stephens(1989)
extended Moran test for unknown parameters.
The test statistic is
Where , the Moran statistic, is
M(theta)=-(log(y_1-y_0)+log(y_2-y_1)+...+log(y_m-y_m-1))
This test has null hypothesis:
: a random sample of
values of
comes
from distribution
, where
is the vector of parameters.
Here
is expected to be the maximum
likelihood estimate
, an efficient
estimate. The test rejects
at significance level
if
>
.
statistic |
Numeric. The value of the Moran test statistic. |
estimate |
Numeric. A vector of parameter estimates for the tested distribution. |
parameter |
Numeric. The degrees of freedom for the Moran statistic. |
p.value |
Numeric. The p-value for the test |
.
data.name |
Character. A character string giving the name(s) of the data. |
method |
Character. Type of test performed. |
David Scott [email protected], Xinxing Li [email protected]
Cheng, R. C. & Stephens, M. A. (1989). A goodness-of-fit test using Moran's statistic with estimated parameters. Biometrika, 76, 385–92.
Moran, P. (1951). The random division of an interval—PartII. J. Roy. Statist. Soc. B, 13, 147–50.
### Normal Distribution x <- rnorm(100, mean = 0, sd = 1) muhat <- mean(x) sigmahat <- sqrt(var(x)*(100 - 1)/100) result <- moranTest(x, "norm", mean = muhat, sd = sigmahat) result ### Exponential Distribution y <- rexp(200, rate = 3) lambdahat <- 1/mean(y) result <- moranTest(y, "exp", rate = lambdahat) result
### Normal Distribution x <- rnorm(100, mean = 0, sd = 1) muhat <- mean(x) sigmahat <- sqrt(var(x)*(100 - 1)/100) result <- moranTest(x, "norm", mean = muhat, sd = sigmahat) result ### Exponential Distribution y <- rexp(200, rate = 3) lambdahat <- 1/mean(y) result <- moranTest(y, "exp", rate = lambdahat) result
Given the density function of a unimodal distribution specified by the root of the density function name, returns the distribution function and quantile function of the specified distribution.
pDist(densFn = "norm", q, param = NULL, subdivisions = 100, lower.tail = TRUE, intTol = .Machine$double.eps^0.25, valueOnly = TRUE, ...) qDist(densFn = "norm", p, param = NULL, lower.tail = TRUE, method = "spline", nInterpol = 501, uniTol = .Machine$double.eps^0.25, subdivisions = 100, intTol = uniTol, ...)
pDist(densFn = "norm", q, param = NULL, subdivisions = 100, lower.tail = TRUE, intTol = .Machine$double.eps^0.25, valueOnly = TRUE, ...) qDist(densFn = "norm", p, param = NULL, lower.tail = TRUE, method = "spline", nInterpol = 501, uniTol = .Machine$double.eps^0.25, subdivisions = 100, intTol = uniTol, ...)
densFn |
Character. The name of the density function for which the distribution function or quantile function is required. |
q |
Vector of quantiles. |
p |
Vector of probabilities. |
param |
Numeric. A vector giving the parameter values for the
distribution specified by |
method |
Character. If |
lower.tail |
Logical. If |
subdivisions |
The maximum number of subdivisions used to integrate the density and determine the accuracy of the distribution function calculation. |
intTol |
Value of |
valueOnly |
Logical. If |
nInterpol |
Number of points used in |
uniTol |
Value of |
... |
Passes additional arguments to |
The name of the unimodal density function must be supplied as the
characters of the root for that density (e.g. norm
, ghyp
).
pDist
uses the function integrate
to numerically
integrate the density function specified. The integration is from
-Inf
to x
if x
is to the left of the mode, and from
x
to Inf
if x
is to the right of the mode. The
probability calculated this way is subtracted from 1 if
required. Integration in this manner appears to make calculation of the
quantile function more stable in extreme cases.
qDist
provides two methods to calculate quantiles both of which
use uniroot
to find the value of for which a given
is equal to
where
denotes the distribution
function. The difference is in how the numerical approximation to
is obtained. The more accurate method, which is specified as
"integrate"
, is to calculate the value of whenever it
is required using a call to
pDist
. It is clear that the time
required for this approach is roughly linear in the number of quantiles
being calculated. The alternative (and default) method is that for the
major part of the distribution a spline approximation to is
calculated and quantiles found using
uniroot
with this
approximation. For extreme values of some heavy-tailed distributions
(where the tail probability is less than ), the
integration method is still used even when the method specified as
"spline"
.
If accurate probabilities or quantiles are required, tolerances
(intTol
and uniTol
) should be set to small values, i.e
or
with
method = "integrate"
. Generally then accuracy might be expected
to be at least . If the default values of the
functions are used, accuracy can only be expected to be around
. Note that on 32-bit systems
.Machine$double.eps^0.25 = 0.0001220703
is a typical value.
pDist
gives the distribution function, qDist
gives the
quantile function.
An estimate of the accuracy of the approximation to the distribution
function can be found by setting valueOnly = FALSE
in the call to
pDist
which returns a list with components value
and
error
.
David Scott [email protected] Joyce Li [email protected]
pDist("norm", q = 2, mean = 1, sd = 1) pDist("t", q = 0.5, df = 4) require(GeneralizedHyperbolic) pDist("ghyp", q = 0.1) require(SkewHyperbolic) qDist("skewhyp", p = 0.4, param = c(0, 1, 0, 10)) qDist("t", p = 0.2, df = 4)
pDist("norm", q = 2, mean = 1, sd = 1) pDist("t", q = 0.5, df = 4) require(GeneralizedHyperbolic) pDist("ghyp", q = 0.1) require(SkewHyperbolic) qDist("skewhyp", p = 0.4, param = c(0, 1, 0, 10)) qDist("t", p = 0.2, df = 4)
Adaptive quadrature of functions of one variable over a finite or infinite interval.
safeIntegrate(f, lower, upper, subdivisions=100, rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol, stop.on.error = TRUE, keep.xy = FALSE, aux = NULL, ...)
safeIntegrate(f, lower, upper, subdivisions=100, rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol, stop.on.error = TRUE, keep.xy = FALSE, aux = NULL, ...)
f |
An R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error. |
lower , upper
|
The limits of integration. Can be infinite. |
subdivisions |
The maximum number of subintervals. |
rel.tol |
Relative accuracy requested. |
abs.tol |
Absolute accuracy requested. |
stop.on.error |
Logical. If true (the default) an error stops the
function. If false some errors will give a result with a warning in
the |
keep.xy |
Unused. For compatibility with S. |
aux |
Unused. For compatibility with S. |
... |
Additional arguments to be passed to |
This function is just a wrapper around
integrate
to check for equality of
upper
and lower
. A check is made using
all.equal
. When numerical equality is
detected, if lower
(and hence upper
) is infinite, the
value of the integral and the absolute error are both set to 0. When
lower
is finite, the value of the integral is set to 0, and the
absolute error to the average of the function values at upper
and
lower
times the difference between upper
and lower
.
When upper
and lower
are determined to be different, the
result is exactly as given by integrate
.
A list of class "integrate"
with components:
value |
The final estimate of the integral. |
abs.error |
Estimate of the modulus of the absolute error. |
subdivisions |
The number of subintervals produced in the subdivision process. |
message |
|
call |
The matched call. |
The function integrate
and
all.equal
.
integrate(dnorm, -1.96, 1.96) safeIntegrate(dnorm, -1.96, 1.96) # Same as for integrate() integrate(dnorm, -Inf, Inf) safeIntegrate(dnorm, -Inf, Inf) # Same as for integrate() integrate(dnorm, 1.96, 1.96) # OK here but can give an error safeIntegrate(dnorm, 1.96, 1.96) integrate(dnorm, -Inf, -Inf) safeIntegrate(dnorm, -Inf, -Inf) # Avoids nonsense answer integrate(dnorm, Inf, Inf) safeIntegrate(dnorm, Inf, Inf) # Avoids nonsense answer
integrate(dnorm, -1.96, 1.96) safeIntegrate(dnorm, -1.96, 1.96) # Same as for integrate() integrate(dnorm, -Inf, Inf) safeIntegrate(dnorm, -Inf, Inf) # Same as for integrate() integrate(dnorm, 1.96, 1.96) # OK here but can give an error safeIntegrate(dnorm, 1.96, 1.96) integrate(dnorm, -Inf, -Inf) safeIntegrate(dnorm, -Inf, -Inf) # Avoids nonsense answer integrate(dnorm, Inf, Inf) safeIntegrate(dnorm, Inf, Inf) # Avoids nonsense answer
Computes the sample skewness and sample kurtosis.
skewness(x, na.rm = FALSE) kurtosis(x, na.rm = FALSE)
skewness(x, na.rm = FALSE) kurtosis(x, na.rm = FALSE)
x |
A numeric vector containing the values whose skewness or kurtosis is to be computed. |
na.rm |
A logical value indicating whether |
If , then the skewness of
is defined as
If , then the kurtosis of
is defined as
The skewness or kurtosis of x
.
These functions and the description of them are taken from the package e1071. They are included to avoid having to require an additional package.
Evgenia Dimitriadou, Kurt Hornik, Friedrich Leisch, David Meyer, and Andreas Weingessel
x <- rnorm(100) skewness(x) kurtosis(x)
x <- rnorm(100) skewness(x) kurtosis(x)
Create a left or right tail plot of a data set using
tailPlot
. Add a line for any distribution with parameters given
by an argument named param
, using tailPlotLine
.
Add normal, , or gamma distribution lines to
the plot using
normTailPlotLine
, tTailPlotLine
, or
gammaTailPlotLine
tailPlot(x, log = "y", side = c("right", "left"), main = NULL, xlab = NULL, ylab = NULL, ...) tailPlotLine(x, distrFn, param = NULL, side = c("right", "left"), ...) normTailPlotLine(x, mean = 0, sd = 1, side = c("right", "left"), ...) tTailPlotLine(x, df = Inf, side = c("right", "left"), ...) gammaTailPlotLine(x, shape = 1, rate = 1, scale = 1/rate, side = c("right", "left"), ...)
tailPlot(x, log = "y", side = c("right", "left"), main = NULL, xlab = NULL, ylab = NULL, ...) tailPlotLine(x, distrFn, param = NULL, side = c("right", "left"), ...) normTailPlotLine(x, mean = 0, sd = 1, side = c("right", "left"), ...) tTailPlotLine(x, df = Inf, side = c("right", "left"), ...) gammaTailPlotLine(x, shape = 1, rate = 1, scale = 1/rate, side = c("right", "left"), ...)
x |
A vector of values for which the tail plot is to be drawn. |
log |
A character string which contains |
side |
Character. |
main |
A main title for the plot. |
xlab |
A label for the x axis, defaults to |
ylab |
A label for the y axis, defaults to |
distrFn |
Character. The name of the distribution function to be to be added to the tail plot. |
param |
Vector specifying the parameters of the distribution,
defaults to |
mean |
The mean of the normal distribution. |
sd |
The standard deviation of the normal distribution. Must be positive. |
df |
The degrees of freedom of the |
shape |
The shape parameter of the gamma distribution. Must be positive. |
scale |
The scale parameter of the gamma distribution. Must be
strictly positive, |
rate |
The rate parameter of the gamma distribution. An alternative way to specify the scale. |
... |
Other graphical parameters (see |
tailPlot
draws either a left-hand or right-hand tail plot of
the data x
. See for example Resnick (2007), p.105. The
left-hand tail plot plots the empirical distribution of the data
against the order statistics, for order statistic values below the
median. The right-hand tail plot plots one minus the empirical
distribution of the data against the order statistics, for order
statistic values above the median. The default is for the y-axis to be
plotted on a log scale.
tailPlotLine
adds a line for the specified distribution to an
already drawn tail plot. The distribution can be any distribution
which has default parameters, but if parameters need to be supplied
the distribution must have an argument param
which specifies
the parameters. This is the case for all distributions in the form
recommended in Scott et al (2009) and includes distributions
from the packages GeneralizedHyperbolic, SkewHyperbolic,
VarianceGamma and NormalLaplace (which is on R-Forge).
normTailPlotLine
,tTailPlotLine
and
gammaTailPlotLine
add the corresponding line
derived respectively from the given normal, , or gamma
distribution to an already drawn tail plot.
Returns NULL
invisibly.
David Scott [email protected]
Aas, Kjersti and Hobæk Haff, Ingrid (2006)
The generalised hyperbolic skew Student's -distribution.
Journal of Financial Econometrics, 4, 275–309.
Resnick, S. (2007) Heavy-Tail Phenomena, New York: Springer.
Scott, David J. and Würtz, Diethelm and Dong, Christine (2009) Software for Distributions in R. UseR: The R User Conference 2009 https://www.r-project.org/conferences/useR-2009/slides/Scott+Wuertz+Dong.pdf
### Draw tail plot of some data x <- rnorm(100, 1, 2) tailPlot(x) ### Add normal distribution line normTailPlotLine(x, mean = 1, sd = 2) ### Add t distribution line tTailPlotLine(x, df = 5, lty = 2) ### Use fitted values normTailPlotLine(x, mean = mean(x), sd = sd(x), lty = 3) ### Gamma distribution x <- rgamma(100, shape = 1, scale = 1) tailPlot(x) ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1) ### Left tail example tailPlot(x, side = "l") ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1, side = "l") ### Log scale on both axes tailPlot(x, side = "l", log = "xy") ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1, side = "l") ### Add line from a standard distribution with default parameters x <- rlnorm(100) tailPlot(x) tailPlotLine(x, distrFn = "lnorm") ### Add line from a distribution with 'param' argument require(VarianceGamma) param <- c(0,0.5,0,0.5) x <- rvg(100, param = param) tailPlot(x) tailPlotLine(x, distrFn = "vg", param = param)
### Draw tail plot of some data x <- rnorm(100, 1, 2) tailPlot(x) ### Add normal distribution line normTailPlotLine(x, mean = 1, sd = 2) ### Add t distribution line tTailPlotLine(x, df = 5, lty = 2) ### Use fitted values normTailPlotLine(x, mean = mean(x), sd = sd(x), lty = 3) ### Gamma distribution x <- rgamma(100, shape = 1, scale = 1) tailPlot(x) ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1) ### Left tail example tailPlot(x, side = "l") ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1, side = "l") ### Log scale on both axes tailPlot(x, side = "l", log = "xy") ### Add gamma distribution line gammaTailPlotLine(x, shape = 1, scale = 1, side = "l") ### Add line from a standard distribution with default parameters x <- rlnorm(100) tailPlot(x) tailPlotLine(x, distrFn = "lnorm") ### Add line from a distribution with 'param' argument require(VarianceGamma) param <- c(0,0.5,0,0.5) x <- rvg(100, param = param) tailPlot(x) tailPlotLine(x, distrFn = "vg", param = param)
Calculates an approximation to the Hessian of a function. Used for obtaining an approximation to the information matrix for maximum likelihood estimation.
tsHessian(param, fun, ...)
tsHessian(param, fun, ...)
param |
Numeric. The Hessian is to be evaluated at this point. |
fun |
A function of the parameters specified by |
... |
Values of other parameters of the function |
As a typical statistical application, the function fun
is the
log-likelihood function, param
specifies the maximum likelihood
estimates of the parameters of the distribution, and the data
constitutes the other parameter values required for determination of
the log-likelihood function.
The approximate Hessian matrix of the function fun
where
differentiation is with respect to the vector of parameters
param
at the point given by the vector param
.
This code was borrowed from the fBasics function, in the file ‘utils-hessian.R’ with slight modification. This was in turn borrowed from Kevin Sheppard's Matlab garch toolbox as implemented by Alexios Ghalanos in his rgarch package.
David Scott [email protected], Christine Yang Dong [email protected]
hyperbHessian
and summary.hyperbFit
in
GeneralizedHyperbolic.
### Consider Hessian of log(1 + x + 2y) ### Example from Lang: A Second Course in Calculus, p.74 fun <- function(param){ x <- param[1] y <- param[2] return(log(1 + x + 2*y)) } ### True value of Hessian at (0,0) trueHessian <- matrix( c(-1,-2, -2,-4), byrow = 2, nrow = 2) trueHessian ### Value from tsHessian approxHessian <- tsHessian(c(0,0), fun = fun) approxHessian maxDiff <- max(abs(trueHessian - approxHessian)) ### Should be approximately 0.045 maxDiff
### Consider Hessian of log(1 + x + 2y) ### Example from Lang: A Second Course in Calculus, p.74 fun <- function(param){ x <- param[1] y <- param[2] return(log(1 + x + 2*y)) } ### True value of Hessian at (0,0) trueHessian <- matrix( c(-1,-2, -2,-4), byrow = 2, nrow = 2) trueHessian ### Value from tsHessian approxHessian <- tsHessian(c(0,0), fun = fun) approxHessian maxDiff <- max(abs(trueHessian - approxHessian)) ### Should be approximately 0.045 maxDiff