The mev
package was
originally introduced to implement the exact unconditional sampling
algorithms in Dombry, Engelke, and Oesting
(2016). The two algorithms therein allow one to simulate simple
max-stable random vectors. The implementation will work efficiently for
moderate dimensions.
There are two main functions, rmev
and
rmevspec
. rmev
samples from simple max-stable
processes, meaning it will return an n × d matrix of samples,
where each of the column has a sample from a unit Frechet distribution.
In constrast, rmevspec
returns sample on the unit simplex
from the spectral (or angular) measure. One could use this to test
estimation based on spectral densities, or to construct samples from
Pareto processes.
The syntax is
library(mev)
#Sample of size 1000 from a 5-dimensional logistic model
x <- rmev(n=1000, d=5, param=0.5, model="log")
#Marginal parameters are all standard Frechet, meaning GEV(1,1,1)
apply(x, 2, function(col){ismev::gev.fit(col, show=FALSE)$mle})
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.9914788 0.9788916 0.9795293 0.9570961 0.9907859
## [2,] 0.9974789 1.0069399 1.0083998 0.9776768 0.9943595
## [3,] 1.0145913 1.0674027 1.0874606 1.0888960 1.0225028
#Sample from the corresponding spectral density
w <- rmevspec(n=1000, d=5, param=0.5, model="log")
#All rows sum to 1 by construction
head(rowSums(w))
## [1] 1 1 1 1 1 1
## [1] 0.20 0.20 0.19 0.21 0.20
The different models implemented are described in Dombry, Engelke, and Oesting (2016), but some other models can be found and are described here. Throughout, we consider d-variate models and let 𝔹d be the collection of all nonempty subsets of {1, …, d}.
The logistic model (log
) of Gumbel (1960) has distribution function for
α > 1. The spectral measure
density is
The alog
model was proposed by Tawn (1990). It shares the same parametrization
as the evd
package, merely replacing the algorithm for the
generation of logistic variates. The distribution function of the d-variate asymmetric logistic
distribution is
The parameters θi, b must be provided in a list and represent the asymmetry parameter. The sampling algorithm, from Stephenson (2003) gives some insight on the construction mechanism as a max-mixture of logistic distributions. Consider sampling Zb from a logistic distribution of dimension |b| (or Fréchet variates if |b| = 1) with parameter αb (possibly recycled). Each marginal value corresponds to the maximum of the weighted corresponding entry. That is, Xi = maxb ∈ 𝔹dθi, bZi, b for all i = 1, …, d. The max-mixture is valid provided that ∑b ∈ 𝔹dθi, b = 1 for i = 1, …, d. As such, empirical estimates of the spectral measure will almost surely place mass on the inside of the simplex rather than on subfaces.
The neglog
distribution function due to Galambos (1975) is for α ≥ 0 (Dombry, Engelke, and Oesting 2016). The
associated spectral density is
The asymmetric negative logistic (aneglog
) model is
alluded to in Joe (1990) as a
generalization of the Galambos model. It is constructed in the same way
as the asymmetric logistic distribution; see Theorem~1 in Stephenson (2003). Let αb ≤ 0 for all
b ∈ 𝔹d and
θi, b ≥ 0
with ∑b ∈ 𝔹dθi, b = 1
for i = 1, …, d; the
distribution function is In particular, it does not correspond to the
``negative logistic distribution’’ given in e.g., Section 4.2 of Coles and Tawn (1991) or Section3.5.3 of Kotz and Nadarajah (2000). The latter is not a
valid distribution function in dimension d ≥ 3 as the constraints therein on
the parameters θi, b
are necessary, but not sufficient.
Joe (1990) mentions generalizations of
the distribution as given above but the constraints were not enforced
elsewhere in the literature. The proof that the distribution is valid
follows from Theorem~1 of Stephenson
(2003) as it is a max-mixture. Note that the parametrization of
the asymmetric negative logistic distribution does not match the
bivariate implementation of rbvevd
.
This multivariate extension of the logistic, termed multilogistic
(bilog
) proposed by M.-O. Boldi
(2009), places mass on the interior of the simplex. Let W ∈ 𝕊d
be the solution of where Cj = Γ(d − αj)/Γ(1 − αj)
for j = 1, …, d and
U ∈ 𝕊d
follows a d-mixture of
Dirichlet with the jth
component being 𝒟(1 − δjαj),
so that the mixture has density function for 0 < αj < 1, j = 1, …, d.
The spectral density of the multilogistic distribution is thus for αj ∈ (0, 1)
(j = 1, …, d).
The Dirichlet (ct
) model of Coles
and Tawn (1991) for αj > 0.
The angular density of the scaled extremal Dirichlet
(sdir
) model with parameters ρ > −min (α) and
α ∈ ℝ+d
is given, for all w ∈ 𝕊d,
by where c(α, ρ)
is the d-vector with entries
Γ(αi + ρ)/Γ(αi)
for i = 1, …, d and
⟨⋅, ⋅⟩ denotes the inner product
between two vectors.
The Huesler–Reiss model (hr
), due to Hüsler and Reiss (1989), is a special case of
the Brown–Resnick process. While Engelke et al.
(2015) state that H"usler–Reiss variates can be sampled following
the same scheme, the spatial analog is conditioned on a particular site
(s0), which
complicates the comparisons with the other methods.
Let I−j = {1, …, d} \ {j} and λij2 ≥ 0 be entries of a strictly conditionally negative definite matrix Λ, for which λij2 = λji2. Then, following Nikoloulopoulos, Joe, and Li (2009) (Remark~2.5) and Huser and Davison (2013), we can write the distribution function as where the partial correlation matrix Σ−j has elements and λii = 0 for all i ∈ I−j so that the diagonal entries 𝜚i, i; j = 1. Engelke et al. (2015) uses the covariance matrix with entries are 𝜍 = 2(λij2 + λkj2 − λik2), so the resulting expression is evaluated at 2λ.j2 − log (xj/x−j) instead. We recover the same expression by standardizing, since this amounts to division by the standard deviations 2λ.j
The package implementation has a bivariate implementation of the H"usler–Reiss distribution with dependence parameter r, with rik = 1/λik or $2/r=\sqrt{2\gamma(\boldsymbol{h})}$ for h = ∥si − si∥ for the Brown–Resnick model. In this setting, it is particularly easy since the only requirement is non-negativity of the parameter. For inference in dimension d > 2, one needs to impose the constraint Λ = {λij2}i, j = 1d ∈ 𝒟 (cf. Engelke et al. (2015), p.3), where denotes the set of symmetric conditionally negative definite matrices with zero diagonal entries. An avenue to automatically satisfy these requirements is to optimize over a symmetric positive definite matrix parameter 𝛴 = L⊤L, where L is an upper triangular matrix whose diagonal element are on the log-scale to ensure uniqueness of the Cholesky factorization; see Pinheiro and Bates (1996). By taking one can perform unconstrained optimization for the non-zero elements of L which are in one-to-one correspondence with those of Λ.
It easily follows that generating Z from a d − 1 dimensional log-Gaussian distribution with covariance Co(Zi, Zk) = 2(λij2 + λkj2 − λik2) for i, k ∈ I−j with mean vector −2λ•j2 gives the finite dimensional analog of the Brown–Resnick process in the mixture representation of Dombry, Engelke, and Oesting (2016).
The function checks conditional negative definiteness of the matrix. The easiest way to do so negative definiteness of Λ with real entries is to form $\tilde{\boldsymbol{\Lambda}}=\mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^\top$, where P is an d × d matrix with ones on the diagonal, −1 on the (i, i + 1) entries for i = 1, …d − 1 and zeros elsewhere. If the matrix Λ ∈ 𝒟, then the eigenvalues of the leading (d − 1) × (d − 1) submatrix of $\tilde{\boldsymbol{\Lambda}}$ will all be negative.
For a set of d locations, one can supply the variogram matrix as valid input to the method.
The Brown–Resnick process (br
) is the functional
extension of the H"usler–Reiss distribution, and is a max-stable process
associated with the log-Gaussian distribution. It is often in the
spatial setting conditioned on a location (typically the origin). Users
can provide a variogram function that takes distance as argument and is
vectorized. If vario
is provided, the model will simulate
from an intrinsically stationary Gaussian process. The user can
alternatively provide a covariance matrix sigma
obtained by
conditioning on a site, in which case simulations are from a stationary
Gaussian process. See Engelke et al.
(2015) or Dombry, Engelke, and Oesting
(2016) for more information.
The extremal Student (extstud
) model of Nikoloulopoulos, Joe, and Li (2009), eq. 2.8,
with unit Fréchet margins is where Td − 1 is the
distribution function of the $d-1 $ dimensional Student-t distribution and the partial
correlation matrix R−j has
diagonal entry $$r_{i,i;j}=1, \qquad
r_{i,k;j}=\frac{\rho_{ik}-\rho_{ij}\rho_{kj}}{\sqrt{1-\rho_{ij}^2}\sqrt{1-\rho_{kj}^2}}$$
for i ≠ k, i, k ∈ I−j.
The user must provide a valid correlation matrix (the function checks for diagonal elements), which can be obtained from a variogram.
The Dirichlet mixture (dirmix
) proposed by M.-O. Boldi and Davison (2007), see Dombry, Engelke, and Oesting (2016) for details
on the mixture. The spectral density of the model is The argument
param
is thus a d × m matrix of
coefficients, while the argument for the m-vector weights
gives
the relative contribution of each Dirichlet mixture component.
The Smith model (smith
) is from the unpublished report
of Smith (1990). It corresponds to a
moving maximum process on a domain 𝕏.
The de Haan representation of the process is where {ζi, ηi}i ∈ ℕ
is a Poisson point process on ℝ+ × 𝕏 with intensity measure
ζ−2dζdη
and h is the density of the
multivariate Gaussian distribution. Other h could be used in principle, but
are not implemented.