This package implements a spectral approach to the parametric
estimation of Hawkes processes from Binned Observations through Whittle
likelihood (HawkesBOW). It is based on the results of the article (Cheysson and Lang 2020).
Installation
You can install the released version of hawkesbow from CRAN with:
install.packages("hawkesbow")
Alternatively, you can install the up-to-date version of hawkesbow
from GitHub
with:
devtools::install_github("fcheysson/hawkesbow")
Details of the package
The Hawkes process
Hawkes processes form a family of models for point processes for
which the occurrence of an event temporarily increases the probability
of future events occurring (Alan G. Hawkes
1971). Formally, a Hawkes process N on ℝ can be defined from its conditional
intensity λ(⋅): where the
variables Ti denote the
arrival times of the process, the immigration intensity η is a positive constant, and the
reproduction function h : ℝ ≥ 0 → ℝ ≥ 0
is a measurable function. The reproduction function can be further
decomposed as h = μh*,
where μ = ∫ℝh(t)dt < 1
is called the and h* is a true density
function, ∫ℝh*(t)dt = 1,
called the .
Alternatively, the Hawkes process can be constructed as a poissonian
cluster process (Alan G. Hawkes and Oakes
1974). The process consists of a flow of immigrants, the
cluster centres, arriving according to a homogeneous Poisson process of
intensity η. Then, an
immigrant arriving at time Ti generates
children according to an inhomogeneous Poisson process of
intensity h(⋅ − Ti).
These in turn independently generate children according to the same
process, and so on ad infinitum. The processes consisting of an
immigrant and all its descendants are therefore branching processes, and
are independent of each other. Finally, the process resulting from the
superposition of all these branching processes is a Hawkes process of
conditional intensity λ(⋅)
(see Figure ).
This package also supports non-causal Hawkes processes, for which the
reproduction kernel h* may take non-negative
values on ℝ ≤ 0. Such
processes are not defined through their conditional intensity functions,
but through the poisson cluster representation, where each individual
can potentially generate offsprings both in the future and in the
past.
The count sequence of a Hawkes process is the time series generated
by the event counts of the process, that is the series obtained by
counting the events of the process on intervals of fixed length.
Formally, the count sequence with bin size Δ associated to the point process
N is the sequence (Xk)k ∈ ℤ.
Estimation procedure
This package fully supports the parametric estimation of Hawkes
processes from their count sequences via minimisation of the Whittle
likelihood, and partially supports the estimation from the arrival times
via maximisation of the likelihood.
Maximum likelihood estimation
The parameters of a Hawkes process with arrival times (Ti)1 ≤ i ≤ n
on the interval [0, T] can be
estimated by maximising the following pseudo-likelihood $$\mathcal L_n(\theta) = \left( \prod_{i=1}^n
\lambda(T_i) \right) \exp \left( - \int_0^T \lambda(s) \mathrm ds
\right).$$ Note that each step of the optimisation is usually of
complexity 𝒪(n2).
Minimisation of the Whittle likelihood
Alternatively, the parameters of a Hawkes process with count sequence
(Xk)1 ≤ k ≤ N
can be estimated by minimising the log-spectral (Whittle) likelihood
$$\mathcal L_N^\prime(\theta) =
\frac{1}{4\pi} \int_{-\pi}^\pi \left( \log f_\theta(\omega) +
\frac{I_N(\omega)}{f_\theta(\omega)} \right) \mathrm d\omega,$$
where $I_N(\omega) = (2\pi N)^{-1} \left|
\sum_{k=1}^N X_k \exp(-ik\omega) \right|^2$ denotes the
periodogram of (Xk) and fθ(⋅) its
spectral density function. Note that each step of the optimisation is of
complexity 𝒪(n) and the
periodogram can be calculated in complexity 𝒪(nlog n) using a Fast
Fourier Transform algorithm, making this method usually faster than
maximum likelihood estimation.
Spectral density function of the count sequence
Note that the spectral density function fθ of (Xk) can be
related to the spectral density f′θ of the
time-continuous count sequence (Xt)t ∈ ℝ
by taking into account spectral aliasing: fθ(ω) = ∑k ∈ ℤf′θ(ω + 2kπ).
In turn, f′θ can be
derived from the Bartlett spectrum of the Hawkes process (Daley and Vere-Jones 2003, example 8.2(e)) and
is given by $$f'_\theta(\omega) =
\frac{4\Delta}{\omega^2}\frac{\eta}{1 - \mu}
\sin^2\left(\frac{\omega}{2}\right) \left| 1 - \widetilde h\left(
\frac{\omega}{\Delta} \right) \right|^{-2},$$ where h̃ denotes the Fourier transform of
h: h̃(ω) = ∫ℝh(t)exp (−iωt)dt.
Main usage
This package supports both Hawkes processes and their count
sequences.
Simulation of Hawkes processes
Hawkes processes can be simulated by the function
hawkes
:
hawkes(end, fun, repr, family, M=null, ...)
where end
denotes the end of the interval [0, T]. This function uses the
cluster representation:
First, the immigrants are drawn according to a (potentially
inhomogeneous) Poisson process with intensity measure
fun
.
Second, the number of offsprings of an immigrant are drawn from a
Poisson distribution with intensity repr
.
Third, these offsprings are distributed according to the
family
distribution.
Then, further offsprings are generated according to the last two
steps.
The argument fun
can take a numeric value or be
specified as a function, in which case the argument M
must
be specified as an upper bound on fun
(to allow simulation
of the immigration process by thinning). The argument
family
can either be specified as a string
name
corresponding to a distribution with random generation
function rname
(for example exp
for
rexp
), or directly as a random generation function. The
optional arguments ...
are passed to the random generation
function specified by family
.
This returns a list of class hawkes
, whose realisations
are stored in the member p
. Other members of the list are
mainly used for support functions.
Examples
- Simulate a Hawkes process with immigration intensity η = 1, reproduction mean μ = 0.5 and reproduction kernel
h*(t) = 2e−2t1{t ≥ 0}
on [0, 10]:
x <- hawkes(10, fun = 1, repr = 0.5, family = "exp", rate = 2)
- Simulate a Hawkes process with inhomogeneous immigration process
with intensity η(t) = 1 + sin (t),
reproduction mean μ = 0.25 and
[0, 1]-triangular reproduction kernel
h*(t) = (1 − t)1{0 ≤ t ≤ 1}:
x <- hawkes(10, fun=function(y) {1+sin(y)}, M=2, repr=0.25,
family=function(n) {1 - sqrt(1 - runif(n))})
Plot function
- Hawkes processes can be plotted with the function
plot.hawkes
:
plot.hawkes(x, intensity = FALSE, precision = 1e3, fun = NULL, repr = NULL, family = NULL,
M = NULL, ...)
If x
is of class hawkes
, as for objects
returned by the function hawkes
, arguments fun
through M
can be ignored. If intensity
is set
to FALSE
, this plots the genealogy of the simulated Hawkes
process (as in Figure , bottom). If it is set to TRUE
, this
plots the conditional intensity of the process (as in Figure , top).
Estimation of Hawkes processes
Two functions implement the estimation of Hawkes processes:
mle
from arrival times (Ti) and
whittle
from count sequences (Xk). While the
optimisation procedure rely on existing functions (see below),
calculations of both the usual and Whittle likelihood functions are done
in C++ via Rcpp (Eddelbuettel and François
2011) and RcppArmadillo (Eddelbuettel and
Sanderson 2014).
By maximum likelihood
The maximum likelihood method is implemented by the function
mle(events, kern, end, init = NULL, opts = NULL, ...)
events
holds the arrival times (Ti) in
ascending order, kern
must be a string (partially) matching
one of the reproduction kernels (see below) and end
denotes
the endpoint T of observation
of the process.
The optimisation of the maximum likelihood function is done by the
function nloptr
from the package nloptr (Johnson, n.d.) with algorithm L-BFGS-B, where
the derivatives of the likelihood are calculated explicitly. By default,
parameters are constrained to be positive, and additionally μ is constrained to be below 1.
However, both the arguments opts
and ...
are
passed on to nloptr
, so the algorithm, the constraints, or
any other parameter of the optimisation procedure can be changed.
Example of use:
x = hawkes(100, fun = 1, repr = .5, family = "exp", rate = 1)
mle(x$p, "Exponential", x$end)
By minimisation of the Whittle likelihood
The Whittle likelihood method is implemented by the function
whittle(counts, kern, binsize = NULL, trunc = 5L, init = NULL, ...)
counts
holds the count sequence (Xk),
kern
must be a string (partially) matching one of the
reproduction kernels (see below), binsize
denotes the bin
size Δ and trunc
is the number of foldings due to aliasing taken into account.
The optimisation of the Whittle likelihood function is done by the
function optim
, with algorithm L-BFGS-B where the
derivatives of the likelihood are approximated by finite differences. By
default, parameters are constrained to be positive, and additionally
μ is constrained to be below
1. However, the argument ...
is passed to
optim
, so any optimisation parameter can be changed.
Example of use:
x = hawkes(1000, fun = 1, repr = .5, family = "exp", rate = 1)
y = discrete(x, binsize = 1)
whittle(y, "Exponential", 1)
Note that discrete
is a useful function to create the
count sequence (Xk) associated
with an object x
of class hawkes
.
Other functions
Other functions are all well documented and with examples. This
subsection will be expanded in the future.
Reproduction kernels
We introduce the reproduction kernels that are currently implemented
in this package. Recall that the Fourier transform of a reproduction
kernel is given by $$\widetilde
{h^\ast}(\omega) = \int_\mathbb R \exp(-i\omega t) h(t) \mathrm
dt,$$ and that it is a Hermitian function $\widetilde {h^\ast}(-\omega) = \overline{\widetilde
{h^\ast}(\omega)}$.
The exponential kernel
This is the exponential density function with parameter β > 0: h*(t) = βe−βt1{t > 0}.
Its Fourier transform is $$\widetilde
{h^\ast}(\omega) = \beta \frac{1}{\beta + i\omega} = \beta \frac{\beta -
i \omega}{\beta^2 + \omega^2}.$$
The exponential kernel can be specified with the string
Exponential
and the parameter β with the usual argument
rate
. Both maximum and Whittle likelihood methods are fully
implemented for exponential kernels. Moreover, the likelihood function
is implemented in complexity 𝒪(n), using the relations in (Ozaki and Ogata 1979).
The symmetric exponential kernel
This is a symmetrised version of the exponential density function
with parameter β > 0: $$h^\ast(t) = \frac 1 2 \beta e^{-\beta
|t|}.$$ Its Fourier transform is $$\widetilde {h^\ast}(\omega) =
\frac{\beta^2}{\beta^2 + \omega^2}.$$
The symmetric exponential kernel can be specified with the string
SymmetricExponential
and the parameter β with the argument
rate
. Only the Whittle likelihood method is implemented for
symmetric exponential kernels. Note that it is a non-causal kernel, as
h*(t) ≠ 0
for t < 0.
The gaussian kernel
This is the gaussian density function with mean ν ∈ ℝ and variance σ2 > 0: $$h^\ast(t) = \frac{1}{\sigma
\sqrt{2\pi}}\exp\left(-\frac{(t-\nu)^2}{2\sigma^2}\right).$$ Its
Fourier transform is $$\widetilde
{h^\ast}(\omega) =
\exp\left(-\frac{\sigma^2\omega^2}{2}-i\nu\omega\right).$$
The gaussian kernel can be specified with the string
Gaussian
and its parameters ν and σ with the usual arguments
mean
and sd
respectively. Only the Whittle
likelihood method is implemented for gaussian kernels. Note that it is a
non-causal kernel, as h*(t) ≠ 0 for
t < 0.
The power law kernel
This is a normalised and shifted power law function, with shape θ > 0 and scale a > 0: h*(t) = θaθ(t + a)−θ − 11{θ > 0}.
For positive ω, its Fourier
transform is given by $$\widetilde{h_\theta^\ast}(\omega) = \theta
\exp(i\omega a)E_{\theta + 1} (i\omega a),$$ where Eθ(ix)
denotes the integral Eθ(ix) = ∫1∞t−θexp (−ixt)dt.
With successive integration by parts, this integral can be related to
Eθ′(ix),
with 0 < θ′ ≤ 1.
If θ′ = 1 or equivalently
θ ∈ ℕ ≠ 0, the
integral E1(ix),
called the exponential integral with imaginary argument, can be related
the trigonometric integrals and calculated using Padé approximants (Rowe et al. 2015, Appendix B), accurate to
better than 10−16. The
function E1_imaginary
implements this approximation.
If θ′ < 1 or
equivalently θ ∈ ℝ \ ℕ, the
integral Eθ(ix)
can be related to the incomplete gamma function with imaginary argument
Γi(x, α) = ∫x∞tα − 1e−itdt,
where 0 < α < 1. We
implemented Taylor approximations of this integral, accurate to better
than 10−7, in the function
inc_gamma_imag
.
The power law kernel can be specified with the string
PowerLaw
and its parameters θ and a with the arguments
shape
and scale
respectively. Both maximum and
Whittle likelihood methods are implemented for power law kernels.
The Pareto kernels
This is the Pareto density function with shape θ > 0 and scale a > 0: hθ*(t) = θaθt−θ − 11{t > a}.
For positive ω, its Fourier
transform is given by $$\widetilde{h_\theta^\ast}(\omega) = \theta
E_{\theta + 1} (i\omega a),$$ see above for the definition of
Eθ(ix).
Only Pareto kernels with fixed θ = 1, 2, and 3
have been implemented and can specified with the strings
Pareto1
, Pareto2
and Pareto3
respectively, with parameter a
specified with the argument scale
. Only the Whittle method
is available for Pareto kernels.