Title: | Apply a PCA Like Procedure Suited for Multivariate Extreme Value Distributions |
---|---|
Description: | Dimension reduction for multivariate data of extreme events with a PCA-like procedure as described in Reinbott and Janßen (2024), <doi:10.48550/arXiv.2408.10650>. Tools for the necessary transformations of the data are provided. |
Authors: | Felix Reinbott [aut, cre] |
Maintainer: | Felix Reinbott <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.1 |
Built: | 2024-10-08 06:20:33 UTC |
Source: | CRAN |
Turn the given data into a compressed latent representation given by the fit of the max_stable_prcomp function. This is done by taking the max-matrix product of the data and the encoder matrix from the fit.
compress(fit, data)
fit |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
data |
array with same number of columns as the data of the fit object. |
An array of shape nrow(data), p giving the encoded representation of the data in p components. The components are also unit Frechet distributed, which should be taken into consideration in further analysis.
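The encoding step can be illustrated in base R without the package. This is a minimal sketch, not the package's implementation: `max_mat_prod` is a hypothetical stand-in for maxmatmul(), the encoder matrix `E` is made up rather than fitted, and the orientation (observations in rows, so the data is multiplied with the transposed encoder) is an assumption.

```r
# Hypothetical base-R stand-in for maxmatmul(): entry (i, j) is the
# maximum of the componentwise products of row i of A and column j of B.
max_mat_prod <- function(A, B) {
  out <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      out[i, j] <- max(A[i, ] * B[, j])
    }
  }
  out
}

# Two observations of d = 3 variables (one observation per row).
X <- matrix(c(1, 4, 2,
              3, 1, 5), nrow = 2, byrow = TRUE)

# Made-up encoder matrix of shape (p, d) with p = 2; a real one would
# come from a max_stable_prcomp fit.
E <- matrix(c(1.0, 0.5, 0.0,
              0.0, 0.2, 1.0), nrow = 2, byrow = TRUE)

# Each row of X is mapped to p max-linear combinations of its entries.
compressed <- max_mat_prod(X, t(E))
compressed
```

With these inputs the result has rows (2, 2) and (3, 5), i.e. one row per observation and one column per component.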
See also: max_stable_prcomp(), maxmatmul()
# generate some data with the desired margins
dat <- matrix(evd::rfrechet(300), 100, 3)
maxPCA <- max_stable_prcomp(dat, 2)
# look at the summary to obtain further information about
# the loadings, the space spanned and the loss function
summary(maxPCA)
# transform data to the compressed representation, which is
# p-dimensional, preserves the max-stable structure and is a
# numeric solution to optimal reconstruction
compr <- compress(maxPCA, dat)
# for visual examination, reconstruct the original vector
rec <- reconstruct(maxPCA, dat)
Find an optimal encoding of data of extremes using max-linear combinations by a distance minimization approach. This can be used to check whether the data approximately follows a generalized max-linear model. For details on the statistical procedure, consult the articles "F. Reinbott, A. Janßen, Principal component analysis for max-stable distributions (https://arxiv.org/abs/2408.10650)" and "M. Schlather, F. Reinbott, A semi-group approach to Principal Component Analysis (https://arxiv.org/abs/2112.04026)".
max_stable_prcomp(data, p, s = 3, n_initial_guesses = 150, norm = "l1", ...)
data |
array or data.frame of n observations of d variables with unit Frechet margins. The max-stable PCA is fitted to reconstruct this dataset with a rank p approximation. |
p |
integer between 1 and ncol(data). Determines the dimension of the encoded state, i.e. the number of max-linear combinations in the compressed representation. |
s |
(default = 3), numeric greater than 0. Hyperparameter for the stable tail dependence estimator used in the calculation. |
n_initial_guesses |
number of guesses to choose a valid initial value for optimization from. This procedure uses a pseudo random number generator so setting a seed is necessary for reproducibility. |
norm |
(default = "l1"), which norm to use for the spectral measure estimator. Currently only the l1 norm "l1" and the sup norm "linfty" are available. |
... |
additional parameters passed to |
An object of class max_stable_prcomp with the following slots:
p, the inserted value of the dimension.
decoder_matrix, an array of shape (d,p), where the columns represent the basis of the max-linear space for the reconstruction.
encoder_matrix, an array of shape (p,d), where the rows represent the loadings as max-linear combinations for the compressed representation.
reconstr_matrix, an array of shape (d,d), where the matrix is the mapping of the data to the reconstruction used for the distance minimization.
loss_fctn_value, a float representing the final loss function value of the fit.
optim_conv_status, an integer indicating the convergence of the optimizer if greater than 0.
# generate some data with the desired margins
dat <- matrix(evd::rfrechet(300), 100, 3)
maxPCA <- max_stable_prcomp(dat, 2)
# look at the summary to obtain further information about
# the loadings, the space spanned and the loss function
summary(maxPCA)
# transform data to the compressed representation, which is
# p-dimensional, preserves the max-stable structure and is a
# numeric solution to optimal reconstruction
compr <- compress(maxPCA, dat)
# for visual examination, reconstruct the original vector
rec <- reconstruct(maxPCA, dat)
Computes the max-matrix product of A and B, written $A \diamond B$, by calculating the entries with $(A \diamond B)_{ij} = \max_{l = 1, \dots, k} A_{il} B_{lj}$ for appropriate dimensions. This operation is particularly useful when working with multivariate extreme value distributions: if Z follows a multivariate extreme value distribution with standard Fréchet margins, then the max-matrix product of a matrix A and Z has the same margins up to scaling.
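The scaling statement can be made explicit with a short derivation. This is a sketch under stated assumptions: Z is max-stable with standard Fréchet margins, so that $P(Z \le z) = \exp(-\ell(1/z_1, \dots, 1/z_d))$ for a stable tail dependence function $\ell$ that is homogeneous of order one. The symbols $\ell$ and $a_{ij}$ (the entries of row $i$ of A, not all zero) are introduced here for illustration, with the convention $x / 0 = \infty$.

```latex
\begin{align*}
P\Big(\max_{j = 1, \dots, d} a_{ij} Z_j \le x\Big)
  &= P\big(Z_j \le x / a_{ij} \ \text{for all } j\big) \\
  &= \exp\!\big(-\ell(a_{i1}/x, \dots, a_{id}/x)\big) \\
  &= \exp\!\big(-\ell(a_{i1}, \dots, a_{id}) / x\big), \qquad x > 0 .
\end{align*}
```

Under these assumptions, each component of the max-matrix product is again Fréchet distributed with shape 1, now with scale $\ell(a_{i1}, \dots, a_{id})$ instead of 1.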
maxmatmul(A, B)
A |
a non-negative array of dim n, k |
B |
a non-negative array of dim k, l |
A non-negative array of dim n, l. The entries are given by the maximum of the componentwise multiplication of rows from A and columns from B.
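The definition of the entries can be spelled out in a few lines of base R. The function `max_mat_prod` below is a hypothetical reference implementation written for illustration only; it is not the package's code and makes no claim about how maxmatmul() is implemented internally.

```r
# Hypothetical base-R reference implementation of the max-matrix product.
max_mat_prod <- function(A, B) {
  A <- as.matrix(A)  # a plain vector becomes a one-column matrix
  B <- as.matrix(B)
  stopifnot(ncol(A) == nrow(B), all(A >= 0), all(B >= 0))
  out <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      # entry (i, j): largest componentwise product of row i of A
      # and column j of B
      out[i, j] <- max(A[i, ] * B[, j])
    }
  }
  out
}

A <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3)
B <- matrix(c(1, 2, 1, 2, 1, 2), 3, 2)
m1 <- max_mat_prod(A, B)
m1
```

For these matrices the first row of A is (1, 3, 5) and the first column of B is (1, 2, 1), so the top-left entry is max(1, 6, 5) = 6. The full result has rows (6, 10) and (8, 12).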
# Set up example matrices
A <- matrix(c(1,2,3,4,5,6), 2, 3)
B <- matrix(c(1,2,1,2,1,2), 3, 2)
# calling the function
m1 <- maxmatmul(A, B)
# can be used for matrix-vector multiplication as well
v <- c(7,4,7)
m2 <- maxmatmul(A, v)
m3 <- maxmatmul(v,v)
Map the data to the reconstruction given by the fit of the max_stable_prcomp function. This is done by taking the max-matrix product of the data and the reconstruction matrix from the fit.
reconstruct(fit, data)
fit |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
data |
array with same number of columns as the data of the fit object. |
An array of shape nrow(data), ncol(data) giving the reconstruction of the data obtained from the p-dimensional encoded representation.
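The relation between compressing, decoding and the reconstruction matrix can be sketched in base R. Everything here is illustrative: `max_mat_prod` is a hypothetical stand-in for maxmatmul(), the matrices `E` and `D` are made up rather than fitted, and treating the reconstruction matrix as the max-matrix product of decoder and encoder is an assumption that the documentation does not state explicitly.

```r
# Hypothetical base-R stand-in for maxmatmul().
max_mat_prod <- function(A, B) {
  out <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      out[i, j] <- max(A[i, ] * B[, j])
    }
  }
  out
}

# Two observations of d = 3 variables (one observation per row).
X <- matrix(c(1, 4, 2,
              3, 1, 5), nrow = 2, byrow = TRUE)

# Made-up encoder (p, d) and decoder (d, p) with p = 2.
E <- matrix(c(1.0, 0.5, 0.0,
              0.0, 0.2, 1.0), nrow = 2, byrow = TRUE)
D <- matrix(c(1.0, 0.0,
              0.5, 0.3,
              0.0, 1.0), nrow = 3, byrow = TRUE)

# Assumed reconstruction matrix of shape (d, d).
H <- max_mat_prod(D, E)

# Reconstruct in one step, or compress first and decode afterwards.
rec_direct   <- max_mat_prod(X, t(H))
rec_two_step <- max_mat_prod(max_mat_prod(X, t(E)), t(D))
```

Because the max-matrix product of non-negative matrices is associative, both routes give the same array, which has one row per observation and d columns.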
See also: max_stable_prcomp(), maxmatmul()
# generate some data with the desired margins
dat <- matrix(evd::rfrechet(300), 100, 3)
maxPCA <- max_stable_prcomp(dat, 2)
# look at the summary to obtain further information about
# the loadings, the space spanned and the loss function
summary(maxPCA)
# transform data to the compressed representation, which is
# p-dimensional, preserves the max-stable structure and is a
# numeric solution to optimal reconstruction
compr <- compress(maxPCA, dat)
# for visual examination, reconstruct the original vector
rec <- reconstruct(maxPCA, dat)
Print summary of a max_stable_prcomp object.
## S3 method for class 'max_stable_prcomp'
summary(object, ...)
object |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
... |
additional unused arguments. |
Same as base::print().
Since the dataset is intended to be transformed for PCA, this function takes a dataset transformed_data and transforms the margins to the marginal distribution of the dataset orig_data.
transform_orig_margins(transformed_data, orig_data)
transformed_data |
arraylike data of dimension n, d |
orig_data |
arraylike data of dimension n, d |
An array of dimension n, d whose columns are the transformed columns of transformed_data and approximately follow the same marginal distributions as the columns of orig_data.
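One plausible way to map a column back to the margins of the original data is through ranks and empirical quantiles. The sketch below illustrates that idea for a single column; `to_orig_margins` is a hypothetical helper, and the rank scaling by n + 1 and the use of stats::quantile() are assumptions, not a description of what transform_orig_margins() does internally.

```r
# Hypothetical helper: give `transformed` the empirical margin of `orig`.
to_orig_margins <- function(transformed, orig) {
  n <- length(transformed)
  # position of each value within its own sample, strictly inside (0, 1)
  u <- rank(transformed) / (n + 1)
  # read off the matching empirical quantiles of the original data
  stats::quantile(orig, probs = u, names = FALSE)
}

set.seed(1)
orig <- rexp(500)
# put the sample on a unit Frechet scale via its ranks (also an assumption)
transformed <- -1 / log(rank(orig) / (length(orig) + 1))
back <- to_orig_margins(transformed, orig)
```

The mapping preserves the ordering of the transformed values and returns values within the range of the original sample. For a matrix, the same helper could be applied column by column.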
See also: max_stable_prcomp(), transform_unitfrechet(), mev::fit.gev() for information about why to transform data.
# create a sample
dat <- rnorm(1000)
transformed_dat <- transform_unitpareto(dat)
Transforms the columns of a dataset to unit Frechet margins using the empirical distribution function, so that the theoretical requirements for applying max_stable_prcomp are satisfied.
transform_unitfrechet(data)
data |
array or vector with the data whose columns are to be transformed |
An array or vector of the same shape and type as data, containing the transformed data with unit Frechet margins.
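The idea behind an empirical transformation to unit Frechet margins can be sketched in base R. The helper `to_unit_frechet` is hypothetical, and the choice of rank / (n + 1) as the empirical distribution function is an assumption; the package may scale the ranks differently.

```r
# Hypothetical helper: rank-based transformation to unit Frechet margins.
to_unit_frechet <- function(x) {
  n <- length(x)
  # empirical distribution function values, kept strictly inside (0, 1)
  u <- rank(x) / (n + 1)
  # quantile function of the unit Frechet distribution, F(z) = exp(-1 / z)
  -1 / log(u)
}

set.seed(1)
dat <- rnorm(1000)
z <- to_unit_frechet(dat)

# the sample median should be close to the unit Frechet median 1 / log(2)
median(z)
```

The transformation is strictly increasing, so it keeps the ordering of the observations and only changes the margin. For a matrix, `apply(data, 2, to_unit_frechet)` would transform each column separately.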
See also: max_stable_prcomp(), transform_orig_margins(), mev::fit.gev() for information about why to transform data.
# sample some data
dat <- rnorm(1000)
transformed_dat <- transform_unitfrechet(dat)
# Look at a plot of distribution
boxplot(transformed_dat)
plot(stats::ecdf(transformed_dat))
Transforms the columns of a dataset to unit Pareto margins using the empirical distribution function, so that the theoretical requirements for applying max_stable_prcomp are satisfied.
transform_unitpareto(data)
data |
array or vector with the data whose columns are to be transformed |
An array or vector of the same shape and type as data, containing the transformed data with unit Pareto margins.
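The unit Pareto case works the same way as the unit Frechet one, with a different quantile function. The helper `to_unit_pareto` below is hypothetical, and the choice of rank / (n + 1) as the empirical distribution function is an assumption rather than the package's documented behaviour.

```r
# Hypothetical helper: rank-based transformation to unit Pareto margins.
to_unit_pareto <- function(x) {
  n <- length(x)
  # empirical distribution function values, kept strictly inside (0, 1)
  u <- rank(x) / (n + 1)
  # quantile function of the unit Pareto distribution, F(z) = 1 - 1 / z
  1 / (1 - u)
}

set.seed(1)
dat <- rnorm(1000)
z <- to_unit_pareto(dat)

# the sample median should be close to the unit Pareto median 2
median(z)
```

All transformed values are larger than 1, which is the lower end point of the unit Pareto distribution, and the ordering of the observations is unchanged.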
See also: max_stable_prcomp(), transform_orig_margins(), mev::fit.gev() for information about why to transform data.
# sample some data
dat <- rnorm(1000)
transformed_dat <- transform_unitpareto(dat)
# Look at a plot of distribution
boxplot(transformed_dat)
plot(stats::ecdf(transformed_dat))