| Title: | Apply a PCA Like Procedure Suited for Multivariate Extreme Value Distributions |
|---|---|
| Description: | Dimension reduction for multivariate data of extreme events with a PCA like procedure as described in Reinbott, Janßen, (2024), <doi:10.48550/arXiv.2408.10650>. Tools for necessary transformations of the data are provided. |
| Authors: | Felix Reinbott [aut, cre] |
| Maintainer: | Felix Reinbott <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-05-24 08:35:28 UTC |
| Source: | https://github.com/cran/maxstablePCA |
Turn the given data into a compressed latent representation given by the fit of the max_stable_prcomp function. This is done by taking the max-matrix product of the data and the encoder matrix from the fit.
compress(fit, data)
fit |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
data |
array with same number of columns as the data of the fit object. |
An array of shape nrow(data), p giving the encoded representation of the data in p components. The components are also unit Frechet distributed, which is to be taken into consideration for further analysis.
max_stable_prcomp(), maxmatmul()
    # generate some data with the desired margins
    dat <- matrix(evd::rfrechet(300), 100, 3)
    maxPCA <- max_stable_prcomp(dat, 2)

    # look at summary to obtain further information about
    # loadings, the space spanned and loss function
    summary(maxPCA)

    # transform data to compressed representation
    # for a representation that is p-dimensional,
    # preserves the max-stable structure and is numeric solution to
    # optimal reconstruction.
    compr <- compress(maxPCA, dat)

    # For visual examination reconstruct original vector from compressed representation
    rec <- reconstruct(maxPCA, dat)
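As a rough illustration of the max-matrix product behind compress, the following base-R sketch encodes two observations with a made-up encoder matrix. The helper max_mat_prod and the values in H and x are assumptions chosen for illustration; they are not taken from the package or from a real fit.

```r
# Base-R stand-in for a max-matrix product:
# entry (i, j) is the maximum over r of A[i, r] * B[r, j].
max_mat_prod <- function(A, B) {
  stopifnot(ncol(A) == nrow(B))
  out <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      out[i, j] <- max(A[i, ] * B[, j])
    }
  }
  out
}

# Hypothetical encoder of shape (p, d) = (2, 3); each row is one loading vector.
H <- matrix(c(0.5, 0.3, 0.0,
              0.0, 0.2, 0.9), nrow = 2, byrow = TRUE)

# Two hypothetical observations of d = 3 variables, one per row.
x <- matrix(c( 2, 1, 4,
              10, 5, 1), nrow = 2, byrow = TRUE)

# Encode every row of x: the result has shape (nrow(x), p).
compressed <- max_mat_prod(x, t(H))
compressed
# rows: (1, 3.6) and (5, 1)
```

Each compressed component is the largest weighted coordinate of the observation, so a single dominant variable determines the value of a component.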
Measurements and geographical information about daily average river discharges (in m^3/s) at 13 measurement stations from the Elbe river network from 31.12.1988 to 30.12.2010 for the train data and from 01.01.2010 to 31.12.2020 for the test data.
data(elbe)
A named list containing different data files
A list containing the date of the measurement and measurements of the raw discharge data as data.frame at the 13 stations,
and a data.frame containing the maximal discharge between the date "from" and "to".
The blockmax dataset only considers the maximal value for the summer months June to September to reduce seasonal trends and temporal dependence.
Same structure as the two train data.frame objects, but only contains data from 01.01.2011 to 31.12.2020.
A data.frame object containing the station name, approximate latitude and longitude of the measurement station, the river measured and the next downstream station
Datenportal der FGG Elbe https://www.elbe-datenportal.de
Find an optimal encoding of data of extremes using max-linear combinations by a distance minimization approach. Can be used to check whether the data approximately follows a generalized max-linear model. For details on the statistical procedure, consult the articles "F. Reinbott, A. Janßen, Principal component analysis for max-stable distributions (https://arxiv.org/abs/2408.10650)" and "M. Schlather, F. Reinbott, A semi-group approach to Principal Component Analysis (https://arxiv.org/abs/2112.04026)".
max_stable_prcomp(data, p, s = 3, n_initial_guesses = 150, norm = "l1", optim_style = "full", ...)
data |
array or data.frame of n observations of d variables with unit Frechet margins. The max-stable PCA is fitted to reconstruct this dataset with a rank p approximation. |
p |
integer between 1 and ncol(data). Determines the dimension of the encoded state, i.e. the number of max-linear combinations in the compressed representation. |
s |
(default = 3), numeric greater than 0. Hyperparameter for the stable tail dependence estimator used in the calculation. |
n_initial_guesses |
number of guesses to choose a valid initial value for optimization from. This procedure uses a pseudo random number generator, so setting a seed is necessary for reproducibility. |
norm |
(default "l1") which norm to use for the spectral measure estimator. Currently only the l1 norm "l1" and the sup norm "linfty" are available. |
optim_style |
(default "full") choose between two different optimization strategies. The default, "full", optimizes both matrices simultaneously. The other choice, "alternating", fixes one matrix and optimizes the other matrix until convergence, then optimizes the first matrix in the same style. This can lead to more accurate results in some cases. |
... |
additional parameters passed to |
An object of class max_stable_prcomp with the following slots: p, the inserted value of the dimension; decoder_matrix, an array of shape (d,p) whose columns represent the basis of the max-linear space for the reconstruction; encoder_matrix, an array of shape (p,d) whose rows represent the loadings as max-linear combinations for the compressed representation; reconstr_matrix, an array of shape (d,d) that maps the data to the reconstruction used for the distance minimization; loss_fctn_value, a float representing the final loss function value of the fit; optim_conv_status, an integer indicating the convergence of the optimizer if greater than 0.
    # generate some data with the desired margins
    dat <- matrix(evd::rfrechet(300), 100, 3)
    maxPCA <- max_stable_prcomp(dat, 2)

    # look at summary to obtain further information about
    # loadings, the space spanned and loss function
    summary(maxPCA)

    # transform data to compressed representation
    # for a representation that is p-dimensional,
    # preserves the max-stable structure and is numeric solution to
    # optimal reconstruction.
    compr <- compress(maxPCA, dat)

    # For visual examination reconstruct original vector from compressed representation
    rec <- reconstruct(maxPCA, dat)
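The fit expects data with unit Frechet margins and is aimed at data that is close to a max-linear model. The base-R sketch below simulates such data from a max-linear model with p = 2 independent unit Frechet factors. The coefficient matrix A, the sample size, and the independence of the factors are assumptions made for illustration. Because every row of A sums to 1, each simulated margin is again unit Frechet, which the sample medians roughly confirm.

```r
set.seed(1)  # fixed seed so the simulated sample is reproducible
n <- 20000

# Independent unit Frechet factors via inverse transform of F(z) = exp(-1 / z).
Z <- matrix(-1 / log(runif(n * 2)), n, 2)

# Hypothetical coefficient matrix with d = 3 rows and p = 2 columns.
# Rows sum to 1, which keeps the margins of X unit Frechet.
A <- matrix(c(0.7, 0.3,
              0.5, 0.5,
              0.1, 0.9), nrow = 3, byrow = TRUE)

# Max-linear model: X[i, j] = max(A[j, 1] * Z[i, 1], A[j, 2] * Z[i, 2]).
X <- sapply(seq_len(nrow(A)), function(j) {
  pmax(A[j, 1] * Z[, 1], A[j, 2] * Z[, 2])
})

# The median of a unit Frechet distribution is 1 / log(2).
col_medians <- apply(X, 2, median)
col_medians
```

A matrix such as X could then be passed as data to max_stable_prcomp with p = 2. Real data would first need its margins transformed.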
The max-matrix product C of A and B is obtained by calculating the entries with

$$C_{ij} = \max_{r = 1, \ldots, k} A_{ir} B_{rj}$$

for appropriate dimensions. Note that this operation is particularly useful when working with multivariate extreme value distributions: if the margins are standardized to standard Fréchet margins, then the max-matrix product of a matrix A and a multivariate extreme value distribution Z with standard Fréchet margins has the same margins up to scaling.
maxmatmul(A, B)
A |
a non-negative array of dim n, k |
B |
a non-negative array of dim k, l |
A non-negative array of dim n, l. The entries are given by the maximum of the componentwise products of the rows of A and the columns of B.
    # Set up example matrices
    A <- matrix(c(1,2,3,4,5,6), 2, 3)
    B <- matrix(c(1,2,1,2,1,2), 3, 2)

    # calling the function
    m1 <- maxmatmul(A, B)

    # can be used for matrix-vector multiplication as well
    v <- c(7,4,7)
    m2 <- maxmatmul(A, v)
    m3 <- maxmatmul(v,v)
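To make the definition concrete, here is a minimal base-R sketch of the same operation applied to the matrices A and B from the example. The function name max_mat_prod is a stand-in introduced for illustration; it is not the package's implementation and handles only matrix inputs.

```r
# Entry (i, j) is the maximum of the componentwise product of
# row i of A and column j of B.
max_mat_prod <- function(A, B) {
  stopifnot(is.matrix(A), is.matrix(B), ncol(A) == nrow(B))
  out <- matrix(0, nrow(A), ncol(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(ncol(B))) {
      out[i, j] <- max(A[i, ] * B[, j])
    }
  }
  out
}

A <- matrix(c(1, 2, 3, 4, 5, 6), 2, 3)  # rows: (1, 3, 5) and (2, 4, 6)
B <- matrix(c(1, 2, 1, 2, 1, 2), 3, 2)  # columns: (1, 2, 1) and (2, 1, 2)

C <- max_mat_prod(A, B)
C
# rows: (6, 10) and (8, 12)
```

For instance, the top-left entry is max(1 * 1, 3 * 2, 5 * 1) = 6, where an ordinary matrix product would give the sum 12.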
Map the data to the reconstruction given by the fit of the max_stable_prcomp function. This is done by taking the max-matrix product of the data and the reconstruction matrix from the fit.
reconstruct(fit, data)
fit |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
data |
array with same number of columns as the data of the fit object. |
An array of shape nrow(data), ncol(data) giving the reconstruction of the data from its encoded representation in p components.
max_stable_prcomp(), maxmatmul()
    # generate some data with the desired margins
    dat <- matrix(evd::rfrechet(300), 100, 3)
    maxPCA <- max_stable_prcomp(dat, 2)

    # look at summary to obtain further information about
    # loadings, the space spanned and loss function
    summary(maxPCA)

    # transform data to compressed representation
    # for a representation that is p-dimensional,
    # preserves the max-stable structure and is numeric solution to
    # optimal reconstruction.
    compr <- compress(maxPCA, dat)

    # For visual examination reconstruct original vector from compressed representation
    rec <- reconstruct(maxPCA, dat)
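The sketch below illustrates how a reconstruction can be viewed as a single max-matrix product. It assumes that the reconstruction matrix equals the max-matrix product of the decoder and encoder matrices; this is not confirmed here for the package internals. The helper max_mat_prod and the matrices H, B and x are hypothetical values chosen for illustration.

```r
# Base-R stand-in for a max-matrix product of two non-negative matrices.
max_mat_prod <- function(A, B) {
  stopifnot(ncol(A) == nrow(B))
  out <- matrix(0, nrow(A), ncol(B))
  for (j in seq_len(ncol(B))) {
    out[, j] <- apply(A, 1, function(a) max(a * B[, j]))
  }
  out
}

# Hypothetical encoder of shape (p, d) and decoder of shape (d, p), p = 2, d = 3.
H <- matrix(c(0.5, 0.3, 0.0,
              0.0, 0.2, 0.9), nrow = 2, byrow = TRUE)
B <- matrix(c(1.0, 0.0,
              0.6, 0.2,
              0.0, 1.0), nrow = 3, byrow = TRUE)

# Assumed reconstruction matrix of shape (d, d).
R <- max_mat_prod(B, H)

# Two hypothetical observations, one per row.
x <- matrix(c( 2, 1, 4,
              10, 5, 1), nrow = 2, byrow = TRUE)

# One-step reconstruction versus encoding followed by decoding.
rec_direct   <- max_mat_prod(x, t(R))
rec_two_step <- max_mat_prod(max_mat_prod(x, t(H)), t(B))
all.equal(rec_direct, rec_two_step)
```

Both routes agree because the max-matrix product is associative for non-negative entries, so the reconstruction keeps the shape of the input data.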
Print summary of a max_stable_prcomp object.
    ## S3 method for class 'max_stable_prcomp'
    summary(object, ...)
object |
max_stable_prcomp object. Data should be assumed to follow the same distribution as the data used in max_stable_prcomp. |
... |
additional unused arguments. |
Same as base::print().
Since the dataset is intended to be transformed for PCA, this function takes a dataset transformed_data and transforms the margins to the marginal distribution of the dataset orig_data.
transform_orig_margins(transformed_data, orig_data)
transformed_data |
arraylike data of dimension n, d |
orig_data |
arraylike data of dimension n, d |
array of dimension n, d with transformed columns of transformed_data that follow approximately the same marginal distribution as orig_data.
max_stable_prcomp(), transform_unitfrechet(), mev::fit.gev() for information about why to transform data
    # create a sample
    dat <- rnorm(1000)
    transformed_dat <- transform_unitpareto(dat)
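One way to picture the back-transformation is through empirical quantiles. The base-R sketch below is an assumption about the general idea, not the package's code: to_unit_frechet and back_to_orig are hypothetical helpers, the input is taken to have unit Frechet margins, and the interpolation used by the package may differ.

```r
# Hypothetical forward transform: rank-based pseudo-uniforms in (0, 1),
# pushed through the unit Frechet quantile function.
to_unit_frechet <- function(x) {
  u <- rank(x) / (length(x) + 1)
  -1 / log(u)
}

# Hypothetical back transform: map unit Frechet values to probabilities
# with F(z) = exp(-1 / z), then read off empirical quantiles of orig.
back_to_orig <- function(z, orig) {
  as.numeric(stats::quantile(orig, probs = exp(-1 / z), names = FALSE))
}

set.seed(123)
orig <- rnorm(200)
z    <- to_unit_frechet(orig)
back <- back_to_orig(z, orig)

# The round trip stays within the observed range and tracks the original values.
range(back)
cor(back, orig)
```

Since the mapping only uses the empirical distribution of orig, the back-transformed values cannot leave the observed range of the original data.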
Transforms columns of dataset to unit Frechet margins, to ensure
the theoretical requirements are satisfied for the application of
max_stable_prcomp using the empirical distribution function.
transform_unitfrechet(data)
data |
array or vector with the data which columns are to be transformed |
array or vector of same shape and type as data with the transformed data with unit Frechet margins.
max_stable_prcomp(), transform_orig_margins(), mev::fit.gev() for information about why to transform data.
    # sample some data
    dat <- rnorm(1000)
    transformed_dat <- transform_unitfrechet(dat)

    # Look at a plot of distribution
    boxplot(transformed_dat)
    plot(stats::ecdf(transformed_dat))
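A transformation based on the empirical distribution function can be sketched in base R as follows. The helper to_unit_frechet and the plotting position rank / (n + 1) are assumptions for illustration; the package may use a different convention.

```r
# Hypothetical rank-based transform to unit Frechet margins.
to_unit_frechet <- function(x) {
  # Empirical distribution function, rescaled to stay strictly inside (0, 1).
  u <- rank(x) / (length(x) + 1)
  # Quantile function of the unit Frechet distribution F(z) = exp(-1 / z).
  -1 / log(u)
}

set.seed(42)
dat <- rnorm(1000)
z <- to_unit_frechet(dat)

# The sample median should be close to the unit Frechet median 1 / log(2).
median(z)
```

For a matrix, the same helper could be applied column by column, for example with apply(data, 2, to_unit_frechet). The transform is monotone, so the ordering of the observations within a column is unchanged.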
Transforms columns of dataset to unit Pareto margins, to ensure
the theoretical requirements are satisfied for the application of
max_stable_prcomp using the empirical distribution function.
transform_unitpareto(data)
data |
array or vector with the data which columns are to be transformed |
array or vector of same shape and type as data with the transformed data with unit Pareto margins.
max_stable_prcomp(), transform_orig_margins(), mev::fit.gev() for information about why to transform data.
    # sample some data
    dat <- rnorm(1000)
    transformed_dat <- transform_unitpareto(dat)

    # Look at a plot of distribution
    boxplot(transformed_dat)
    plot(stats::ecdf(transformed_dat))
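The unit Pareto and unit Frechet transformations differ only in the quantile function applied to the pseudo-uniform values. The base-R sketch below compares the two. The helper names and the plotting position rank / (n + 1) are assumptions for illustration, not the package's implementation.

```r
# Hypothetical rank-based pseudo-uniforms strictly inside (0, 1).
pseudo_uniform <- function(x) rank(x) / (length(x) + 1)

# Unit Pareto quantile function: F(z) = 1 - 1 / z for z >= 1.
to_unit_pareto <- function(x) 1 / (1 - pseudo_uniform(x))

# Unit Frechet quantile function: F(z) = exp(-1 / z) for z > 0.
to_unit_frechet <- function(x) -1 / log(pseudo_uniform(x))

set.seed(7)
dat <- rnorm(1000)
z_pareto  <- to_unit_pareto(dat)
z_frechet <- to_unit_frechet(dat)

# Unit Pareto values start at 1, unit Frechet values start at 0.
min(z_pareto)
min(z_frechet)

# For large observations the two scales are nearly proportional.
tail(sort(z_pareto) / sort(z_frechet), 3)
```

The two transforms behave almost identically in the upper tail and differ mainly for small observations, where the Pareto scale is bounded below by 1.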