| Title: | Detect Sensitive Points in the Tail |
|---|---|
| Description: | The goal of 'TailID' is to detect sensitive points in the tail of a dataset using techniques from Extreme Value Theory (EVT). It utilizes the Generalized Pareto Distribution (GPD) for assessing tail behavior and detecting inconsistent points with the Identical Distribution hypothesis of the tail. For more details see Manau (2025)<doi:10.4230/LIPIcs.ECRTS.2025.20>. |
| Authors: | Blau Manau [aut, cre] (ORCID: <https://orcid.org/0009-0007-7227-4448>) |
| Maintainer: | Blau Manau <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.0.0 |
| Built: | 2026-05-12 07:13:59 UTC |
| Source: | https://github.com/cran/TailID |
This function selects the tail candidates that may be inconsistent with the identical distribution (ID) hypothesis.
candidate_selection(sample, pc_max, pc_min)
| sample | A numeric vector. |
|---|---|
| pc_max | A number between pm_max and 1 indicating the threshold of maximum sensitive points to consider. |
| pc_min | A number between pm_min and 1 indicating the threshold of minimum sensitive points to consider. |

A vector of indices corresponding to the detected sensitive points.
candidate_selection(rnorm(1000), 0.99, 0.99)
candidate_selection(c(rnorm(10^3,10,1),rnorm(10,20,3)), 0.9, 0.9)
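To make the role of a threshold such as pc_max more concrete, the base-R sketch below marks the observations that lie above an empirical quantile of the sample. It assumes the threshold can be read as a quantile level. The helper name `upper_tail_indices` is hypothetical, and this is not necessarily the selection rule that candidate_selection implements.

```r
# Hypothetical helper (not part of TailID): indices of the observations that
# lie strictly above the empirical quantile of level `level`.
upper_tail_indices <- function(sample, level) {
  stopifnot(is.numeric(sample), level > 0, level < 1)
  cutoff <- quantile(sample, probs = level, names = FALSE)
  which(sample > cutoff)
}

set.seed(1)
x <- c(rnorm(10^3, 10, 1), rnorm(10, 20, 3))
idx <- upper_tail_indices(x, 0.99)
length(idx)      # roughly 1% of the 1010 observations
range(x[idx])    # values of the points in the upper tail
```

Returning indices rather than values mirrors the documented return type, so the result can be used to subset the original sample.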
This function computes a confidence interval for the shape parameter of a GPD.
CI_shapeGPD(sample, threshold, parameter, conf_level)
| sample | A numeric vector. |
|---|---|
| threshold | A number between 0 and 1 indicating the threshold of extreme values to consider. |
| parameter | A number indicating the shape value. |
| conf_level | A number between 0 and 1 indicating the confidence level for the detection. |

A confidence interval vector.
CI_shapeGPD(rnorm(1000), 0.8, 1, 0.95)
CI_shapeGPD(c(rnorm(10^3,10,1),rnorm(10,20,3)), 0.8, 12, 0.9999)
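As an illustration of what a confidence interval for a GPD shape can look like, the sketch below builds a Wald-type interval from the textbook asymptotic variance of the maximum likelihood estimator, (1 + xi)^2 / n, which holds for xi > -1/2. The helper name `wald_ci_shape` and its arguments are hypothetical, and CI_shapeGPD may compute its interval differently.

```r
# Hypothetical helper (not part of TailID): Wald-type interval for a GPD shape
# value `xi` estimated from `n_exceed` exceedances. Assumes the asymptotic
# variance (1 + xi)^2 / n_exceed of the maximum likelihood estimator.
wald_ci_shape <- function(xi, n_exceed, conf_level) {
  stopifnot(xi > -0.5, n_exceed > 0, conf_level > 0, conf_level < 1)
  z <- qnorm(1 - (1 - conf_level) / 2)          # two-sided normal quantile
  half_width <- z * (1 + xi) / sqrt(n_exceed)
  c(lower = xi - half_width, upper = xi + half_width)
}

# Assumption: a threshold of 0.8 read as a quantile level on 1000 points
# leaves about 200 exceedances.
ci <- wald_ci_shape(xi = 0.1, n_exceed = 200, conf_level = 0.95)
ci
```

A higher confidence level widens the interval, which is why very high levels such as 0.9999 make a detection based on the interval more conservative.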
This function saves the plots corresponding to the TailID detection, which include the targeted candidates plot, the shape variation plot, and the inconsistent detected points.
plot_TailID(output_dir, sample, pm_max, pm_min, pc_max, pc_min, conf_level)
| output_dir | Path to save the plots. |
|---|---|
| sample | A numeric vector. |
| pm_max | A number between 0 and 1 indicating the threshold of maximum extreme values to consider. |
| pm_min | A number between 0 and 1 indicating the threshold of minimum extreme values to consider. |
| pc_max | A number between pm_max and 1 indicating the threshold of maximum sensitive points to consider. |
| pc_min | A number between pm_min and 1 indicating the threshold of minimum sensitive points to consider. |
| conf_level | A number between 0 and 1 indicating the confidence level for the detection. |

A vector of indices corresponding to the detected sensitive points.
output_dir <- file.path(tempdir(), "output")
if (dir.exists(output_dir) || dir.create(output_dir, recursive = TRUE)) {
  plot_TailID(output_dir, rnorm(1000), 0.85, 0.85, 0.999, 0.999, 0.95)
}
if (dir.exists(output_dir) || dir.create(output_dir, recursive = TRUE)) {
  plot_TailID(output_dir, c(rnorm(10^3, 10, 1), rnorm(10, 20, 3)), 0.85, 0.85, 0.99, 0.99, 0.99999)
}
This function detects the points of the tail that are inconsistent with the ID hypothesis by evaluating the shape variation of the GPD. It also returns the computed shape parameters and their confidence intervals.
shape_evaluation(sample, candidates, pm_max, pm_min, conf_level)
| sample | A numeric vector. |
|---|---|
| candidates | A list of indices of the sample. |
| pm_max | A number between 0 and 1 indicating the threshold of maximum extreme values to consider. |
| pm_min | A number between 0 and 1 indicating the threshold of minimum extreme values to consider. |
| conf_level | A number between 0 and 1 indicating the confidence level for the detection. |

A vector of indices corresponding to the detected sensitive points.
# Select candidates from the same sample that is evaluated, so the indices match.
x <- rnorm(1000)
shape_evaluation(x, candidate_selection(x, 0.99, 0.99), 0.8, 0.8, 0.95)
y <- c(rnorm(10^3,10,1),rnorm(10,20,3))
shape_evaluation(y, candidate_selection(y, 0.9, 0.9), 0.8, 0.8, 0.9999)
This function returns the points of the tail that are inconsistent with the ID hypothesis.
TailID(sample, pm_max, pm_min, pc_max, pc_min, conf_level)
| sample | A numeric vector. |
|---|---|
| pm_max | A number between 0 and 1 indicating the threshold of maximum extreme values to consider. |
| pm_min | A number between 0 and 1 indicating the threshold of minimum extreme values to consider. |
| pc_max | A number between pm_max and 1 indicating the threshold of maximum sensitive points to consider. |
| pc_min | A number between pm_min and 1 indicating the threshold of minimum sensitive points to consider. |
| conf_level | A number between 0 and 1 indicating the confidence level for the detection. |

A vector of indices corresponding to the detected sensitive points.
TailID(rnorm(1000), 0.8, 0.8, 0.99, 0.99, 0.95)
TailID(c(rnorm(10^3,10,1),rnorm(10,20,3)), 0.8, 0.8, 0.9, 0.9, 0.9999)
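Because the pc_* thresholds are documented relative to the pm_* thresholds, it can help to check the arguments before running the detection. The sketch below is one possible check, not part of the package. The helper name `check_tailid_args` is hypothetical, and treating the interval boundaries as inclusive is an assumption, since the documentation only says "between".

```r
# Hypothetical helper (not part of TailID): checks the documented ranges.
# Assumption: interval boundaries are treated as inclusive.
check_tailid_args <- function(pm_max, pm_min, pc_max, pc_min, conf_level) {
  between <- function(v, lo, hi) {
    is.numeric(v) && length(v) == 1 && !is.na(v) && v >= lo && v <= hi
  }
  if (!between(pm_max, 0, 1)) stop("pm_max must lie between 0 and 1")
  if (!between(pm_min, 0, 1)) stop("pm_min must lie between 0 and 1")
  if (!between(pc_max, pm_max, 1)) stop("pc_max must lie between pm_max and 1")
  if (!between(pc_min, pm_min, 1)) stop("pc_min must lie between pm_min and 1")
  if (!between(conf_level, 0, 1)) stop("conf_level must lie between 0 and 1")
  invisible(TRUE)
}

params <- list(pm_max = 0.8, pm_min = 0.8, pc_max = 0.99, pc_min = 0.99,
               conf_level = 0.95)
ok <- do.call(check_tailid_args, params)

# Run the detection only when the package is installed.
if (requireNamespace("TailID", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(1000)
  TailID::TailID(x, 0.8, 0.8, 0.99, 0.99, 0.95)
}
```

Calling `set.seed()` before drawing the sample keeps the detected indices reproducible between runs.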