Title: | An Implementation of the Typicality and Eccentricity Data Analysis Framework |
---|---|
Description: | The typicality and eccentricity data analysis (TEDA) framework was put forward by Angelov (2014) <DOI:10.14313/JAMRIS_2-2014/16>. It has since been developed into several different techniques, and provides a non-parametric way of determining how similar an observation, from a process that is not purely random, is to other observations generated by the same process. This package provides code to use the published batch and recursive TEDA methods. |
Authors: | David Ciar [cre, aut], James Wright [aut] |
Maintainer: | David Ciar <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.1 |
Built: | 2024-10-31 21:24:38 UTC |
Source: | CRAN |
Takes a tedab object and plots each metric individually
## S3 method for class 'tedab'
plot(x, ...)
x |
The teda batch (tedab) object with which to create the plot output. |
... |
Additional arguments affecting the plot produced. |
Takes a tedab object and creates four plots in order of: eccentricity, typicality, normalised eccentricity, and normalised typicality.
Takes a tedab object and prints out the values within
## S3 method for class 'tedab'
print(x, ...)
x |
The teda batch (tedab) object with which to create the printed output. |
... |
Additional arguments affecting the printed output. |
Takes a tedab object and prints out each vector in order of: eccentricity, typicality, normalised eccentricity, and normalised typicality.
Takes a tedar object and prints out the object values.
## S3 method for class 'tedar'
print(x, ...)
x |
The teda recursive (tedar) object with which to create the print output. |
... |
Additional arguments affecting the printed output. |
Takes a tedar object and prints out the values within (currently the same output as summary).
Summarises the teda batch object using an S3 method
## S3 method for class 'tedab'
summary(object, ...)
object |
The teda batch (tedab) object with which to create the summary output. |
... |
additional arguments affecting the summary produced. |
Takes a tedab object and prints out the following summary details:
the number of observations
the number of observations that exceed the normalised eccentricity limit
the normalised eccentricity threshold
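The three summary quantities can be illustrated with a short base R sketch. It assumes the Chebyshev-style limit described in the TEDA literature, in which a normalised eccentricity above (n^2 + 1) / (2k) marks an outlier, and it takes n = 3 standard deviations. The function name `summary_sketch`, the `n_sigma` argument, the element names, and the choice n = 3 are assumptions made for illustration and may differ from what the package does internally.

```r
# Hypothetical helper (not part of the teda package): derive the three
# summary quantities from a vector of normalised eccentricity values.
summary_sketch <- function(norm_ecc, n_sigma = 3) {
  k <- length(norm_ecc)
  # Assumed limit: (n^2 + 1) / (2k), with n the number of standard deviations.
  threshold <- (n_sigma^2 + 1) / (2 * k)
  list(
    n_observations = k,
    n_exceeding = sum(norm_ecc > threshold),
    threshold = threshold
  )
}

# Nineteen identical readings followed by one extreme value.
obs <- c(rep(10, 19), 100)
k <- length(obs)
pop_var <- mean((obs - mean(obs))^2)

# Eccentricity under the Euclidean distance (assumed formula), then halved
# to give the normalised eccentricity.
ecc <- 1 / k + (obs - mean(obs))^2 / (k * pop_var)
res <- summary_sketch(ecc / 2)
print(res)
```

In this sample only the final value exceeds the assumed limit, so the count of exceeding observations is one.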
Takes a tedar object and prints out the summary values.
## S3 method for class 'tedar'
summary(object, ...)
object |
The teda recursive (tedar) object with which to create the summary output. |
... |
additional arguments affecting the summary produced. |
Takes a tedar object and prints out the summary values.
The package provides functions to calculate both the batch and recursive typicality and eccentricity values of given observations.
TEDA provides a non-parametric technique to determine how eccentric or typical an observation is with respect to the other observations generated by the same process. It is available either as a batch function working over a whole dataset, or as a recursive single-pass function that takes the previous mean and variance values as arguments.
Both the batch and recursive methods return an object (of class tedab or tedar respectively) for which print and summary methods are provided. The batch object also has a plot method.
Further work will implement more of the analytical framework built up around TEDA, such as clustering algorithms.
Angelov, P., 2014. Outside the box: an alternative data analytics framework. Journal of Automation Mobile Robotics and Intelligent Systems, 8(2), pp.29-35. DOI: 10.14313/JAMRIS_2-2014/16
Bezerra, C.G., Costa, B.S.J., Guedes, L.A. and Angelov, P.P., 2016, May. A new evolving clustering algorithm for online data streams. In Evolving and Adaptive Intelligent Systems (EAIS), 2016 IEEE Conference on (pp. 162-168). IEEE. DOI: 10.1109/EAIS.2016.7502508
Takes a vector of observations and returns a teda batch object, which holds the eccentricity and typicality values in both their original and normalised forms.
teda_b(observations, dist_type = "Euclidean")
observations |
A vector of numeric observations |
dist_type |
A string representing the distance metric to use, default value (and currently only supported value) is "Euclidean" |
Uses the algorithm from Angelov (2014) to create a teda batch object. This contains a vector for the eccentricity (standard and normalised), typicality (standard and normalised), the outlier threshold, and whether each observation is or is not an outlier. Also provides the original vector of values.
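As a rough illustration of what those vectors contain, the following base R sketch computes them for a univariate sample. It assumes the Euclidean-distance form of eccentricity, 1/k + (x - mean)^2 / (k * variance) with the population variance, typicality as one minus eccentricity, and normalisation by 2 and by k - 2 respectively. The function name `teda_batch_sketch` and the returned element names are hypothetical, and this is not the package's source code.

```r
# Hypothetical base R sketch of the batch calculation (univariate only).
teda_batch_sketch <- function(observations) {
  k <- length(observations)
  stopifnot(is.numeric(observations), k > 2)

  mu <- mean(observations)
  # Population variance (divides by k, unlike stats::var which uses k - 1).
  pop_var <- mean((observations - mu)^2)
  stopifnot(pop_var > 0)

  ecc <- 1 / k + (observations - mu)^2 / (k * pop_var)
  typ <- 1 - ecc

  list(
    eccentricity = ecc,
    typicality = typ,
    norm_eccentricity = ecc / 2,
    norm_typicality = typ / (k - 2)
  )
}

sketch <- teda_batch_sketch(c(20, 12, 10))

# Under these assumptions the eccentricities sum to 2, so both
# normalised vectors sum to 1.
sum(sketch$eccentricity)
sum(sketch$norm_typicality)
```

The fixed sums are what allow the normalised values to be compared across samples of different sizes.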
The teda batch object
Angelov, P., 2014. Outside the box: an alternative data analytics framework. Journal of Automation Mobile Robotics and Intelligent Systems, 8(2), pp.29-35. DOI: 10.14313/JAMRIS_2-2014/16
teda_r for the recursive version of the TEDA framework.
Other TEDA.functions: teda_r
vec = c(20, 12, 10)
teda_b(vec)
# same as
a = teda_b(vec, "Euclidean")
summary(a)
plot(a)
A recursive method that takes the state variables of previous mean, previous variance, and the current timestep position, along with the current observation. It returns a teda recursive object. Currently only a univariate implementation.
teda_r(curr_observation, previous_mean = curr_observation, previous_var = 0, k = 1, dist_type = "Euclidean")
curr_observation |
A single observation, the most recent in a series |
previous_mean |
The mean value returned by the previous call to this function, if no previous calls, default value is used. |
previous_var |
The variance value returned by the previous call to this function, if no previous calls, default value is used. |
k |
The count of observations processed by the recursive function, including the current observation |
dist_type |
A string representing the distance metric to use, default value (and currently only supported value) is "Euclidean" |
The function has two intended modes of use. On the first pass it takes only the observation value as a parameter, and the remaining arguments fall back to their defaults. On every later pass it takes the current observation, the mean and variance returned by the previous call, and the current k (the number of observations, including the current one).
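The two calling patterns can be combined in a loop over a stream of values. This sketch assumes the teda package is installed and relies only on the `curr_mean`, `curr_var` and `next_k` elements that appear in the package's own example; the wrapper name `stream_teda` is hypothetical.

```r
# Hypothetical wrapper: feed a vector through teda_r one value at a time.
stream_teda <- function(observations) {
  stopifnot(length(observations) >= 1)

  # First pass: only the observation is supplied, defaults cover the rest.
  state <- teda::teda_r(observations[1])

  # Later passes: feed back the mean, variance and k from the previous call.
  for (x in observations[-1]) {
    state <- teda::teda_r(x, state$curr_mean, state$curr_var, state$next_k)
  }
  state
}

# Only run the example when the package is available.
if (requireNamespace("teda", quietly = TRUE)) {
  final_state <- stream_teda(c(20, 12, 10, 20))
  summary(final_state)
}
```

Only the state returned by the most recent call needs to be kept, which is what makes the method suitable for a single pass over streaming data.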
On return, the teda recursive object holds:
the current observation
the current mean
the current variance
the current observation's eccentricity
the current observation's typicality
the current observation's normalised eccentricity
the current observation's normalised typicality
whether the current observation is an outlier
the current outlier threshold
the next timestep value, k+1
It provides generic functions for print and summary; at present both produce the same output.
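The recursive idea can also be illustrated without the package: the mean and variance are updated from their previous values, and the eccentricity of the newest observation follows from them. This base R sketch assumes the update rules given in the TEDA literature (population variance, Euclidean distance). The function name `teda_recursive_sketch` and its element names are hypothetical and need not match the package internals.

```r
# Hypothetical base R sketch of one recursive update step (univariate).
teda_recursive_sketch <- function(x, prev_mean = x, prev_var = 0, k = 1) {
  # Running mean over the k observations seen so far.
  curr_mean <- ((k - 1) / k) * prev_mean + x / k

  # Running population variance; a single observation has zero variance.
  curr_var <- if (k == 1) {
    0
  } else {
    ((k - 1) / k) * prev_var + (x - curr_mean)^2 / (k - 1)
  }

  # Eccentricity is undefined while the variance is zero, so return NA then.
  ecc <- if (curr_var > 0) {
    1 / k + (curr_mean - x)^2 / (k * curr_var)
  } else {
    NA_real_
  }

  list(curr_mean = curr_mean, curr_var = curr_var,
       eccentricity = ecc, next_k = k + 1)
}

obs <- c(20, 12, 10, 20)
state <- teda_recursive_sketch(obs[1])
for (x in obs[-1]) {
  state <- teda_recursive_sketch(x, state$curr_mean, state$curr_var, state$next_k)
}

# The running values agree with the whole-sample statistics.
c(state$curr_mean, mean(obs))
c(state$curr_var, mean((obs - mean(obs))^2))
```

Because the updates are exact, the final mean and variance match those computed over the whole vector at once.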
The teda recursive object
Bezerra, C.G., Costa, B.S.J., Guedes, L.A. and Angelov, P.P., 2016, May. A new evolving clustering algorithm for online data streams. In Evolving and Adaptive Intelligent Systems (EAIS), 2016 IEEE Conference on (pp. 162-168). IEEE. DOI: 10.1109/EAIS.2016.7502508
Other TEDA.functions: teda_b
vec = c(20, 12, 10, 20)
a = teda_r(vec[1])
b = teda_r(vec[2], a$curr_mean, a$curr_var, a$next_k)
c = teda_r(vec[3], b$curr_mean, b$curr_var, b$next_k)
d = teda_r(vec[4], c$curr_mean, c$curr_var, c$next_k)
summary(d)