Package 'DTWBI'

Title: Imputation of Time Series Based on Dynamic Time Warping
Description: Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference).
Authors: Camille Dezecache, T. T. Hong Phan, Emilie Poisson-Caillault
Maintainer: Emilie Poisson-Caillault <[email protected]>
License: GPL (>= 2)
Version: 1.1
Built: 2024-12-07 06:36:39 UTC
Source: CRAN

Help Index


Imputation of Time Series Based on Dynamic Time Warping

Description

Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference).

Details

Index of help topics:

DTWBI-package           Imputation of Time Series Based on Dynamic Time
                        Warping
DTWBI_univariate        DTWBI algorithm for univariate signals
compute.fa2             FA2
compute.fb              Fractional Bias (FB)
compute.fsd             Fraction of Standard Deviation (FSD)
compute.nmae            Normalized Mean Absolute Error (NMAE)
compute.rmse            Root Mean Square Error (RMSE)
compute.sim             Similarity
dataDTWBI               Six univariate signals as example for DTWBI
                        package
dist_afbdtw             Adaptive Feature Based Dynamic Time Warping
                        algorithm
gapCreation             Gap creation
local.derivative.ddtw   Local derivative estimate to compute DDTW
minCost                 DTW-based methods for univariate signals

Author(s)

Camille Dezecache, T. T. Hong Phan, Emilie Poisson-Caillault

Maintainer: Emilie Poisson-Caillault <[email protected]>

References

Thi-Thu-Hong Phan, Emilie Poisson-Caillault, Alain Lefebvre, Andre Bigand. Dynamic time warping-based imputation for univariate time series data. Pattern Recognition Letters, Elsevier, 2017, <DOI:10.1016/j.patrec.2017.08.019>. <hal-01609256>

Examples

# Load package dataset
data(dataDTWBI)

# Create a query and a reference signal
query <- dataDTWBI$query
ref <- dataDTWBI$query

# Create a gap within query (10% of signal size)
query <- gapCreation(query, rate = 0.1)
data <- query$output_vector
begin_gap <- query$begin_gap
size_gap <- query$gap_size

# Fill gap using DTWBI algorithm
results_DTWBI <- DTWBI_univariate(data, t_gap = begin_gap, T_gap = size_gap)

# Plot
plot(ref, type = "l")
lines(results_DTWBI$output_vector, col = "red", lty = "dashed")

# Compute the similarity of imputed vector and reference
compute.sim(Y = results_DTWBI$output_vector, X = ref)

FA2

Description

Estimates the FA2 of two univariate signals Y (imputed values) and X (true values).

Usage

compute.fa2(Y, X, verbose = F)

Arguments

Y

vector of imputed values

X

vector of true values

verbose

if TRUE, print advice about the quality of the model

Details

This function returns the FA2 of two vectors corresponding to univariate signals X (true values) and Y (imputed values). FA2 is the proportion of pairs of values (x_i, y_i) satisfying the condition 0.5 <= y_i / x_i <= 2. The closer FA2 is to 1, the more accurate the imputation model. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
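
The FA2 condition above can be sketched in base R as follows. This is an illustrative re-implementation, not the package source: the helper name fa2_sketch is hypothetical, and the handling of zero pairs assumes the convention described in the Examples section (a pair of zeros counts as satisfying the condition).

```r
# Hypothetical helper: proportion of pairs with 0.5 <= Y_i / X_i <= 2.
fa2_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  X <- X[keep]
  Y <- Y[keep]
  ratio <- Y / X                       # 0 / 0 gives NaN, handled below
  ok <- !is.nan(ratio) & ratio >= 0.5 & ratio <= 2
  ok[X == 0 & Y == 0] <- TRUE          # assumed convention for zero pairs
  mean(ok)
}

fa2_sketch(Y = c(1, 2, 5, 0), X = c(1, 1, 1, 0))  # 3 of 4 pairs qualify: 0.75
```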

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fa2(Y,X)
compute.fa2(Y,X, verbose = TRUE)

# By definition, if pairs of true and imputed values are zero,
# FA2 corresponding to this pair of values equals 1.
X[1] <- 0
Y[1] <- 0
compute.fa2(Y,X)

Fractional Bias (FB)

Description

Estimates the Fractional Bias (FB) of two univariate signals Y (imputed values) and X (true values).

Usage

compute.fb(Y, X, verbose = F)

Arguments

Y

vector of imputed values

X

vector of true values

verbose

if TRUE, print advice about the quality of the model

Details

This function returns the FB of two vectors corresponding to univariate signals, indicating whether predicted values are underestimated or overestimated compared to true values. A perfect imputation model gives FB = 0. An acceptable imputation model gives FB <= 0.3. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
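
As a rough illustration, FB can be sketched in base R. The formula below is an assumption, since this help page does not state it: it uses the absolute-value form 2 * |mean(Y) - mean(X)| / |mean(Y) + mean(X)|, which is consistent with the FB = Inf case shown in the Examples section. The helper name fb_sketch is hypothetical and is not part of the package.

```r
# Hypothetical helper; the exact formula used by compute.fb() is assumed.
fb_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  X <- X[keep]
  Y <- Y[keep]
  if (all(X == 0) && all(Y == 0)) {
    return(0)                          # convention: both vectors constant at zero
  }
  2 * abs((mean(Y) - mean(X)) / (mean(Y) + mean(X)))
}

fb_sketch(Y = c(2, 4), X = c(1, 3))    # means 3 and 2: 2 * (1 / 5) = 0.4
fb_sketch(Y = rep(-0.5, 3), X = rep(0.5, 3))  # opposite values: Inf
```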

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fb(Y,X)
compute.fb(Y,X, verbose = TRUE)

# If mean(X)=mean(Y)=0, it is impossible to estimate FB,
# unless both true and imputed values vectors are constant.
# By definition, in this case, FB = 0.
X <- rep(0, 10) ; Y <- rep(0, 10)
compute.fb(Y,X)

# If true and imputed values are nonzero and opposite (Y = -X), FB = Inf.
X <- rep(runif(1), 10)
Y <- -X
compute.fb(Y,X)

Fraction of Standard Deviation (FSD)

Description

Estimates the Fraction of Standard Deviation (FSD) of two univariate signals Y (imputed values) and X (true values).

Usage

compute.fsd(Y, X, verbose = F)

Arguments

Y

vector of imputed values

X

vector of true values

verbose

if TRUE, print advice about the quality of the model

Details

This function returns the FSD of two vectors corresponding to univariate signals. Values of FSD closer to zero indicate better imputation performance. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
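
A minimal base R sketch of FSD follows. The formula is an assumption (2 * |sd(Y) - sd(X)| / (sd(X) + sd(Y))), chosen because it matches the special cases listed in the Examples section; the helper name fsd_sketch is hypothetical.

```r
# Hypothetical helper; the exact formula used by compute.fsd() is assumed.
fsd_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  X <- X[keep]
  Y <- Y[keep]
  sx <- sd(X)
  sy <- sd(Y)
  if (sx == 0 && sy == 0) {
    if (all(X == Y)) return(0)         # equal constant vectors: FSD = 0
    stop("FSD is not calculable for two different constant vectors")
  }
  2 * abs(sy - sx) / (sx + sy)
}

fsd_sketch(Y = c(0, 2, 4), X = c(1, 2, 3))  # sd(Y) = 2, sd(X) = 1: 2/3
```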

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.fsd(Y,X)
compute.fsd(Y,X, verbose = TRUE)

# By definition, if true and imputed values are equal and constant,
# FSD = 0.
X <- rep(runif(1), 10)
Y <- X
compute.fsd(Y,X)

# However, if true and imputed values are constant but different,
# FSD is not calculable. An error is displayed.
## Not run: 
X <- rep(runif(1), 10);Y <- rep(runif(1), 10)
compute.fsd(Y,X)
## End(Not run)

Normalized Mean Absolute Error (NMAE)

Description

Estimates the Normalized Mean Absolute Error of two univariate signals Y (imputed values) and X (true values).

Usage

compute.nmae(Y, X)

Arguments

Y

vector of imputed values

X

vector of true values

Details

This function returns the NMAE of two vectors corresponding to univariate signals. A lower NMAE value (NMAE in [0, Inf]) indicates better imputation performance. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
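
The sketch below illustrates one plausible reading of NMAE in base R. It assumes that the mean absolute error is normalized by the range of the true values, max(X) - min(X), which would explain why a constant X leads to NMAE = Inf in the Examples section. The helper name nmae_sketch is hypothetical.

```r
# Hypothetical helper; normalization by range(X) is an assumption.
nmae_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  X <- X[keep]
  Y <- Y[keep]
  mae <- mean(abs(Y - X))
  rng <- max(X) - min(X)
  if (rng == 0) {
    if (mae == 0) return(0)            # identical vectors: NMAE = 0
    warning("constant true values: returning MAE instead of NMAE")
    return(mae)
  }
  mae / rng
}

nmae_sketch(Y = c(1, 3, 5), X = c(1, 2, 3))  # MAE = 1, range = 2: 0.5
```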

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.nmae(Y,X)

# If the true values form a constant vector, NMAE = Inf.
# A warning is displayed and MAE is estimated instead of NMAE,
# unless true and imputed values are equal. In that case,
# by definition, NMAE = 0.
X <- rep(0, 10)
Y <- runif(10)
compute.nmae(Y,X) # MAE computed
Y <- X
compute.nmae(Y,X) # By definition, NMAE = 0

Root Mean Square Error (RMSE)

Description

Estimates the Root Mean Square Error of two univariate signals Y (imputed values) and X (true values).

Usage

compute.rmse(Y, X)

Arguments

Y

vector of imputed values

X

vector of true values

Details

This function returns the RMSE of two vectors corresponding to univariate signals. A lower RMSE value (RMSE in [0, Inf]) indicates better imputation performance. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
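
For reference, the standard RMSE definition, sqrt(mean((Y - X)^2)), can be written in base R as below. The helper name rmse_sketch is hypothetical, and the NA handling mirrors the description above rather than the package source.

```r
# Hypothetical helper implementing the textbook RMSE definition.
rmse_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  sqrt(mean((Y[keep] - X[keep])^2))
}

rmse_sketch(Y = c(1, 2, 6), X = c(1, 2, 3))  # squared errors 0, 0, 9: sqrt(3)
```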

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.rmse(Y,X)

Similarity

Description

Estimates the percentage of similarity of two univariate signals Y (imputed values) and X (true values).

Usage

compute.sim(Y, X)

Arguments

Y

vector of imputed values

X

vector of true values

Details

This function returns the similarity of two vectors corresponding to univariate signals. A higher similarity (Similarity in [0, 1]) indicates a more accurate method for completing missing values in univariate datasets. Both vectors Y and X must be of equal length; otherwise an error is raised. Any NA values in either input vector are excluded, with a warning displayed.
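
A possible base R reading of this similarity measure is sketched below. The formula is an assumption: the mean over all points of 1 / (1 + |y_i - x_i| / (max(X) - min(X))). The helper name sim_sketch and its treatment of a constant X (using the limiting value of each term) are hypothetical and may differ from compute.sim().

```r
# Hypothetical helper; formula and constant-X handling are assumptions.
sim_sketch <- function(Y, X) {
  stopifnot(length(Y) == length(X))
  keep <- !is.na(X) & !is.na(Y)        # drop pairs containing NA
  X <- X[keep]
  Y <- Y[keep]
  rng <- max(X) - min(X)
  if (rng == 0) {
    # Limiting case: a term is 1 when y_i equals x_i, and 0 otherwise.
    return(mean(Y == X))
  }
  mean(1 / (1 + abs(Y - X) / rng))
}

sim_sketch(Y = c(1, 2, 4), X = c(1, 2, 3))  # terms 1, 1, 2/3: mean = 8/9
```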

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
compute.sim(Y,X)

# By definition, if the true values form a constant vector
# and one or more imputed values are equal to the true values,
# similarity = 1.
X <- rep(2, 10)
Y <- X
compute.sim(Y,X)

Six univariate signals as example for DTWBI package

Description

Query and ref1 are two phase-shifted sigmoidal signals. Ref2 presents a linear decrease. Ref3 and ref4 are constant signals of value 3 and 0 respectively. Ref5 is similar to the query with small noise added.

Usage

dataDTWBI

Format

A data frame with six variables: query, ref1, ref2, ref3, ref4 and ref5.


Adaptive Feature Based Dynamic Time Warping algorithm

Description

This function estimates a distance matrix that is used as input to the dtw() function (package dtw) to align two univariate signals following the Adaptive Feature Based Dynamic Time Warping algorithm (AFBDTW).

Usage

dist_afbdtw(q, r, w1 = 0.5)

Arguments

q

query vector

r

reference vector

w1

weight of the local feature relative to the global feature. By default, w1 = 0.5, and by definition, w2 = 1 - w1.

Value

A list containing the following elements:

  • query: the query vector

  • response: the response vector

  • query_local: local feature of the query

  • response_local: local feature of the response vector

  • query_global: global feature of the query

  • response_global: global feature of the response vector

  • dist_local: distance matrix of the local feature

  • dist_global: distance matrix of the global feature

  • distAFBDTW: AFBDTW distance matrix
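
The role of w1 can be illustrated with a small base R sketch. It assumes that the final AFBDTW matrix is the weighted sum w1 * local + w2 * global with w2 = 1 - w1; the helper name combine_afbdtw and the toy matrices are hypothetical, and the feature extraction itself is not reproduced here.

```r
# Hypothetical helper: weighted combination of two distance matrices.
combine_afbdtw <- function(dist_local, dist_global, w1 = 0.5) {
  stopifnot(identical(dim(dist_local), dim(dist_global)),
            w1 >= 0, w1 <= 1)
  w2 <- 1 - w1                         # by definition, w2 = 1 - w1
  w1 * dist_local + w2 * dist_global
}

# Toy 2 x 2 matrices standing in for the local and global feature distances
d_local  <- matrix(c(0, 1, 1, 0), nrow = 2)
d_global <- matrix(c(0, 4, 4, 0), nrow = 2)
combine_afbdtw(d_local, d_global, w1 = 0.25)  # off-diagonal: 0.25 * 1 + 0.75 * 4 = 3.25
```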

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]
AFBDTW_Dist <- dist_afbdtw(X, Y)

DTWBI algorithm for univariate signals

Description

Imputes the values of a gap starting at position t_gap and of size T_gap in a univariate signal, based on the DTW algorithm. For more details on the method, see Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Default arguments of the dtw() function are used, but they can be set explicitly and modified.

Usage

DTWBI_univariate(data, t_gap, T_gap, DTW_method = "DTW",
  threshold_cos = NULL, step_threshold = NULL, thresh_cos_stop = 0.8, ...)

Arguments

data

input vector containing a large continuous gap (possibly derived from the local.derivative.ddtw() function)

t_gap

location of the beginning of the gap (for example, as returned by the gapCreation function)

T_gap

gap size (for example, as returned by the gapCreation function)

DTW_method

DTW method used for imputation ("DTW", "DDTW", "AFBDTW"). By default "DTW".

threshold_cos

cosine threshold used to select sequences similar to the query. By default, threshold_cos = 0.9995 if the sequence is longer than 10,000 points, and threshold_cos = 0.995 if it is shorter.

step_threshold

step used within the loop determining the threshold. By default, step_threshold = 50 if the sequence is longer than 10,000 points, and step_threshold = 10 if the sequence length is between 1,000 and 10,000. Otherwise, step_threshold = 2.

thresh_cos_stop

lowest cosine threshold accepted when searching for a window similar to the query. By default, thresh_cos_stop = 0.8.

...

additional arguments from the dtw() function
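
The length-dependent defaults described for threshold_cos and step_threshold can be summarized in a short base R sketch. The helper names default_thresholds and cosine_similarity are hypothetical, the behaviour exactly at the 1,000 and 10,000 boundaries is an assumption, and the cosine function only shows the kind of quantity the thresholds are compared against, not how the package computes it internally.

```r
# Hypothetical helper reproducing the documented default values.
default_thresholds <- function(n) {
  threshold_cos <- if (n > 10000) 0.9995 else 0.995
  step_threshold <- if (n > 10000) 50 else if (n >= 1000) 10 else 2
  list(threshold_cos = threshold_cos, step_threshold = step_threshold)
}

# Standard cosine similarity between two numeric vectors of equal length.
cosine_similarity <- function(a, b) {
  stopifnot(length(a) == length(b))
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

default_thresholds(5000)                     # threshold_cos 0.995, step_threshold 10
cosine_similarity(c(1, 2, 3), c(2, 4, 6))    # parallel vectors: 1
```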

Value

DTWBI_univariate returns a list containing the following elements:

  • output_vector: output vector containing complete data including the imputation proposal

  • input_vector: original vector used as input

  • query: the query i.e. the adjacent sequence to the gap

  • pos_query: index of the beginning and end of the query

  • sim_window: vector containing the values of the most similar sequence to the query

  • pos_sim_window: index of the beginning and end of the similar window

  • imputation_window: vector containing imputed values

  • pos_imp_window: index of the beginning and end of the imputation window

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1]

rate <- 0.1
output <- gapCreation(X, rate)
data <- output$output_vector
gap_begin <- output$begin_gap
gap_size <- output$gap_size
imputed_data <- DTWBI_univariate(data, t_gap=gap_begin, T_gap=gap_size)
plot(imputed_data$input_vector, type = "l", lwd = 2) # Incomplete signal
lines(imputed_data$output_vector, col = "red") # Imputed signal
lines(y = imputed_data$query,
      x = imputed_data$pos_query[1]:imputed_data$pos_query[2],
      col = "green", lwd = 4) # Query
lines(y = imputed_data$sim_window,
      x = imputed_data$pos_sim_window[1]:imputed_data$pos_sim_window[2],
      col = "orange", lwd = 4) # Similar sequence to the query
lines(y = imputed_data$imputation_window,
      x = imputed_data$pos_imp_window[1]:imputed_data$pos_imp_window[2],
      col = "blue", lwd = 4) # Imputation proposal

Gap creation

Description

This function creates a large continuous gap within a univariate signal. Gap size is defined as a fraction of the input vector length (for example, rate = 0.1 for 10%). By default, the created gap starts at a random location.

Usage

gapCreation(X, rate, begin = NULL)

Arguments

X

input vector

rate

size of the desired gap, as a fraction of the input vector size (for example, 0.1 for 10%)

begin

location of the beginning of the gap (random by default)

Value

gapCreation returns a list containing the following elements:

  • output_vector: output vector containing the created gap

  • input_vector: original vector used as input

  • begin_gap: index of the beginning of the gap

  • rate: size of the created gap as a fraction of the input vector length

  • gap_size: length of the created gap
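
The behaviour described above can be approximated in base R as follows. This is a sketch under stated assumptions, not the package code: the helper name gap_sketch is hypothetical, the gap length is assumed to be round(rate * length(X)), and the random start is assumed to be drawn so that the whole gap fits inside the vector.

```r
# Hypothetical helper mimicking the documented inputs and outputs.
gap_sketch <- function(X, rate, begin = NULL) {
  n <- length(X)
  gap_size <- round(rate * n)          # rounding rule is an assumption
  stopifnot(gap_size >= 1, gap_size < n)
  if (is.null(begin)) {
    begin <- sample.int(n - gap_size + 1, 1)  # random start, gap stays in range
  }
  out <- X
  out[begin:(begin + gap_size - 1)] <- NA     # consecutive missing values
  list(output_vector = out, input_vector = X,
       begin_gap = begin, rate = rate, gap_size = gap_size)
}

res <- gap_sketch(sin(seq(0, 10, length.out = 100)), rate = 0.1, begin = 20)
sum(is.na(res$output_vector))          # 10 missing values, at positions 20 to 29
```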

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1]
rate <- 0.1
output <- gapCreation(X, rate)
plot(output$input_vector, type = "l", col = "red", lwd = 2)
lines(output$output_vector, lty = "dashed", lwd = 2)

Local derivative estimate to compute DDTW

Description

This function estimates the local derivative of a vector. The result can be used as input to the dtw() function (package dtw) to align two univariate signals.

Usage

local.derivative.ddtw(X)

Arguments

X

input vector from which local derivative has to be calculated
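
For orientation, the derivative estimate commonly used in Derivative DTW (Keogh and Pazzani) averages the slope to the left neighbour and the slope across both neighbours. The base R sketch below assumes that estimate; the helper name ddtw_derivative_sketch and the boundary handling (copying the nearest interior value) are hypothetical and may differ from local.derivative.ddtw().

```r
# Hypothetical helper: d_i = ((x_i - x_{i-1}) + (x_{i+1} - x_{i-1}) / 2) / 2
ddtw_derivative_sketch <- function(X) {
  n <- length(X)
  stopifnot(n >= 3)
  i <- 2:(n - 1)                       # interior points only
  d <- numeric(n)
  d[i] <- ((X[i] - X[i - 1]) + (X[i + 1] - X[i - 1]) / 2) / 2
  d[1] <- d[2]                         # assumed boundary handling
  d[n] <- d[n - 1]
  d
}

ddtw_derivative_sketch(c(0, 1, 2, 3, 4))  # constant slope: 1 1 1 1 1
```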

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1]
local.derivative.ddtw(X)

# Plot
plot(X, type = "b", ylim = c(-1, 1))
lines(local.derivative.ddtw(X), col = "red")

DTW-based methods for univariate signals

Description

Finds the optimal alignment between two univariate time series based on DTW methods.

Usage

minCost(X, Y, method, ...)

Arguments

X

query vector

Y

response vector

method

"DTW", "DDTW", "AFBDTW", "DTW-D"

...

additional arguments from functions dtw or dist_afbdtw
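
To make the notion of an optimal alignment cost concrete, here is a minimal base R dynamic-programming sketch of plain DTW. It is not the implementation used by minCost(), which relies on the dtw package: the helper name dtw_cost_sketch is hypothetical, the local distance is the absolute difference, and the recursion uses the simplest symmetric step pattern, whereas dtw() defaults to its symmetric2 pattern.

```r
# Hypothetical helper: accumulated DTW cost between two numeric vectors.
dtw_cost_sketch <- function(X, Y) {
  n <- length(X)
  m <- length(Y)
  local <- abs(outer(X, Y, "-"))       # pointwise distance matrix
  acc <- matrix(Inf, n + 1, m + 1)     # padded accumulated-cost matrix
  acc[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      acc[i + 1, j + 1] <- local[i, j] +
        min(acc[i, j],                 # diagonal step
            acc[i, j + 1],             # step along X only
            acc[i + 1, j])             # step along Y only
    }
  }
  acc[n + 1, m + 1]
}

# The second signal repeats its first value, so warping aligns them exactly
dtw_cost_sketch(c(0, 1, 2, 1, 0), c(0, 0, 1, 2, 1, 0))  # 0
```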

Author(s)

Camille Dezecache, Hong T. T. Phan, Emilie Poisson-Caillault

Examples

data(dataDTWBI)
X <- dataDTWBI[, 1] ; Y <- dataDTWBI[, 2]

# Plot query and reference
plot(X, type = "l", ylim = c(-5,3))
lines(1:length(X), Y, col = "red")

#= Align signals using DTW
align_dtw <- minCost(X, Y, method = "DTW")
#= Align signals using DDTW
align_ddtw <- minCost(X, Y, method = "DDTW")
#= Align signals using AFBDTW
align_afbdtw <- minCost(X, Y, method = "AFBDTW")
#= Align signals using DTW-D
align_dtwd <- minCost(X, Y, method = "DTW-D")

#= Plots
library(dtw)
dtwPlotTwoWay(d = align_dtw, xts = X, yts = Y, main = "DTW")
dtwPlotTwoWay(d = align_ddtw, xts = X, yts = Y, main = "DDTW")
dtwPlotTwoWay(d = align_afbdtw, xts = X, yts = Y, main = "AFBDTW")
dtwPlotTwoWay(d = align_dtwd, xts = X, yts = Y, main = "DTW-D")

#= Compare cost of each method
comparative_cost <- matrix(c(align_dtw$normalizedDistance,
                             align_ddtw$normalizedDistance,
                             align_afbdtw$normalizedDistance,
                             align_dtwd$normalizedDistance), ncol = 4)
colnames(comparative_cost) <- c("DTW", "DDTW", "AFBDTW", "DTW-D")
comparative_cost