Title: | Generalized Spatial-Time Sequence Miner |
---|---|
Description: | Implementations of the algorithms presented in the article Generalized Spatial-Time Sequence Miner, original title (Castro, Antonio; Borges, Heraldo; Pacitti, Esther; Porto, Fabio; Coutinho, Rafaelli; Ogasawara, Eduardo. Generalização de Mineração de Sequências Restritas no Espaço e no Tempo. In: XXXVI SBBD - Simpósio Brasileiro de Banco de Dados, 2021 <doi:10.5753/sbbd.2021.17891>). |
Authors: | Antonio Castro [aut, cre], Cássio Souza [aut, ctb], Jorge Rodrigues [aut, ctb], Esther Pacitti [aut], Fábio Porto [aut], Florent Masseglia [aut], Rafaelli Coutinho [aut, ths], Eduardo Ogasawara [aut, ths], Federal Center for Technological Education of Rio de Janeiro [cph] |
Maintainer: | Antonio Castro <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2024-12-12 06:48:56 UTC |
Source: | CRAN |
S3 class definition for find method.
find(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
The goal is to find the Kernel Ranged Group information for a candidate c.
find_kernel_ranged_group(c, d, gamma, beta, adjacency_matrix)
c |
candidate |
d |
set of transactions |
gamma |
minimum temporal frequency |
beta |
minimum group size |
adjacency_matrix |
adjacency matrix |
Kernel Ranged-Group(s) of c updated
Default method for find. Does nothing.
## Default S3 method:
find(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
GSTSM implementation of the find method. The goal is to find the Ranged Groups information for a candidate c.
## S3 method for class 'gstsm'
find(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
Helper function that generates an adjacency matrix.
generate_adjacency_matrix(spatial_positions, sigma)
spatial_positions |
set of spatial positions |
sigma |
max distance between group points |
Adjacency Matrix
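As an illustration of what such a helper might compute, the following is a minimal sketch in base R, not the package's internal code. It assumes that positions arrive as a data frame with numeric x, y and z columns and a ponto label column (as in the package example), that distance is Euclidean, and that a point is not considered adjacent to itself. The name sketch_adjacency_matrix is hypothetical.

```r
# Hypothetical sketch; the real generate_adjacency_matrix() may differ.
sketch_adjacency_matrix <- function(spatial_positions, sigma) {
  # Assumption: coordinates live in numeric columns x, y, z.
  coords <- as.matrix(spatial_positions[, c("x", "y", "z")])
  distances <- as.matrix(dist(coords))  # pairwise Euclidean distances
  adjacency <- distances <= sigma       # TRUE where two points lie within sigma
  diag(adjacency) <- FALSE              # assumption: no self-adjacency
  dimnames(adjacency) <- list(spatial_positions$ponto, spatial_positions$ponto)
  adjacency
}

# Five points on a line, one unit apart, as in the package example.
P <- data.frame(ponto = c("p1", "p2", "p3", "p4", "p5"),
                x = c(1, 2, 3, 4, 5), y = 0, z = 0,
                stringsAsFactors = FALSE)
adj <- sketch_adjacency_matrix(P, sigma = 1)
```

With sigma = 1, only consecutive points on the line are flagged as neighbours.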
S3 class definition for generate_candidates method.
generate_candidates(object, srg)
object |
a GSTSM object |
srg |
set of Solid Ranged Groups |
candidate sequences of size k + 1
Default method for generate_candidates. Does nothing.
## Default S3 method:
generate_candidates(object, srg)
object |
a GSTSM object |
srg |
set of Solid Ranged Groups |
candidate sequences of size k + 1
The algorithm combines SRGs with sequences of size k, received as input, to generate candidates with sequences of size k + 1. Let x and y be SRGs. They can be combined when three conditions hold: their time ranges intersect, their sets of spatial positions intersect (x.g ∩ y.g is non-empty), and they share a common subsequence: <x.s2, ..., x.sk> = <y.s1, ..., y.sk-1>.
## S3 method for class 'gstsm'
generate_candidates(object, srg)
object |
a GSTSM object |
srg |
set of Solid Ranged Groups |
candidate sequences of size k + 1
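The join conditions described for this method can be sketched in base R as follows. This is only an illustration under stated assumptions: the SRG layout (s for the sequence items, r for a c(start, end) time range, g for the spatial positions) and the helper names can_join and join_sequences are hypothetical and are not taken from the package.

```r
# Hypothetical SRG layout: s = sequence items, r = c(start, end), g = positions.
can_join <- function(x, y) {
  k <- length(x$s)
  # Time ranges intersect when the later start is not after the earlier end.
  time_overlap <- max(x$r[1], y$r[1]) <= min(x$r[2], y$r[2])
  # At least one spatial position in common.
  space_overlap <- length(intersect(x$g, y$g)) > 0
  # <x.s2, ..., x.sk> must equal <y.s1, ..., y.sk-1>.
  shared_subsequence <- length(y$s) == k && identical(x$s[-1], y$s[-k])
  time_overlap && space_overlap && shared_subsequence
}

# Build the size k + 1 sequence by appending the last item of y to x.
join_sequences <- function(x, y) {
  c(x$s, y$s[length(y$s)])
}

x <- list(s = c("A", "B"), r = c(1, 4), g = c("p1", "p2"))
y <- list(s = c("B", "C"), r = c(3, 6), g = c("p2", "p3"))
if (can_join(x, y)) candidate <- join_sequences(x, y)
```

Note that the check is not symmetric: x followed by y may be joinable while y followed by x is not.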
S3 class definition for GSTSM.
gstsm(sts_dataset, spatial_positions, gamma, beta, sigma)
sts_dataset |
STS dataset |
spatial_positions |
set of spatial positions |
gamma |
minimum temporal frequency |
beta |
minimum group size |
sigma |
maximum distance between group points |
This algorithm is designed to identify frequent sequences in STS datasets using the concept of Solid Ranged Groups (SRG). GSTSM is based on the candidate-generation principle. It starts by finding SRGs for sequences of size one, then explores the support and the number of occurrences of SRGs for larger sequences with a limited number of scans over the database.
a GSTSM object
library("gstsm")

D <- as.data.frame(matrix(c("B", "B", "A", "C", "A",
                            "C", "B", "C", "A", "B",
                            "C", "C", "A", "C", "A",
                            "B", "B", "D", "A", "B",
                            "B", "D", "D", "B", "D"),
                          nrow = 5, ncol = 5, byrow = TRUE))

ponto <- c("p1", "p2", "p3", "p4", "p5")
x <- c(1, 2, 3, 4, 5)
y <- c(0, 0, 0, 0, 0)
z <- y
P <- data.frame(ponto = ponto, x = x, y = y, z = z, stringsAsFactors = FALSE)

gamma <- 0.8
beta <- 2
sigma <- 1

gstsm_object <- gstsm(D, P, gamma, beta, sigma)
result <- mine(gstsm_object)
S3 class definition for merge method.
merge(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
The goal is to merge KRGs. Let q and u be two different KRGs of the same candidate sequence. They can be merged into a group qu = q ∪ u as long as they intersect and qu has a frequency greater than or equal to the minimum frequency defined by the user.
merge_kernel_ranged_groups(c, gamma)
c |
candidate |
gamma |
minimum temporal frequency |
KRG
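The merge rule can be illustrated with a small base R sketch. Everything about the data layout here is an assumption made for the example: a KRG is modelled as a list with positions g, a c(start, end) time range r and the timestamps t at which the sequence occurs; "intersection" is read as overlapping time ranges; and frequency is taken as the number of occurrence timestamps divided by the length of the time range. The helper name merge_if_frequent is hypothetical.

```r
# Hypothetical KRG layout: g = positions, r = c(start, end), t = occurrence timestamps.
merge_if_frequent <- function(q, u, gamma) {
  # Assumption: "having an intersection" means the time ranges overlap.
  overlaps <- max(q$r[1], u$r[1]) <= min(q$r[2], u$r[2])
  if (!overlaps) return(NULL)
  qu <- list(
    g = union(q$g, u$g),
    r = c(min(q$r[1], u$r[1]), max(q$r[2], u$r[2])),
    t = sort(union(q$t, u$t))
  )
  # Assumption: frequency = occurrences / number of timestamps in the range.
  frequency <- length(qu$t) / (qu$r[2] - qu$r[1] + 1)
  if (frequency >= gamma) qu else NULL
}

q <- list(g = c("p1", "p2"), r = c(1, 4), t = c(1, 2, 3, 4))
u <- list(g = c("p2", "p3"), r = c(3, 6), t = c(3, 4, 6))
merged <- merge_if_frequent(q, u, gamma = 0.8)
```

In this example the merged group covers timestamps 1 to 6 with five occurrences, so it survives a threshold of 0.8 but would be rejected at 0.9.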
The goal is to stretch KRGs of the same candidate sequence. This is possible if two KRGs intersect in space and the resulting KRG keeps its frequency equal to or greater than beta.
merge_open_kernel_ranged_groups(c, timestamp, gamma, beta, adjacency_matrix)
c |
candidate |
timestamp |
current timestamp |
gamma |
minimum temporal frequency |
beta |
minimum group size |
adjacency_matrix |
adjacency matrix |
Set of updated KRGs
Default method for merge. Does nothing.
## Default S3 method:
merge(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
Merge - GSTSM implementation
## S3 method for class 'gstsm'
merge(object, ck)
object |
a GSTSM object |
ck |
set of candidates |
Solid Ranged-Group(s) of all candidate sequences
S3 class definition for mine method.
mine(object)
object |
a GSTSM object |
all Solid Ranged Group(s) found, of all sizes
Default method for mine. Does nothing.
## Default S3 method:
mine(object)
object |
a GSTSM object |
all Solid Ranged Group(s) found, of all sizes
Mine - GSTSM implementation
## S3 method for class 'gstsm'
mine(object)
object |
a GSTSM object |
all Solid Ranged Group(s) found, of all sizes
Helper function that splits groups.
split_groups(pos, adjacency_matrix)
pos |
sequence occurrence index |
adjacency_matrix |
possible connection between positions |
new set based on candidate c found in d.
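One plausible reading of this helper is that it partitions the positions where a sequence occurs into spatially connected groups. The base R sketch below illustrates that idea only; it assumes pos is a vector of integer position indices and adjacency_matrix is a logical matrix, and the name sketch_split_groups is hypothetical. The package's actual behaviour may differ.

```r
# Hypothetical sketch: split occurrence indices into connected components.
sketch_split_groups <- function(pos, adjacency_matrix) {
  groups <- list()
  remaining <- pos
  while (length(remaining) > 0) {
    # Start a new group from the first unassigned position.
    group <- remaining[1]
    frontier <- remaining[1]
    remaining <- remaining[-1]
    while (length(frontier) > 0) {
      current <- frontier[1]
      frontier <- frontier[-1]
      # Unassigned positions directly connected to the current one.
      neighbours <- remaining[adjacency_matrix[current, remaining]]
      group <- c(group, neighbours)
      frontier <- c(frontier, neighbours)
      remaining <- setdiff(remaining, neighbours)
    }
    groups[[length(groups) + 1]] <- sort(group)
  }
  groups
}

# Five positions on a line; only consecutive positions are connected.
adj <- abs(outer(1:5, 1:5, "-")) == 1
groups <- sketch_split_groups(c(1L, 2L, 4L, 5L), adj)
```

Here position 3 is missing from the occurrences, so the result contains two groups: positions 1 and 2, and positions 4 and 5.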
The function receives as input the set of RGs of a candidate and the minimum size of a group (beta). It starts by defining the set of elements to be removed from the set of RGs: those that do not have the minimum group size.
validate_and_close(c, gamma, beta)
c |
candidate |
gamma |
minimum temporal frequency |
beta |
minimum group size |
validated Greedy-Ranged-Groups.
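The size filter described for this function can be sketched in base R as shown below. This is a simplified illustration, not the package's code: it assumes each RG is a list whose g element holds its spatial positions, it ignores the gamma check, and the name drop_small_groups is hypothetical.

```r
# Hypothetical sketch: keep only the RGs that reach the minimum group size.
drop_small_groups <- function(ranged_groups, beta) {
  # Flag every RG whose number of positions is below beta.
  too_small <- vapply(ranged_groups,
                      function(rg) length(rg$g) < beta,
                      logical(1))
  ranged_groups[!too_small]
}

rgs <- list(list(g = c("p1", "p2", "p3")),
            list(g = "p5"))
kept <- drop_small_groups(rgs, beta = 2)
```

With beta = 2, the single-position group is discarded and only the three-position group remains.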
Its objective is to verify that the user thresholds are respected in each RG, checking whether it can still be stretched while keeping the frequency greater than or equal to the minimum gamma, and whether the minimum group size beta is met. It takes as input the set of RGs of a candidate sequence, the timestamp of the start of the current sliding window, and the user-defined thresholds gamma and beta.
validate_kernel_ranged_groups(c, timestamp, gamma, beta)
c |
candidate |
timestamp |
current timestamp |
gamma |
minimum temporal frequency |
beta |
minimum group size |
Validated Kernel-Ranged-Groups.