Package 'gstsm'

Title: Generalized Spatial-Time Sequence Miner
Description: Implementations of the algorithms presented in the article "Generalized Spatial-Time Sequence Miner", original title (Castro, Antonio; Borges, Heraldo; Pacitti, Esther; Porto, Fabio; Coutinho, Rafaelli; Ogasawara, Eduardo. Generalização de Mineração de Sequências Restritas no Espaço e no Tempo. In: XXXVI SBBD - Simpósio Brasileiro de Banco de Dados, 2021 <doi:10.5753/sbbd.2021.17891>).
Authors: Antonio Castro [aut, cre], Cássio Souza [aut, ctb], Jorge Rodrigues [aut, ctb], Esther Pacitti [aut], Fábio Porto [aut], Florent Masseglia [aut], Rafaelli Coutinho [aut, ths], Eduardo Ogasawara [aut, ths], Federal Center for Technological Education of Rio de Janeiro [cph]
Maintainer: Antonio Castro <[email protected]>
License: MIT + file LICENSE
Version: 1.0.0
Built: 2024-12-12 06:48:56 UTC
Source: CRAN

Help Index


Find - definition

Description

S3 class definition for find method.

Usage

find(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences


Find Kernel Ranged Group

Description

The goal is to find the Kernel Ranged Group information for a candidate c.

Usage

find_kernel_ranged_group(c, d, gamma, beta, adjacency_matrix)

Arguments

c

candidate

d

set of transactions

gamma

minimum temporal frequency

beta

minimum group size

adjacency_matrix

adjacency matrix

Value

Kernel Ranged-Group(s) of c updated


Find - default

Description

Default method for find. Does nothing.

Usage

## Default S3 method:
find(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences


Find - GSTSM implementation

Description

GSTSM implementation of the find method. The goal is to find the Ranged Groups information for a candidate c.

Usage

## S3 method for class 'gstsm'
find(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences
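
The definition, default, and GSTSM implementation entries above follow R's usual S3 pattern of a generic plus methods. The following is a minimal sketch of that pattern. It uses a hypothetical generic named find_demo (not part of the package) so that nothing exported by gstsm is masked, and the method bodies are placeholders rather than the package's logic.

```r
# Hypothetical generic: dispatches on the class of `object`.
find_demo <- function(object, ck) {
  UseMethod("find_demo")
}

# Default method: reached for any object without a more specific method.
find_demo.default <- function(object, ck) {
  "default"
}

# Method for objects of class "gstsm" (placeholder body).
find_demo.gstsm <- function(object, ck) {
  "gstsm"
}

obj <- structure(list(), class = "gstsm")
res_gstsm <- find_demo(obj, ck = list())  # dispatches to find_demo.gstsm
res_other <- find_demo(42, ck = list())   # falls back to find_demo.default
```

The same layout applies to the generate_candidates, merge, and mine entries in this manual.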


Generate Adjacency Matrix

Description

Helper function that generates an adjacency matrix.

Usage

generate_adjacency_matrix(spatial_positions, sigma)

Arguments

spatial_positions

set of spatial positions

sigma

maximum distance between group points

Value

Adjacency Matrix
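
As an illustration of what such a matrix can look like, here is a minimal base-R sketch. It is not the package's implementation. It assumes that two positions are adjacent when their Euclidean distance is at most sigma, that the positions data frame has the columns ponto, x, y and z used in the gstsm example, and that a position is not adjacent to itself. The helper name sketch_adjacency is hypothetical.

```r
# Hypothetical helper, not the package's implementation.
# Assumption: positions are adjacent when their Euclidean distance is <= sigma.
sketch_adjacency <- function(spatial_positions, sigma) {
  coords <- as.matrix(spatial_positions[, c("x", "y", "z")])
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances
  adj <- d <= sigma
  diag(adj) <- FALSE             # assumption: no self-loops
  dimnames(adj) <- list(spatial_positions$ponto, spatial_positions$ponto)
  adj
}

# Five positions on a line, one unit apart.
P <- data.frame(ponto = c("p1", "p2", "p3", "p4", "p5"),
                x = c(1, 2, 3, 4, 5), y = 0, z = 0,
                stringsAsFactors = FALSE)
adj <- sketch_adjacency(P, sigma = 1)
```

With sigma = 1, only consecutive positions on the line end up adjacent.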


Generate Candidates - definition

Description

S3 class definition for generate_candidates method.

Usage

generate_candidates(object, srg)

Arguments

object

a GSTSM object

srg

set of Solid Ranged Groups

Value

candidate sequences of size k + 1


Generate Candidates - default

Description

Default method for generate_candidates. Does nothing.

Usage

## Default S3 method:
generate_candidates(object, srg)

Arguments

object

a GSTSM object

srg

set of Solid Ranged Groups

Value

candidate sequences of size k + 1


Generate Candidates - GSTSM implementation

Description

The algorithm combines SRGs whose sequences have size k, received as input, to generate candidates with sequences of size k + 1. Let x and y be SRGs. They are combined when three conditions hold: their time ranges intersect, their sets of spatial positions intersect (x.g ∩ y.g is not empty), and they share a common subsequence: <x.s_2, ..., x.s_k> = <y.s_1, ..., y.s_(k-1)>.

Usage

## S3 method for class 'gstsm'
generate_candidates(object, srg)

Arguments

object

a GSTSM object

srg

set of Solid Ranged Groups

Value

candidate sequences of size k + 1
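
The three join conditions described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code. The SRG representation (a list with fields s, g and r) and the helpers can_join and join are hypothetical, as is the choice to give the new candidate the intersections of the two time ranges and position sets.

```r
# Hypothetical SRG representation (an assumption, not the package's structure):
#   s: sequence items, g: spatial positions, r: c(start, end) time range.
can_join <- function(x, y) {
  k <- length(x$s)
  time_overlap  <- max(x$r[1], y$r[1]) <= min(x$r[2], y$r[2])
  space_overlap <- length(intersect(x$g, y$g)) > 0
  # <x.s_2, ..., x.s_k> must equal <y.s_1, ..., y.s_(k-1)>
  common <- length(y$s) == k && identical(x$s[-1], y$s[-k])
  time_overlap && space_overlap && common
}

# Assumption: the candidate keeps only the shared positions and time range.
join <- function(x, y) {
  list(s = c(x$s, y$s[length(y$s)]),
       g = intersect(x$g, y$g),
       r = c(max(x$r[1], y$r[1]), min(x$r[2], y$r[2])))
}

x <- list(s = c("A", "B"), g = c("p1", "p2", "p3"), r = c(1, 4))
y <- list(s = c("B", "C"), g = c("p2", "p3", "p4"), r = c(3, 6))
z <- list(s = c("C", "A"), g = c("p2"), r = c(1, 6))

ok_xy <- can_join(x, y)   # suffix of x equals prefix of y
ok_xz <- can_join(x, z)   # no common subsequence
cand  <- join(x, y)       # candidate of size k + 1 = 3
```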


GSTSM

Description

S3 class definition for GSTSM.

Usage

gstsm(sts_dataset, spatial_positions, gamma, beta, sigma)

Arguments

sts_dataset

STS dataset

spatial_positions

set of spatial positions

gamma

minimum temporal frequency

beta

minimum group size

sigma

maximum distance between group points

Details

This algorithm is designed to identify frequent sequences in STS datasets using the concept of Solid Ranged Groups (SRGs). GSTSM is based on the candidate-generation principle. It starts by finding SRGs for sequences of size one. Then it explores the support and the number of occurrences of SRGs for larger sequences with a limited number of scans over the database.

Value

a GSTSM object

Examples

library("gstsm")

D <- as.data.frame(matrix(c("B", "B", "A", "C", "A",
              "C", "B", "C", "A", "B",
              "C", "C", "A", "C", "A",
              "B", "B", "D", "A", "B",
              "B", "D", "D", "B", "D"
            ), nrow = 5, ncol = 5, byrow = TRUE))

ponto <- c("p1", "p2", "p3", "p4", "p5")
x <- c(1, 2, 3, 4, 5)
y <- c(0, 0, 0, 0, 0)
z <- y
P <- data.frame(ponto=ponto, x=x, y=y, z=z, stringsAsFactors = FALSE)

gamma <- 0.8
beta <- 2
sigma <- 1

gstsm_object <- gstsm(D, P, gamma, beta, sigma)

result <- mine(gstsm_object)

Merge - definition

Description

S3 class definition for merge method.

Usage

merge(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences


Merge Kernel Ranged Groups

Description

The goal is to merge KRGs. Let q and u be two different KRGs from the same candidate sequence. They can be merged into a group qu = q ∪ u as long as they intersect and qu has a frequency greater than or equal to the minimum frequency defined by the user (gamma).

Usage

merge_kernel_ranged_groups(c, gamma)

Arguments

c

candidate

gamma

minimum temporal frequency

Value

KRG
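
A rough sketch of the merge rule described above is given below. It rests on several assumptions that the manual does not confirm: a KRG is modelled as a list with spatial positions g, a time range r and occurrence timestamps occ; "intersection" is taken to mean shared spatial positions; and frequency is taken to be the number of distinct occurrence timestamps divided by the length of the time range. The helpers krg_frequency and try_merge are hypothetical.

```r
# Hypothetical KRG representation: g = spatial positions,
# r = c(start, end) time range, occ = timestamps at which the sequence occurs.
# Assumption: frequency = distinct occurrence timestamps / range length.
krg_frequency <- function(k) {
  length(unique(k$occ)) / (k$r[2] - k$r[1] + 1)
}

# Returns the merged group qu, or NULL when the merge is not allowed.
try_merge <- function(q, u, gamma) {
  if (length(intersect(q$g, u$g)) == 0) return(NULL)  # no intersection
  qu <- list(g = union(q$g, u$g),
             r = c(min(q$r[1], u$r[1]), max(q$r[2], u$r[2])),
             occ = union(q$occ, u$occ))
  if (krg_frequency(qu) >= gamma) qu else NULL
}

q <- list(g = c("p1", "p2"), r = c(1, 4),  occ = c(1, 2, 3, 4))
u <- list(g = c("p2", "p3"), r = c(2, 5),  occ = c(2, 3, 4, 5))
v <- list(g = c("p3", "p4"), r = c(9, 10), occ = c(9, 10))

merged   <- try_merge(q, u, gamma = 0.8)  # shared p2, frequency stays high
rejected <- try_merge(u, v, gamma = 0.8)  # shared p3, but the gap lowers frequency
```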


Merge Open Kernel Ranged Groups

Description

The goal is to stretch KRGs of the same candidate sequence. This is possible if two KRGs intersect in space and the resulting KRG keeps its frequency equal to or greater than beta.

Usage

merge_open_kernel_ranged_groups(c, timestamp, gamma, beta, adjacency_matrix)

Arguments

c

candidate

timestamp

current timestamp

gamma

minimum temporal frequency

beta

minimum group size

adjacency_matrix

adjacency matrix

Value

Set of updated KRGs


Merge - default

Description

Default method for merge. Does nothing.

Usage

## Default S3 method:
merge(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences


Merge - GSTSM implementation

Description

Merge - GSTSM implementation

Usage

## S3 method for class 'gstsm'
merge(object, ck)

Arguments

object

a GSTSM object

ck

set of candidates

Value

Solid Ranged-Group(s) of all candidate sequences


Mine - definition

Description

S3 class definition for mine method.

Usage

mine(object)

Arguments

object

a GSTSM object

Value

all Solid Ranged Group(s) found, of all sizes


Mine - default

Description

Default method for mine. Does nothing.

Usage

## Default S3 method:
mine(object)

Arguments

object

a GSTSM object

Value

all Solid Ranged Group(s) found, of all sizes


Mine - GSTSM implementation

Description

Mine - GSTSM implementation

Usage

## S3 method for class 'gstsm'
mine(object)

Arguments

object

a GSTSM object

Value

all Solid Ranged Group(s) found, of all sizes


Split Groups

Description

Helper function that splits groups.

Usage

split_groups(pos, adjacency_matrix)

Arguments

pos

sequence occurrence index

adjacency_matrix

possible connection between positions

Value

new set based on candidate c found in d.
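
One plausible reading of this helper, offered only as an assumption, is that the positions where a sequence occurs are partitioned into groups of mutually reachable positions, that is, the connected components induced by the adjacency matrix. The sketch below implements that reading in base R. The function name sketch_split is hypothetical and the code is not taken from the package.

```r
# Hypothetical helper. Assumption: "splitting" means partitioning the
# positions in `pos` into connected components of the adjacency matrix.
sketch_split <- function(pos, adjacency_matrix) {
  groups <- list()
  remaining <- pos
  while (length(remaining) > 0) {
    group <- remaining[1]
    frontier <- remaining[1]
    remaining <- remaining[-1]
    while (length(frontier) > 0) {
      # Remaining positions adjacent to any position in the frontier.
      links <- adjacency_matrix[frontier, remaining, drop = FALSE]
      linked <- remaining[colSums(links) > 0]
      group <- c(group, linked)
      remaining <- setdiff(remaining, linked)
      frontier <- linked
    }
    groups[[length(groups) + 1]] <- sort(group)
  }
  groups
}

# Chain p1 - p2 - p3 - p4 - p5: only consecutive positions are adjacent.
adj <- abs(outer(1:5, 1:5, "-")) == 1
# The sequence occurs at positions 1, 2, 4 and 5; position 3 breaks the chain.
groups <- sketch_split(c(1, 2, 4, 5), adj)
```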


Validate and Close

Description

The function receives as input the set of RGs of a candidate and the minimum group size (beta). It starts by defining the set of elements to be removed from the set of RGs: those that do not have the minimum group size.

Usage

validate_and_close(c, gamma, beta)

Arguments

c

candidate

gamma

minimum temporal frequency

beta

minimum group size

Value

validated Greedy-Ranged-Groups.
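
The size filter described above can be sketched as follows. This only illustrates the beta check. The RG representation (a list whose g field holds the spatial positions) and the helper drop_small_groups are hypothetical, and the package's function may perform further steps that are not shown here.

```r
# Hypothetical RG representation: each RG is a list whose `g` holds its
# spatial positions. This is not the package's internal structure.
drop_small_groups <- function(rgs, beta) {
  # Mark the RGs that do not reach the minimum group size.
  to_remove <- vapply(rgs, function(rg) length(rg$g) < beta, logical(1))
  rgs[!to_remove]
}

rgs <- list(list(g = c("p1", "p2")),
            list(g = "p4"),
            list(g = c("p3", "p4", "p5")))
kept <- drop_small_groups(rgs, beta = 2)  # the single-position RG is dropped
```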


Validate Kernel Ranged Groups

Description

Its objective is to verify that the user thresholds are respected in each RG, checking whether it can still be stretched while keeping its frequency greater than or equal to the minimum gamma, and whether it has the minimum group size beta. It takes as input a set of RGs of a candidate sequence, the timestamp of the start of the current sliding window, and the user-defined thresholds gamma and beta.

Usage

validate_kernel_ranged_groups(c, timestamp, gamma, beta)

Arguments

c

candidate

timestamp

current timestamp

gamma

minimum temporal frequency

beta

minimum group size

Value

Validated Kernel-Ranged-Groups.