Package 'ROKET'

Title: Optimal Transport-Based Kernel Regression
Description: Perform optimal transport on somatic point mutations and kernel regression hypothesis testing by integrating pathway level similarities at the gene level (Little et al. (2023) <doi:10.1111/biom.13769>). The software implements balanced and unbalanced optimal transport and omnibus tests with 'C++' across a set of tumor samples and allows for multi-threading to decrease computational runtime.
Authors: Paul Little [aut, cre]
Maintainer: Paul Little <[email protected]>
License: GPL (>= 3)
Version: 1.0.0
Built: 2025-03-06 18:32:22 UTC
Source: CRAN

Help Index


kernTEST

Description

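Runs kernel regression hypothesis tests on one or more kernel matrices and returns permutation-based p-values for each kernel along with omnibus p-values.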

Usage

kernTEST(
  RESI = NULL,
  KK,
  YY = NULL,
  XX = NULL,
  OMNI,
  nPERMS = 1e+05,
  ncores = 1
)

Arguments

RESI

A numeric vector of null model residuals. names(RESI) must be set to maintain sample ordering. Supply RESI for survival regression; otherwise set RESI to NULL.

KK

An array containing double-centered positive semi-definite kernel matrices. Refer to MiRKAT::D2K() for transforming distance matrices to kernel matrices. The dimnames(KK)[[1]] and dimnames(KK)[[2]] must match names(RESI). Also set dimnames(KK)[[3]] to keep track of each kernel matrix.

YY

A numeric vector of continuous outcomes to be fitted in a linear model. Defaults to NULL for survival model.

XX

A numeric data matrix whose first column is the intercept (a column of ones).

OMNI

A matrix of zeros and ones. Each column corresponds to a distance matrix while each row corresponds to an omnibus test. Set rownames(OMNI) to label the output p-values, and set colnames(OMNI) to match dimnames(KK)[[3]].

nPERMS

A positive integer specifying the number of permutations used to calculate p-values.

ncores

A positive integer for the number of cores/threads to reduce computational runtime when running for loops

Value

An R list of p-values and omnibus p-values.
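
The inputs described above can be assembled as in the following sketch. It uses only base R and toy data: the three distance matrices stand in for real optimal transport distances, and dist_to_kernel() is a hypothetical base-R stand-in for MiRKAT::D2K() (double-centering followed by truncation of negative eigenvalues). All object names other than the kernTEST argument names are assumptions made for illustration.

```r
set.seed(1)
nn <- 20
samp <- paste0("S", seq_len(nn))

# Toy features; three distance matrices stand in for real OT distances
feat <- matrix(rnorm(nn * 5), nrow = nn, dimnames = list(samp, NULL))
dists <- list(
  euclid = as.matrix(dist(feat, method = "euclidean")),
  manhat = as.matrix(dist(feat, method = "manhattan")),
  maxim  = as.matrix(dist(feat, method = "maximum"))
)

# Hypothetical stand-in for MiRKAT::D2K(): double-center -D^2 / 2, then
# set negative eigenvalues to zero so the kernel is positive semi-definite
dist_to_kernel <- function(D) {
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)
  K <- -0.5 * J %*% (D^2) %*% J
  K <- (K + t(K)) / 2
  eig <- eigen(K, symmetric = TRUE)
  K <- eig$vectors %*% diag(pmax(eig$values, 0)) %*% t(eig$vectors)
  dimnames(K) <- dimnames(D)
  K
}

# Stack kernels into an array; dimnames track samples and kernel labels
KK <- array(NA_real_, dim = c(nn, nn, length(dists)),
            dimnames = list(samp, samp, names(dists)))
for (nm in names(dists)) KK[, , nm] <- dist_to_kernel(dists[[nm]])

# One omnibus test per row; a 1 marks each kernel included in that test
OMNI <- rbind(all_three = c(1, 1, 1), first_two = c(1, 1, 0))
colnames(OMNI) <- dimnames(KK)[[3]]

# Continuous outcome and design matrix (intercept column first)
XX <- cbind(intercept = 1, covar = rnorm(nn))
rownames(XX) <- samp
YY <- as.numeric(XX %*% c(1, 0.5) + rnorm(nn))
names(YY) <- samp
```

With these objects, a linear-model call following the Usage above would take the form kernTEST(KK = KK, YY = YY, XX = XX, OMNI = OMNI, nPERMS = 1000). For survival regression, pass the named null model residuals as RESI and leave YY as NULL.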


kOT_sim_AGG

Description

kOT_sim_AGG

Usage

kOT_sim_AGG(work_dir)

Arguments

work_dir

Full path to the directory under which "sim_ROKET" and its subdirectories are created.

Value

Nothing. PNG files are created within the simulation ROKET directory.


kOT_sim_make

Description

Generates simulation files

Usage

kOT_sim_make(work_dir, NN = 200, nGENE = 500, nPATH = 25, RR = 200)

Arguments

work_dir

Full path to the directory under which "sim_ROKET" and its subdirectories are created.

NN

A positive integer for sample size

nGENE

A positive integer for number of genes to simulate

nPATH

A positive integer for number of pathways to simulate

RR

A positive integer for number of replicates to simulate

Value

Nothing. RDS files are created within the simulation ROKET directory.


kOT_sim_OT

Description

Runs the optimal transport calculations for one simulation scenario.

Usage

kOT_sim_OT(work_dir, NN, nGENE, nPATH, SCEN, ncores = 1)

Arguments

work_dir

Full path to the directory under which "sim_ROKET" and its subdirectories are created.

NN

A positive integer for sample size

nGENE

A positive integer for number of genes to simulate

nPATH

A positive integer for number of pathways to simulate

SCEN

An integer taking values 1, 2, 3, or 4

ncores

A positive integer specifying the number of cores/threads to use for optimal transport calculations

Value

Nothing. RDS files are created within the simulation ROKET directory.


kOT_sim_REG

Description

kOT_sim_REG

Usage

kOT_sim_REG(work_dir, NN, nGENE, nPATH, SCEN, rr)

Arguments

work_dir

Full path to the directory under which "sim_ROKET" and its subdirectories are created.

NN

A positive integer for sample size

nGENE

A positive integer for number of genes to simulate

nPATH

A positive integer for number of pathways to simulate

SCEN

An integer taking values 1, 2, 3, or 4

rr

A positive integer indexing a replicate

Value

Nothing. An RDS file is created within the simulation ROKET directory.


run_myOT

Description

Runs balanced or unbalanced optimal transport on two input vectors

Usage

run_myOT(
  XX,
  YY,
  COST,
  EPS,
  LAMBDA1,
  LAMBDA2 = NULL,
  balance = FALSE,
  conv = 1e-05,
  max_iter = 3000,
  verbose = TRUE,
  show_iter = 50
)

Arguments

XX

A numeric vector of positive masses

YY

A numeric vector of positive masses

COST

A numeric matrix of non-negative values representing the costs to transport masses between features of XX and YY. The rows of COST and features of XX need to be aligned. The columns of COST and features of YY need to be aligned.

EPS

A positive numeric value representing the tuning parameter for entropic regularization.

LAMBDA1

A non-negative numeric value representing the tuning parameter penalizing the distance between XX and the row sums of the optimal transport matrix.

LAMBDA2

A non-negative numeric value representing the tuning parameter penalizing the distance between YY and the column sums of the optimal transport matrix.

balance

Boolean set to TRUE to run balanced optimal transport regardless of LAMBDA1 and LAMBDA2. Otherwise run unbalanced optimal transport.

conv

A positive numeric value to determine algorithmic convergence. The default value is 1e-5.

max_iter

A positive integer denoting the maximum iterations to run the algorithm.

verbose

Boolean value to display verbose function output.

show_iter

A positive integer to display iteration details at multiples of show_iter but only if verbose = TRUE.

Value

An R list containing the optimal transport matrix and associated distance metric.
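
To make the roles of EPS, conv, and max_iter concrete, the following base-R sketch runs a generic textbook Sinkhorn iteration for balanced, entropically regularized optimal transport. It is an illustration only and is not the package's 'C++' implementation; the function name sinkhorn_balanced, its stopping rule, and its return fields are assumptions, and the unbalanced case controlled by LAMBDA1 and LAMBDA2 is not covered.

```r
# Generic balanced Sinkhorn sketch (not ROKET's implementation)
sinkhorn_balanced <- function(xx, yy, cost, eps, conv = 1e-5, max_iter = 3000) {
  stopifnot(length(xx) == nrow(cost), length(yy) == ncol(cost))
  stopifnot(all(xx > 0), all(yy > 0), abs(sum(xx) - sum(yy)) < 1e-8)
  K <- exp(-cost / eps)              # Gibbs kernel; smaller eps = less smoothing
  u <- rep(1, length(xx))
  v <- rep(1, length(yy))
  for (iter in seq_len(max_iter)) {
    u_new <- xx / as.vector(K %*% v)            # match row sums to xx
    v_new <- yy / as.vector(crossprod(K, u_new)) # match column sums to yy
    delta <- max(abs(u_new - u), abs(v_new - v))
    u <- u_new
    v <- v_new
    if (delta < conv) break          # stop once the scalings stabilize
  }
  # Transport plan diag(u) %*% K %*% diag(v)
  plan <- u * K * rep(v, each = length(u))
  list(plan = plan, dist = sum(plan * cost), iter = iter)
}

# Toy masses on three features; cost is the absolute index difference
xx <- c(0.5, 0.3, 0.2)
yy <- c(0.2, 0.2, 0.6)
cost <- abs(outer(1:3, 1:3, "-"))
res <- sinkhorn_balanced(xx, yy, cost, eps = 0.5)
```

On the same toy inputs, the corresponding package call following the Usage above would take the form run_myOT(XX = xx, YY = yy, COST = cost, EPS = 0.5, LAMBDA1 = 1, LAMBDA2 = 1, balance = TRUE); the LAMBDA values here are arbitrary placeholders.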


run_myOTs

Description

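Runs balanced or unbalanced optimal transport between pairs of samples (columns of ZZ) and returns the pairwise distances.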

Usage

run_myOTs(
  ZZ,
  COST,
  EPS,
  LAMBDA1,
  LAMBDA2 = NULL,
  balance,
  conv = 1e-05,
  max_iter = 3000,
  ncores = 1,
  verbose = TRUE,
  show_iter = 50
)

Arguments

ZZ

A numeric matrix of non-negative mass to transport. Rows correspond to features (e.g. genes) and columns correspond to samples or individuals. Each column must have strictly positive mass.

COST

A numeric square matrix of non-negative costs to transport masses between pairs of features.

EPS

A positive numeric value representing the tuning parameter for entropic regularization.

LAMBDA1

A non-negative numeric value representing the tuning parameter penalizing the distance between the first sample's masses (XX in run_myOT) and the row sums of the optimal transport matrix.

LAMBDA2

A non-negative numeric value representing the tuning parameter penalizing the distance between the second sample's masses (YY in run_myOT) and the column sums of the optimal transport matrix.

balance

Boolean set to TRUE to run balanced optimal transport regardless of LAMBDA1 and LAMBDA2. Otherwise run unbalanced optimal transport.

conv

A positive numeric value to determine algorithmic convergence. The default value is 1e-5.

max_iter

A positive integer denoting the maximum iterations to run the algorithm.

ncores

A positive integer for the number of cores/threads to reduce computational runtime when running for loops

verbose

Boolean value to display verbose function output.

show_iter

A positive integer to display iteration details at multiples of show_iter but only if verbose = TRUE.

Value

An R numeric matrix of pairwise distances.
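
The layout of ZZ and COST and the shape of the returned matrix can be sketched in base R as follows. The helper pairwise_dist and the L1 stand-in distance are assumptions for illustration; they show how a samples-by-samples matrix is filled from the columns of ZZ, not how the package computes optimal transport distances.

```r
# Features (genes) in rows, samples in columns; every column has positive mass
ZZ <- matrix(c(1, 0, 2, 0,
               0, 1, 0, 3,
               1, 0, 2, 0),
             nrow = 4,
             dimnames = list(paste0("gene", 1:4), paste0("S", 1:3)))

# Hypothetical square cost matrix between features: 0 on the diagonal, 1 elsewhere
COST <- 1 - diag(nrow(ZZ))
dimnames(COST) <- list(rownames(ZZ), rownames(ZZ))

# Fill a symmetric samples-by-samples matrix from any two-vector distance
pairwise_dist <- function(ZZ, dist_fun) {
  stopifnot(is.matrix(ZZ), all(ZZ >= 0), all(colSums(ZZ) > 0))
  n <- ncol(ZZ)
  DD <- matrix(0, n, n, dimnames = list(colnames(ZZ), colnames(ZZ)))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      DD[i, j] <- DD[j, i] <- dist_fun(ZZ[, i], ZZ[, j])
    }
  }
  DD
}

# Stand-in distance (not optimal transport): L1 between normalized profiles
l1_norm <- function(x, y) sum(abs(x / sum(x) - y / sum(y)))
DD <- pairwise_dist(ZZ, l1_norm)
```

With the package loaded, the analogous call following the Usage above would take the form run_myOTs(ZZ = ZZ, COST = COST, EPS = 0.5, LAMBDA1 = 1, LAMBDA2 = 1, balance = FALSE, ncores = 1), where the EPS and LAMBDA values are arbitrary placeholders. The resulting distance matrix can then be converted to a kernel matrix, for example with MiRKAT::D2K(), before being passed to kernTEST.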