Package 'emcAdr' reference manual

Title:	Evolutionary Version of the Metropolis-Hastings Algorithm
Description:	Provides computational methods for detecting adverse high-order drug interactions from individual case safety reports using statistical techniques, allowing the exploration of higher-order interactions among drug cocktails.
Authors:	Jules Bangard [aut, cre]
Maintainer:	Jules Bangard <[email protected]>
License:	GPL-3
Version:	1.2
Built:	2025-02-27 18:33:33 UTC
Source:	CRAN

Evolutionary Version of the Metropolis-Hastings Algorithm

Description

Provides computational methods for detecting adverse high-order drug interactions from individual case safety reports using statistical techniques, allowing the exploration of higher-order interactions among drug cocktails.

Author(s)

Jules Bangard [aut, cre] (<https://orcid.org/0009-0007-4670-7860>)

Maintainer: Jules Bangard <[email protected]>

ATC Tree Upper Bound 2024

Description

Example dataset representing the ATC tree structure, sourced from the WHO website (2024-02-23). This dataset is provided for demonstration and testing purposes with the package.

Usage

ATC_Tree_UpperBound_2024

Format

A data frame with 4 variables:

ATCCode: The code of ATC nodes
Name: The name of ATC nodes
ATC_length: The number of characters in the ATCCode
upperBound: The index of the last child node in the tree

Source

World Health Organization, ATC classification register
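
To illustrate what the upperBound column encodes, here is a minimal base-R sketch on a hypothetical toy tree (not the package dataset). It assumes that rows are stored in depth-first (DFS) order, that the descendants of a node are the codes it prefixes, and that indices are 1-based; the real dataset may use a different index base.

```r
# Hypothetical toy ATC tree, already sorted in DFS order (not the package dataset).
toy_tree <- data.frame(
  ATCCode = c("A", "A01", "A01A", "A01AA", "A01AA30",
              "A01AB", "A01AB03", "A10", "A10A"),
  stringsAsFactors = FALSE
)
toy_tree$ATC_length <- nchar(toy_tree$ATCCode)

# upperBound of node i: index of the last row whose code starts with code i.
toy_tree$upperBound <- vapply(
  toy_tree$ATCCode,
  function(code) max(which(startsWith(toy_tree$ATCCode, code))),
  integer(1),
  USE.NAMES = FALSE
)

# Under this layout, node j lies in the subtree of node i
# exactly when i <= j <= upperBound[i].
is_in_subtree <- function(i, j) j >= i && j <= toy_tree$upperBound[i]
```

With such a layout, testing whether one ATC node is a descendant of another reduces to two integer comparisons, which is presumably why the upper bound is stored alongside the tree.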

Convert the ATC codes of each patient to the corresponding DFS numbers in the ATC tree

Description

Convert the ATC codes of each patient to the corresponding DFS numbers in the ATC tree

Usage

ATCtoNumeric(patientATC, tree)

Arguments

`patientATC`	: Patient observations: for each patient, a string containing the medications taken (ATC codes)
`tree`	: ATC tree (assumed to have a column named 'ATCCode')

Value

A matrix of the same size as patientATC, containing the integer index of each corresponding ATC code.

Examples

 ATC_code <- c('A01AA30 A01AB03', 'A10AC30')
 ATCtoNumeric(ATC_code, ATC_Tree_UpperBound_2024)
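
The lookup described above can be sketched in base R as follows. This is only an illustration on a hypothetical toy tree: it assumes codes are separated by single spaces and returns 1-based row positions as a list, whereas the package function may use a different index base and return type.

```r
# Hypothetical toy tree in DFS order; only the ATCCode column is needed.
toy_tree <- data.frame(
  ATCCode = c("A", "A01", "A01A", "A01AA", "A01AA30", "A01AB",
              "A01AB03", "A10", "A10A", "A10AC", "A10AC30"),
  stringsAsFactors = FALSE
)

# Split each patient's string on spaces and look each code up in the tree.
atc_to_index <- function(patient_atc, tree) {
  lapply(strsplit(patient_atc, " ", fixed = TRUE),
         function(codes) match(codes, tree$ATCCode))
}

ATC_code <- c("A01AA30 A01AB03", "A10AC30")
indices <- atc_to_index(ATC_code, toy_tree)
```

Codes that are absent from the tree come back as NA with match(), which makes unknown medications easy to spot before running any algorithm.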

Calculate the divergence between two distributions (the true distribution and the learned one)

Description

Calculate the divergence between two distributions (the true distribution and the learned one)

Usage

calculate_divergence(
  empirical_distribution,
  true_distribution,
  method = "TV",
  Filtered = FALSE
)

Arguments

`empirical_distribution`	A numeric vector of values representing the empirical distribution (return value of the DistributionApproximation function)
`true_distribution`	A numeric vector of values representing the true distribution, computed by the trueDistributionSizeTwoCocktail function
`method`	A string, either "TV" or "KL", to use the total variation distance or the Kullback-Leibler divergence respectively (default = "TV")
`Filtered`	Whether to use the filtered distribution rather than the unfiltered one (default = FALSE)

Value

A numeric value representing the divergence of the 2 distributions

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

estimated_score_distribution = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], Smax =2)
            
true_score_distribution = trueDistributionSizeTwoCocktail(ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], beta = 4)

divergence <- calculate_divergence(empirical_distribution = estimated_score_distribution,
                true_distribution = true_score_distribution)
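
For reference, the two divergences named by the method argument can be written in a few lines of base R. This is a sketch of the textbook definitions on hypothetical histogram counts; how the package normalizes its inputs, handles empty bins, or orders the arguments of the Kullback-Leibler divergence is not stated in this manual.

```r
# Normalize a vector of non-negative counts into probabilities.
normalize <- function(x) x / sum(x)

# Total variation distance: half the L1 distance between the two distributions.
total_variation <- function(p, q) 0.5 * sum(abs(normalize(p) - normalize(q)))

# Kullback-Leibler divergence KL(p || q); terms with p == 0 contribute 0.
kl_divergence <- function(p, q) {
  p <- normalize(p)
  q <- normalize(q)
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

learned <- c(2, 2, 0)  # hypothetical histogram counts
truth   <- c(1, 1, 2)
tv <- total_variation(learned, truth)
kl <- kl_divergence(learned, truth)
```

The total variation distance is symmetric and bounded by 1, while the Kullback-Leibler divergence is asymmetric and becomes infinite when the second distribution has an empty bin where the first does not.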

Clustering of the solutions of the genetic algorithm using the hclust algorithm

Description

Clustering of the solutions of the genetic algorithm using the hclust algorithm

Usage

clustering_genetic_algorithm(
  genetic_results,
  ATCtree,
  dist.normalize = TRUE,
  umap_config = NULL
)

Arguments

`genetic_results`	A list of cocktails in the form of integer vector
`ATCtree`	ATC tree with upper bound of the DFS
`dist.normalize`	Whether to normalize the distance so that it lies in [0;1]
`umap_config`	The configuration used to project the cocktails into a lower-dimensional space (umap::umap.defaults by default)

Value

A data frame containing the two UMAP coordinates (UMAP 1 and UMAP 2) of each cocktail in the plane, as well as the cluster number of each cocktail

Examples


 data("ATC_Tree_UpperBound_2024")

 results = GeneticAlgorithm(epochs = 10, nbIndividuals = 10, 
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)

 hclust_genetic_solution(genetic_results = results,
                 ATCtree = ATC_Tree_UpperBound_2024)

Function used to compute the Hypergeometric score on a list of cocktails

Description

Function used to compute the Hypergeometric score on a list of cocktails

Usage

compute_hypergeom_on_list(cocktails, ATCtree, observations, num_thread = 1L)

Arguments

`cocktails`	: A list containing cocktails in the form of vectors of integers (ATC indices)
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second), on which the risk distribution is computed
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1

Value

The hypergeometric score of each cocktail in the cocktails argument

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

cocktails = list(c(561, 904),
               c(1902, 4585))

Hypergeom_of_cocktails = compute_hypergeom_on_list(cocktails = cocktails,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              observations = FAERS_myopathy)
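
This manual does not define the hypergeometric score. One common construction, shown here purely as an assumption, is the negative base-10 logarithm of the upper tail of a hypergeometric distribution. The counts below are hypothetical and the package may compute its score differently.

```r
# Hypothetical counts (not taken from FAERS_myopathy).
n_patients <- 100  # patients in the dataset
n_adr      <- 10   # patients who experienced the ADR
n_exposed  <- 5    # patients who took the cocktail
n_both     <- 3    # patients who took the cocktail and experienced the ADR

# P(X >= n_both) when n_exposed patients are drawn at random without replacement.
tail_prob <- phyper(n_both - 1, n_adr, n_patients - n_adr, n_exposed,
                    lower.tail = FALSE)

# Assumed score: the smaller the tail probability, the higher the score.
hypergeom_score <- -log10(tail_prob)
```

Under this assumed definition, a cocktail whose ADR count is far above what random sampling would produce receives a large score.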

Function used to compute the Relative Risk on a list of cocktails

Description

Function used to compute the Relative Risk on a list of cocktails

Usage

compute_RR_on_list(cocktails, ATCtree, observations, num_thread = 1L)

Arguments

`cocktails`	: A list containing cocktails in the form of vectors of integers (ATC indices)
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second), on which the risk distribution is computed
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1

Value

The relative risk (RR) of each cocktail in the cocktails argument

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

cocktails = list(c(561, 904),
               c(1902, 4585))

RR_of_cocktails = compute_RR_on_list(cocktails = cocktails,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              observations = FAERS_myopathy)
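
As a rough illustration of what a relative risk measures, the sketch below computes it by hand on hypothetical observations. It treats a patient as exposed only when every drug index of the cocktail appears verbatim in their list; the package may additionally account for the ATC hierarchy, which this simplification ignores.

```r
# Hypothetical observations: drug indices per patient and an ADR flag.
patient_drugs <- list(c(1, 2), c(1, 2, 3), 2, 3, c(1, 2), 4)
patient_adr   <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE)

relative_risk <- function(cocktail, drugs, adr) {
  # A patient is exposed when they took every drug of the cocktail.
  exposed <- vapply(drugs, function(d) all(cocktail %in% d), logical(1))
  risk_exposed   <- mean(adr[exposed])
  risk_unexposed <- mean(adr[!exposed])
  risk_exposed / risk_unexposed
}

rr <- relative_risk(c(1, 2), patient_drugs, patient_adr)
```

Here two of the three exposed patients and one of the three unexposed patients experienced the ADR, so the ADR is twice as frequent among exposed patients.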

Function used in the reference article to compare diverse Disproportionality Analysis metrics

Description

Function used in the reference article to compare diverse Disproportionality Analysis metrics

Usage

computeMetrics_size2(CocktailList, ATCtree, observations, num_thread = 1L)

Arguments

`CocktailList`	: A list of cocktails on which the disproportionality analysis (DA) metrics should be computed
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second), on which the risk distribution is computed
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1

Value

Several DA metrics computed for each cocktail in CocktailList

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

cocktails = list(c(561, 904),
               c(1902, 4585)) # only size 2 cocktails allowed for this function

scores_of_cocktails = computeMetrics_size2(CocktailList = cocktails,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              observations = FAERS_myopathy[1:100,])
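
This manual does not list which metrics are returned. As an assumption-laden sketch, two widely used disproportionality measures can be derived from a 2x2 contingency table as shown below; the helper name, the argument names and the counts are all hypothetical.

```r
# 2x2 contingency counts for one cocktail (hypothetical values):
#   n11: cocktail and ADR        n10: cocktail, no ADR
#   n01: no cocktail, ADR        n00: no cocktail, no ADR
da_metrics <- function(n11, n10, n01, n00) {
  c(
    PRR = (n11 / (n11 + n10)) / (n01 / (n01 + n00)),  # proportional reporting ratio
    ROR = (n11 * n00) / (n10 * n01)                   # reporting odds ratio
  )
}

metrics <- da_metrics(n11 = 2, n10 = 1, n01 = 1, n00 = 2)
```

Both measures equal 1 when the ADR is reported equally often with and without the cocktail, and grow as reports concentrate on the cocktail.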

Convert genetic algorithm results stored in a .csv file to a data structure that can be used by the clustering algorithm

Description

Convert genetic algorithm results stored in a .csv file to a data structure that can be used by the clustering algorithm

Usage

csv_to_population(ATC_name, filename, sep = ";")

Arguments

`ATC_name`	the ATC_name column of the ATC tree
`filename`	Name of the file where the results are located
`sep`	the separator to use when opening the csv file (';' by default)

Value

An R List that can be used by other algorithms (e.g. clustering algorithm)

Examples


  data("ATC_Tree_UpperBound_2024")
  genetic_results = csv_to_population(ATC_Tree_UpperBound_2024$Name,
                    "path/to/output.csv")

The MCMC method that runs the random walk on a single cocktail in order to estimate the distribution of score among cocktails of size Smax.

Description

The MCMC method that runs the random walk on a single cocktail in order to estimate the distribution of score among cocktails of size Smax.

Usage

DistributionApproximation(
  epochs,
  ATCtree,
  observations,
  temperature = 1L,
  nbResults = 5L,
  Smax = 2L,
  p_type1 = 0.01,
  beta = 4L,
  max_score = 500L,
  num_thread = 1L,
  verbose = FALSE
)

Arguments

`epochs`	: Number of steps of the MCMC algorithm
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root; see the GitHub repository for an example)
`observations`	: Real observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second)
`temperature`	: Starting temperature (denoted T in the article). Default is 1
`nbResults`	: Number of returned solutions (the cocktails of size Smax with the best observed scores during the run). Default is 5
`Smax`	: Size of the cocktails whose score distribution is approximated
`p_type1`	: Probability of applying a type 1 mutation; the probability of applying a type 2 mutation is then 1 - p_type1. Must be in [0;1]. Default is 0.01
`beta`	: Minimum number of patients that must have taken a cocktail for its risk to be taken into account in FilteredDistribution. Default is 4
`max_score`	: Maximum value the score can take. Scores greater than max_score are added to the distribution as the value max_score. Default is 500
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1
`verbose`	: Whether to output a summary. Default is FALSE

Value

If no problem occurs, returns a List containing:

ScoreDistribution: the distribution of the scores, as an array in which the cell at a given index counts the sampled risks equal to (index-1)/10
Outstanding_score: an array of the scores greater than max_score
Best_cocktails: the nbResults best cocktails encountered during the run
Best_scores: the scores corresponding to Best_cocktails
FilteredDistribution: the distribution of the scores of cocktails taken by at least beta patients
Best_cocktails_beta: the nbResults best cocktails taken by at least beta patients encountered during the run
Best_scores_beta: the scores corresponding to Best_cocktails_beta
cocktailSize: the Smax parameter used during the run

Otherwise, the list is empty.

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

estimation = DistributionApproximation(epochs = 10, ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)
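
To give an intuition for the random walk, here is a generic Metropolis-Hastings sketch in base R. It is not the package's implementation: the state space (integers 1 to 10), the score function and the symmetric one-step proposal are hypothetical stand-ins for cocktails, their scores and the mutation operators.

```r
# Hypothetical stand-in for the cocktail score: highest at state 7.
score <- function(state) -abs(state - 7)

metropolis_walk <- function(epochs, temperature = 1, start = 1L, n_states = 10L) {
  chain <- integer(epochs)
  current <- start
  for (step in seq_len(epochs)) {
    # Symmetric proposal: move one state left or right, wrapping around the ends.
    proposal <- (current - 1L + sample(c(-1L, 1L), 1L)) %% n_states + 1L
    # Accept with probability min(1, exp((score' - score) / T)).
    accept_prob <- min(1, exp((score(proposal) - score(current)) / temperature))
    if (runif(1) < accept_prob) current <- proposal
    chain[step] <- current
  }
  chain
}

set.seed(42)
chain <- metropolis_walk(epochs = 2000)
visits <- tabulate(chain, nbins = 10)  # empirical distribution of visited states
```

A higher temperature flattens the acceptance probabilities, so the walk explores low-scoring states more often; a lower temperature keeps it close to the high-scoring ones.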

FAERS Myopathy Dataset

Description

Example dataset representing drug intake and adverse event reports from FAERS. This dataset is provided to demonstrate the functionality of genetic and MCMC algorithms in the package.

Usage

FAERS_myopathy

Format

A data frame with 2 columns:

patientATC: Drug intake for each patient as a vector of ATC tree indices
patientADR: Indicates if the patient experienced myopathy as an adverse event

Source

U.S. Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS)

Genetic algorithm that searches for the riskiest cocktails (those that maximize the fitness function, here the hypergeometric score)

Description

Genetic algorithm that searches for the riskiest cocktails (those that maximize the fitness function, here the hypergeometric score)

Usage

GeneticAlgorithm(
  epochs,
  nbIndividuals,
  ATCtree,
  observations,
  num_thread = 1L,
  diversity = FALSE,
  p_crossover = 0.8,
  p_mutation = 0.01,
  nbElite = 0L,
  tournamentSize = 2L,
  alpha = 1,
  summary = TRUE
)

Arguments

`epochs`	: Number of steps of the algorithm
`nbIndividuals`	: Size of the population
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Real observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second)
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1
`diversity`	: Enable the diversity mechanism of the algorithm, which favors diverse cocktails in the population. Default is FALSE
`p_crossover`	: Probability of applying a crossover during the crossover phase. Default is 0.8
`p_mutation`	: Probability of applying a mutation after the crossover phase. Default is 0.01
`nbElite`	: Number of best individuals kept from one generation to the next. Default is 0
`tournamentSize`	: Size of the tournament (the best individual among tournamentSize sampled individuals is selected). Default is 2
`alpha`	: When a type 1 mutation is applied, the chance of adding a drug is (alpha / size of the cocktail). Default is 1
`summary`	: Whether to print a summary of the population at each step. Default is TRUE

Value

If no problem occurs, returns a List containing:

meanFitnesses: the mean score of the population at each epoch of the algorithm
BestFitnesses: the best score of the population at each epoch of the algorithm
FinalPopulation: the final population of the algorithm when it finishes (medications and corresponding scores)

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

results = GeneticAlgorithm(epochs = 10, nbIndividuals = 10, 
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)
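
The tournament selection controlled by tournamentSize can be sketched generically as follows. The fitness values are hypothetical, and sampling contenders without replacement is an assumption; the package may sample differently.

```r
# Pick one parent: sample tournament_size individuals and keep the fittest.
tournament_select <- function(fitness, tournament_size = 2L) {
  contenders <- sample.int(length(fitness), tournament_size)
  contenders[which.max(fitness[contenders])]
}

set.seed(1)
fitness <- c(0.4, 2.1, 1.3, 0.9)  # hypothetical scores of four cocktails
parents <- replicate(6, tournament_select(fitness))
```

Larger tournaments increase selection pressure: when the tournament covers the whole population, the fittest individual is always selected.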

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in an arbitrary cocktail list

Description

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in an arbitrary cocktail list

Usage

get_dissimilarity_from_cocktail_list(cocktails, ATCtree, normalization = TRUE)

Arguments

`cocktails`	: A list of cocktails in the form of vectors of integers
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`normalization`	: Whether to keep the distances between cocktails in the range [0;1]

Value

The square matrix of distances between cocktails

Examples


data("ATC_Tree_UpperBound_2024")

cocktails = list(c(561, 904),
               c(1902, 4585))

distance_matrix = get_dissimilarity_from_cocktail_list(cocktails = cocktails,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              normalization = TRUE)

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in the genetic_results list.

Description

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in the genetic_results list.

Usage

get_dissimilarity_from_genetic_results(genetic_results, ATCtree, normalization)

Arguments

`genetic_results`	The list returned by the genetic algorithm
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`normalization`	: Whether to keep the distances between cocktails in the range [0;1]

Value

The square matrix of distances between cocktails

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")
 
 genetic_results = GeneticAlgorithm(epochs = 10, nbIndividuals = 10,
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)
 distance_matrix = get_dissimilarity_from_genetic_results(genetic_results = genetic_results,
                        ATCtree = ATC_Tree_UpperBound_2024, normalization = TRUE)

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in the csv file containing the results of the genetic algorithm

Description

Compute the square matrix of distances between cocktails, where entry (i,j) of the matrix is the distance between cocktails i and j in the csv file containing the results of the genetic algorithm

Usage

get_dissimilarity_from_txt_file(filename, ATCtree, normalization = TRUE)

Arguments

`filename`	: The name of the file returned by the print_csv function
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`normalization`	: Whether to keep the distances between cocktails in the range [0;1]

Value

The square matrix of distances between cocktails

Examples


 data("ATC_Tree_UpperBound_2024")
 
 distance_matrix = get_dissimilarity_from_txt_file(filename = '250e_700ind_0.2mr_0ne_2alpha.txt',
                        ATCtree = ATC_Tree_UpperBound_2024, normalization = TRUE)

Clustering of the solutions of the genetic algorithm using the hclust algorithm

Description

Clustering of the solutions of the genetic algorithm using the hclust algorithm

Usage

hclust_genetic_solution(
  genetic_results,
  ATCtree,
  dist.normalize = TRUE,
  method = "complete"
)

Arguments

`genetic_results`	The return value of the genetic algorithm
`ATCtree`	ATC tree with upper bound of the DFS
`dist.normalize`	Whether to normalize the distance so that it belongs to [0;1]
`method`	(from hclust function) the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).

Value

the hierarchical clustering of the results of the genetic algorithm

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

results = GeneticAlgorithm(epochs = 10, nbIndividuals = 10, 
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)

hclust_genetic_solution(genetic_results = results,
                 ATCtree = ATC_Tree_UpperBound_2024)
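
To show what the method argument controls, the sketch below runs stats::hclust directly on a hypothetical dissimilarity matrix. The matrix values are made up for illustration; in the package, the distances come from the cocktails and the ATC tree.

```r
# Hypothetical 4 x 4 dissimilarity matrix with two obvious groups.
distances <- matrix(c(0.00, 0.10, 0.90, 0.80,
                      0.10, 0.00, 0.85, 0.90,
                      0.90, 0.85, 0.00, 0.20,
                      0.80, 0.90, 0.20, 0.00),
                    nrow = 4, byrow = TRUE)

# hclust expects a "dist" object; method selects the agglomeration rule.
tree <- hclust(as.dist(distances), method = "complete")

# Cut the dendrogram into two clusters.
clusters <- cutree(tree, k = 2)
```

With complete linkage, items 1 and 2 merge first, then items 3 and 4, and the two groups only join at the largest pairwise distance.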

Convert the histogram returned by the DistributionApproximation function to a vector of real-valued scores (usable, for example, in a statistical test)

Description

Convert the histogram returned by the DistributionApproximation function to a vector of real-valued scores (usable, for example, in a statistical test)

Usage

histogramToDitribution(vec)

Arguments

`vec`	: Distribution returned by the DistributionApproximation function

Value

A vector containing the risks sampled during the MCMC algorithm

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

 DistributionApproximationResults = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy)
  histogramToDitribution(DistributionApproximationResults$ScoreDistribution)
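
Conceptually, the conversion expands bin counts into one value per sampled score. The base-R sketch below assumes, following the ScoreDistribution description, that the cell at a given index counts the scores equal to (index-1)/10; the helper name is hypothetical and the package function may differ in detail.

```r
# Expand histogram counts into one value per observation.
# Assumes cell i counts the scores equal to (i - 1) / 10.
histogram_to_values <- function(counts) {
  rep((seq_along(counts) - 1) / 10, times = counts)
}

values <- histogram_to_values(c(2, 0, 1))
```

Here the histogram holds two scores of 0 and one score of 0.2, so the expanded vector has three entries and can be passed to functions that expect raw samples.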

Convenience function for trying different sets of parameters for the genetic algorithm. It runs every possible combination of mutation_rate, nb_elite and alphas nb_test_desired times. For each set of parameters, the results are saved in a file named after that parameter set. The results of all runs can then be gathered into a single csv file with the print_csv function, by specifying the names of the files to process and the number of runs performed on each parameter set.

Description

Convenience function for trying different sets of parameters for the genetic algorithm. It runs every possible combination of mutation_rate, nb_elite and alphas nb_test_desired times. For each set of parameters, the results are saved in a file named after that parameter set. The results of all runs can then be gathered into a single csv file with the print_csv function, by specifying the names of the files to process and the number of runs performed on each parameter set.

Usage

hyperparam_test_genetic_algorithm(
  epochs,
  nb_individuals,
  ATCtree,
  observations,
  nb_test_desired,
  mutation_rate,
  nb_elite,
  alphas,
  path = "./",
  num_thread = 1L
)

Arguments

`epochs`	: Number of epochs of the genetic algorithm
`nb_individuals`	: Size of the population in the genetic algorithm
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second), on which the risk distribution is computed
`nb_test_desired`	: Number of genetic algorithm runs for each set of parameters
`mutation_rate`	: A vector containing each mutation_rate to be tested
`nb_elite`	: A vector containing each nb_elite to be tested
`alphas`	: A vector containing each alpha to be tested
`path`	: The path where the resulting files should be written
`num_thread`	: Number of threads to run in parallel if OpenMP is available. Default is 1

Value

No return value. The function writes the results of the genetic algorithm runs to files, in a format supported by the print_csv and p_value_csv_file functions. The files are written to path, which is the current directory by default.

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

# parameter values to test
mutation_rate = c(.1,.2,.3)
nb_elite = c(0,1,2)
alphas = c(0.5,1,2)
hyperparam_test_genetic_algorithm(epochs = 2, nb_individuals = 2,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              observations = FAERS_myopathy,
                              nb_test_desired = 5, mutation_rate = mutation_rate,
                              nb_elite = nb_elite, alphas = alphas)
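
The number of runs grows quickly with the size of each parameter vector. The sketch below enumerates the combinations from the example with expand.grid. The result_filename helper and its naming pattern are hypothetical, loosely modeled on the example file name used for get_dissimilarity_from_txt_file; the actual names written by the package are not specified here.

```r
mutation_rate <- c(0.1, 0.2, 0.3)
nb_elite      <- c(0, 1, 2)
alphas        <- c(0.5, 1, 2)

# One row per parameter combination: 3 * 3 * 3 = 27 rows.
grid <- expand.grid(mutation_rate = mutation_rate,
                    nb_elite = nb_elite,
                    alpha = alphas)

# Hypothetical naming pattern (an assumption, not the package's documented format).
result_filename <- function(epochs, nb_individuals, mutation_rate, nb_elite, alpha) {
  paste0(epochs, "e_", nb_individuals, "ind_", mutation_rate, "mr_",
         nb_elite, "ne_", alpha, "alpha.txt")
}

file_names <- result_filename(2, 2, grid$mutation_rate, grid$nb_elite, grid$alpha)

# Total number of genetic algorithm runs for nb_test_desired = 5.
total_runs <- nrow(grid) * 5
```

With three values per parameter and five repetitions, the example therefore launches 135 runs, which is worth keeping in mind when choosing epochs and nb_individuals.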

Convert integer cocktails (such as those returned by the DistributionApproximation function) to string cocktails in order to make them more readable

Description

Convert integer cocktails (such as those returned by the DistributionApproximation function) to string cocktails in order to make them more readable

Usage

int_cocktail_to_string_cocktail(cocktails, ATC_name)

Arguments

`cocktails`	cocktails vector to be converted (index in the ATC tree)
`ATC_name`	The ATC_name column of the ATC tree

Value

The names of the drugs in each integer cocktail of cocktails

Examples


  data("ATC_Tree_UpperBound_2024")
  int_list = list(c(561, 904),
               c(1902, 4585))
  int_cocktail_to_string_cocktail(int_list, ATC_Tree_UpperBound_2024$Name)

Convert the outstanding scores (Outstanding_score) returned by the MCMC algorithm to the ScoreDistribution format

Description

Convert the outstanding scores (Outstanding_score) returned by the MCMC algorithm to the ScoreDistribution format

Usage

OutsandingScoreToDistribution(outstanding_score, max_score)

Arguments

`outstanding_score`	: Outstanding_score returned by the MCMC algorithm, to be converted to the ScoreDistribution format
`max_score`	: max_score parameter used during the MCMC algorithm

Value

outstanding_score in a format compatible with MCMC algorithm output

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")

  DistributionApproximationResults = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy)
  OutsandingScoreToDistribution(DistributionApproximationResults$Outstanding_score, max_score = 500)

Compute the p-value of each cocktail in a list of cocktails

Description

Compute the p-value of each cocktail in a list of cocktails

Usage

p_value_cocktails(
  distribution_outputs,
  cocktails,
  ATCtree,
  observations,
  num_thread = 1L,
  filtred_distribution = FALSE
)

Arguments

`distribution_outputs`	A list of score distributions for cocktails of different sizes, so that p-values can be computed for multiple cocktail sizes
`cocktails`	A list containing cocktails in the form of vectors of integers (ATC indices)
`ATCtree`	ATC tree with the upper bound of the DFS (without the root)
`observations`	Observations of the AE based on each patient's medications (a data frame with the medications in the first column and the ADR (boolean) in the second), on which the risk distribution is computed
`num_thread`	Number of threads to run in parallel if OpenMP is available. Default is 1
`filtred_distribution`	Whether to compute the p-values using the filtered distribution rather than the unfiltered one. Default is FALSE

Value

A numeric vector containing the p-value of each input cocktail, computed from the distribution_outputs list.

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")
 
  DistributionApproximationResults_size2 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 2)
            
  DistributionApproximationResults_size3 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 3)
            
  score_distribution_list = list(DistributionApproximationResults_size2,
                              DistributionApproximationResults_size3)

  cocktails = list(c(561, 904),
               c(1902, 4585))
 
  p_value_cocktails(score_distribution_list, cocktails, ATC_Tree_UpperBound_2024,
                    FAERS_myopathy)

Compute the p-value of each cocktail in a CSV file produced by the genetic algorithm

Description

Computes the p-value of each cocktail listed in a CSV file that is an output of the genetic algorithm.

Usage

p_value_csv_file(
  distribution_outputs,
  filename,
  filtred_distribution = FALSE,
  sep = ";"
)

Arguments

`distribution_outputs`	A list of score distributions for cocktails of different sizes, so that p-values can be computed for several cocktail sizes
`filename`	The name of the .csv file containing the genetic algorithm output
`filtred_distribution`	Whether the p-values are computed from the filtered distribution rather than the unfiltered one (FALSE by default, as in the usage above)
`sep`	The separator used in the CSV file (";" by default)

Value

A numeric vector containing the p-value of each cocktail in the CSV file filename, computed from the distributions in the distribution_outputs list.

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")

  DistributionApproximationResults_size2 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 2)
            
  DistributionApproximationResults_size3 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 3)
            
  score_distribution_list = list(DistributionApproximationResults_size2,
                              DistributionApproximationResults_size3)
  p_value_csv_file(score_distribution_list, "path/to/output.csv")

Compute the p-value of each cocktail in an output of the genetic algorithm

Description

Computes the p-value of each cocktail contained in an output of the genetic algorithm.

Usage

p_value_genetic_results(
  distribution_outputs,
  genetic_results,
  filtred_distribution = FALSE
)

Arguments

`distribution_outputs`	A list of score distributions for cocktails of different sizes, so that p-values can be computed for several cocktail sizes
`genetic_results`	Output of the genetic algorithm
`filtred_distribution`	Whether the p-values are computed from the filtered distribution rather than the unfiltered one (FALSE by default, as in the usage above)

Value

A numeric vector containing the p-value of each cocktail in the genetic algorithm results (genetic_results), computed from the distributions in the distribution_outputs list.

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")
  DistributionApproximationResults_size2 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 2)
            
  DistributionApproximationResults_size3 = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024, observations = FAERS_myopathy, Smax = 3)
            
  score_distribution_list = list(DistributionApproximationResults_size2,
                              DistributionApproximationResults_size3)
  genetic_results = GeneticAlgorithm(epochs = 10, nbIndividuals = 20, 
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)
  p_value_genetic_results(score_distribution_list, genetic_results)

Calculate the p-value of sampled values

Description

Calculates the empirical p-value of one or more sampled values (scores to be tested) against an empirical score distribution.

Usage

p_value_on_sampled(
  empirical_distribution,
  sampled_values,
  isFiltered = FALSE,
  includeZeroValue = FALSE
)

Arguments

`empirical_distribution`	A numeric vector of values representing the empirical distribution (return value of the DistributionApproximation function)
`sampled_values`	A scalar or a numeric vector holding the sampled values (the scores to be tested)
`isFiltered`	A boolean indicating whether to use the filtered distribution or the distribution as is (FALSE by default)
`includeZeroValue`	A boolean indicating whether the null score is taken into account (FALSE by default)

Value

A numeric value representing the empirical p-value
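
As an illustration of what an empirical p-value means here, the following base R sketch computes the share of a score histogram that lies at or above a tested score. It is not the package's implementation: the helper empirical_p_value, its include_zero argument and the toy counts are hypothetical, and it assumes the histogram layout described for ScoreDistribution (cell i counts the scores equal to (i - 1) / 10) and a "greater than or equal" comparison.

```r
# Hypothetical helper (not part of emcAdr): empirical p-value from a histogram
# of counts, assuming cell i counts the cocktails whose score is (i - 1) / 10.
empirical_p_value <- function(counts, score, include_zero = FALSE) {
  bin_scores <- (seq_along(counts) - 1) / 10
  # Optionally drop the cell that holds the null (zero) scores.
  keep <- if (include_zero) rep(TRUE, length(counts)) else bin_scores > 0
  # Share of the retained counts lying at or above each tested score.
  vapply(score, function(s) {
    sum(counts[keep & bin_scores >= s]) / sum(counts[keep])
  }, numeric(1))
}

toy_counts <- c(50, 30, 15, 4, 1)  # toy histogram for scores 0, 0.1, 0.2, 0.3, 0.4
p_values <- empirical_p_value(toy_counts, c(0.2, 0.4))
print(p_values)
```

A small p-value indicates that few cocktails in the reference distribution score at least as high as the tested one.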

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

cocktails = list(c(561, 904),
               c(1902, 4585))
               
estimated_score_distribution = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)
            
Hypergeom_of_cocktails = compute_hypergeom_on_list(cocktails = cocktails,
                              ATCtree = ATC_Tree_UpperBound_2024, 
                              observations = FAERS_myopathy)
            
p_value = p_value_on_sampled(empirical_distribution = estimated_score_distribution,
      sampled_values = Hypergeom_of_cocktails)

Plot the evolution of the mean and the best value of the population used by the GeneticAlgorithm

Description

Plot the evolution of the mean and the best value of the population used by the GeneticAlgorithm

Usage

plot_evolution(
  list,
  mean_color = "#F2A900",
  best_color = "#008080",
  xlab = "Epochs",
  ylab = "Score"
)

Arguments

`list`	A list with 2 elements returned by the GeneticAlgorithm: "mean" and "best", containing the numeric vectors representing the mean and best fitness of the population
`mean_color`	A string specifying the color of the mean values
`best_color`	A string specifying the color of the best values
`xlab`	A string specifying the label for the x-axis
`ylab`	A string specifying the label for the y-axis

Value

No return value; called for its side effect of plotting the evolution of the genetic algorithm results (mean and best score at each epoch).
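
To make the expected input shape concrete, the sketch below builds a toy list with "mean" and "best" elements and draws a comparable figure with base graphics. This is only an approximation under stated assumptions: the toy values are invented, and the package's own plot may be drawn with a different plotting library and layout.

```r
# Toy stand-in for the list described above: one value per epoch.
toy_results <- list(mean = c(1.0, 1.8, 2.5, 2.9),
                    best = c(2.0, 3.1, 3.1, 4.2))
epochs <- seq_along(toy_results$mean)

# Draw the best score first, then overlay the population mean.
plot(epochs, toy_results$best, type = "l", col = "#008080",
     ylim = range(c(toy_results$mean, toy_results$best)),
     xlab = "Epochs", ylab = "Score")
lines(epochs, toy_results$mean, col = "#F2A900")
legend("bottomright", legend = c("best", "mean"),
       col = c("#008080", "#F2A900"), lty = 1)
```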

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

results = GeneticAlgorithm(epochs = 10, nbIndividuals = 10, 
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)

plot_evolution(list = results)

Plot the histogram of the approximation of the RR distribution

Description

Plot the histogram of the approximation of the RR distribution

Usage

plot_frequency(
  estimated,
  sqrt = FALSE,
  binwidth = 0.1,
  hist_color = "#69b3a2",
  density_color = "#FF5733",
  xlab = "Score"
)

Arguments

`estimated`	The ScoreDistribution element in the list returned by the DistributionApproximation function
`sqrt`	A boolean specifying whether to normalize the estimated distribution; recommended for large random walks
`binwidth`	The width of the histogram bins
`hist_color`	The fill color for the histogram bars
`density_color`	The color for the density curve
`xlab`	Label of X axis

Value

No return value; called for its side effect of plotting the histogram of the estimated distribution (estimated).

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

estimation = DistributionApproximation(epochs = 10, ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy)

plot_frequency(estimated = estimation$ScoreDistribution)

Print every cocktail found by the genetic algorithm when run through the hyperparam_test_genetic_algorithm function

Description

Prints every cocktail found by the genetic algorithm when it is used with the hyperparam_test_genetic_algorithm function. The solutions found in each file are condensed by collapsing similar cocktails into a single row per cocktail.

Usage

print_csv(
  input_filenames,
  observations,
  repetition,
  ATCtree,
  csv_filename = "solutions.csv"
)

Arguments

`input_filenames`	: A list containing the file names of the hyperparam_test_genetic_algorithm output files
`observations`	: Observations of the adverse event (AE) for each patient: a data frame with the patient's medications in the first column and the ADR (boolean) in the second, on which the risk distribution is computed
`repetition`	: The value of the nb_test_desired parameter used in the hyperparam test function
`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`csv_filename`	: Name of the output file ("solutions.csv" by default)

Value

No return value; processes the genetic algorithm output stored in the files produced by hyperparam_test_genetic_algorithm and writes a summary CSV file. The CSV file is written to the current directory under the name given by csv_filename (solutions.csv by default).
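
The idea of collapsing repeated cocktails into one row per cocktail can be sketched in base R as follows. This is a simplified illustration, not the package's code: it treats two cocktails as the same only when they contain exactly the same ATC indices, and the toy list found and the column name n_found are hypothetical.

```r
# Toy list of cocktails gathered over several runs (integer ATC indices).
found <- list(c(561, 904), c(904, 561), c(1902, 4585), c(561, 904))

# Build an order-independent key so that c(561, 904) and c(904, 561) match.
keys <- vapply(found, function(ck) paste(sort(ck), collapse = ":"),
               character(1))

# One row per distinct cocktail, with the number of times it was found.
collapsed <- as.data.frame(table(cocktail = keys),
                           responseName = "n_found",
                           stringsAsFactors = FALSE)
print(collapsed)
```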

Examples


 data("ATC_Tree_UpperBound_2024")
 data("FAERS_myopathy")
 files = c('250e_700ind_0.2mr_0ne_2alpha.txt') # results of hyperparam_test_genetic_algorithm

 print_csv(input_filenames = files, observations = FAERS_myopathy,
          repetition = 5, ATCtree = ATC_Tree_UpperBound_2024)

Make a quantile-quantile plot from the output of the MCMC algorithm (DistributionApproximation) and the output of the algorithm that exhaustively calculates the distribution

Description

Makes a quantile-quantile plot that compares the output of the MCMC algorithm (DistributionApproximation) with the output of the algorithm that exhaustively calculates the distribution.

Usage

qq_plot_output(estimated, true, filtered = FALSE, color = "steelblue")

Arguments

`estimated`	Object returned by the DistributionApproximation function
`true`	Object returned by either the DistributionApproximation function or a true distribution computation function
`filtered`	Whether to use the filtered distribution estimate (number of patients taking the cocktail > beta) instead of the classic one
`color`	The color of the dashed line of the QQ plot

Value

No return value; called for its side effect of drawing the quantile-quantile plot of the estimated distribution (estimated) against the true distribution (true).
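
For readers unfamiliar with quantile-quantile plots built from histograms, here is a minimal base R sketch. It assumes both distributions are count histograms in which cell i holds the number of scores equal to (i - 1) / 10; the toy counts and the helper to_scores are hypothetical, and the package's own plot may differ in how it derives quantiles.

```r
# Toy count histograms (cell i <-> score (i - 1) / 10); values are invented.
est_counts  <- c(100, 50, 30, 15, 5)
true_counts <- c(110, 45, 28, 12, 5)

# Expand a histogram into the individual scores it represents.
to_scores <- function(counts) rep((seq_along(counts) - 1) / 10, times = counts)

# Compare matching quantiles of the two samples.
probs  <- seq(0.05, 0.95, by = 0.05)
q_est  <- quantile(to_scores(est_counts), probs, names = FALSE)
q_true <- quantile(to_scores(true_counts), probs, names = FALSE)

plot(q_true, q_est, xlab = "True quantiles", ylab = "Estimated quantiles")
abline(0, 1, lty = 2, col = "steelblue")  # points on this line mean agreement
```

Points close to the dashed identity line suggest that the estimated distribution agrees with the exhaustive one.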

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

estimated_score_distribution = DistributionApproximation(epochs = 10,
            ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], Smax =2)
            
true_score_distribution = trueDistributionSizeTwoCocktail(ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], beta = 4)

qq_plot_output(estimated = estimated_score_distribution,
                true = true_score_distribution)

Convert a string vector of drug cocktails of the form "drug1:drug2" to vectors of ATC tree indices, e.g. c(ATC_index(drug1), ATC_index(drug2))

Description

Converts a string vector of drug cocktails of the form "drug1:drug2" to vectors of ATC tree indices, e.g. c(ATC_index(drug1), ATC_index(drug2)).

Usage

string_list_to_int_cocktails(ATC_name, lines)

Arguments

`ATC_name`	The column of ATC node names from the ATC tree (for example ATC_Tree_UpperBound_2024$Name)
`lines`	A string vector of drug cocktails in the form "drug1:drug2:...:drug_n"

Value

An R list of cocktails that can be used by other algorithms (e.g. the clustering algorithm)
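
The conversion can be pictured with the following base R sketch, which splits each string on ":" and looks up every drug name in a name vector. It is a simplified stand-in rather than the package's code: the helper to_index and the toy name vector are hypothetical, and the positions it returns are plain 1-based positions in that vector, which may not match the index convention used by the package.

```r
# Toy stand-in for the name column of the ATC tree.
atc_names <- c("alimentary tract", "metformin", "prasugrel", "nervous system")
lines <- c("metformin:prasugrel", "nervous system:metformin")

# Hypothetical helper: split on ":" and look up each name's position.
to_index <- function(atc_names, lines) {
  lapply(strsplit(lines, ":", fixed = TRUE),
         function(drugs) match(drugs, atc_names))
}

cocktail_indices <- to_index(atc_names, lines)
print(cocktail_indices)
```

Names that do not appear in the name vector yield NA with match, so a real workflow would probably want to check for them before going further.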

Examples


  data("ATC_Tree_UpperBound_2024")
  string_list = c('hmg coa reductase inhibitors:nervous system',
                  'metformin:prasugrel')
  string_list_to_int_cocktails(ATC_Tree_UpperBound_2024$Name,
                              string_list)

The true distribution of the score over every single node of the ATC tree

Description

Computes the true distribution of the score over every single node of the ATC tree.

Usage

trueDistributionDrugs(
  ATCtree,
  observations,
  beta,
  max_score = 1000L,
  nbResults = 100L,
  num_thread = 1L
)

Arguments

`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the adverse event (AE) for each patient: a data frame with the patient's medications in the first column and the ADR (boolean) in the second, on which the risk distribution is computed
`beta`	: Minimum number of patients taking a cocktail for it to be considered in the beta score distribution
`max_score`	: Maximum value the score can take. Scores greater than this value are added to the distribution as the value max_score. Default is 1000
`nbResults`	: Number of returned solutions (cocktails with the best observed score during the run), 100 by default
`num_thread`	: Number of threads to run in parallel if OpenMP is available, 1 by default

Value

A list containing:

ScoreDistribution: The distribution of the score, as an array in which each cell holds the number of risks (scores) equal to (index - 1) / 10
Filtered_score_distribution: The distribution of the score for cocktails taken by at least beta patients
Outstanding_score: An array of the scores greater than max_score
Best_cocktails: The nbResults best cocktails encountered during the run
Best_cocktails_beta: The nbResults best cocktails taken by at least beta patients encountered during the run
Best_scores: The scores corresponding to Best_cocktails
Best_scores_beta: The scores corresponding to Best_cocktails_beta
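
The index-to-score convention of ScoreDistribution can be read back with a few lines of base R. The sketch below uses an invented toy histogram and only assumes the mapping stated above (cell index i corresponds to a score of (i - 1) / 10); nothing in it calls the package.

```r
# Toy histogram in the ScoreDistribution layout; the counts are invented.
toy_distribution <- c(120, 40, 25, 10, 5)

# Score represented by each cell: (index - 1) / 10.
cell_scores <- (seq_along(toy_distribution) - 1) / 10

# Summaries computed directly from the counts.
mean_score <- sum(cell_scores * toy_distribution) / sum(toy_distribution)

# Expanding the histogram gives one score per counted cocktail.
expanded_scores <- rep(cell_scores, times = toy_distribution)
print(mean_score)
```

Working on the expanded scores makes it possible to reuse standard tools such as quantile or hist on the distribution.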

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

size_1_score_distribution = trueDistributionDrugs(ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], beta = 4)

The true distribution of the score over all size-two cocktails

Description

Computes the true distribution of the score over all size-two cocktails.

Usage

trueDistributionSizeTwoCocktail(
  ATCtree,
  observations,
  beta,
  max_score = 100L,
  nbResults = 100L,
  num_thread = 1L
)

Arguments

`ATCtree`	: ATC tree with the upper bound of the DFS (without the root)
`observations`	: Observations of the adverse event (AE) for each patient: a data frame with the patient's medications in the first column and the ADR (boolean) in the second, on which the risk distribution is computed
`beta`	: Minimum number of patients taking a cocktail for it to be considered in the beta score distribution
`max_score`	: Maximum value the score can take. Scores greater than this value are added to the distribution as the value max_score. Default is 100, as in the usage above
`nbResults`	: Number of returned solutions (cocktails with the best observed score during the run), 100 by default
`num_thread`	: Number of threads to run in parallel if OpenMP is available, 1 by default

Value

Examples


data("ATC_Tree_UpperBound_2024")
data("FAERS_myopathy")

size_2_score_distribution = trueDistributionSizeTwoCocktail(ATCtree = ATC_Tree_UpperBound_2024,
            observations = FAERS_myopathy[1:100,], beta = 4)

Package 'emcAdr'

Help Index

Evolutionary Version of the Metropolis-Hastings Algorithm
ATC Tree Upper Bound 2024
Convert ATC Code for each patients to the corresponding DFS number of the ATC tree
Calculate the divergence between 2 distributions (the true Distribution and the learned one)
Clustering of the solutions of the genetic algorithm using the hclust algorithm
Function used to compute the Hypergeometric score on a list of cocktails
Function used to compute the Relative Risk on a list of cocktails
Function used in the reference article to compare diverse Disproportionality Analysis metrics
Function used to convert your genetic algorithm results that are stored into a .csv file to a Data structure that can be used by the clustering algorithm
The MCMC method that runs the random walk on a single cocktail in order to estimate the distribution of score among cocktails of size Smax.
FAERS Myopathy Dataset
Genetic algorithm, trying to reach riskiest cocktails (the ones which maximize the fitness function, Hypergeometric score in our case)
Recover the square matrix of distance between cocktails where the index (i,j) of the matrix is the distance between cocktails i and j in an arbitrary cocktail list
Recover the square matrix of distance between cocktails where the index (i,j) of the matrix is the distance between cocktails i and j in the genetic_results list.

Description

Usage

Arguments