Title: | Multi-Purpose Core Subset Selection |
---|---|
Description: | Core Hunter is a tool to sample diverse, representative subsets from large germplasm collections, with minimum redundancy. Such so-called core collections have applications in plant breeding and genetic resource management in general. Core Hunter can construct cores based on genetic marker data, phenotypic traits or precomputed distance matrices, optimizing one of many provided evaluation measures depending on the precise purpose of the core (e.g. high diversity, representativeness, or allelic richness). In addition, multiple measures can be simultaneously optimized as part of a weighted index to bring the different perspectives closer together. The Core Hunter library is implemented in Java 8 as an open source project (see <http://www.corehunter.org>). |
Authors: | Herman De Beukelaer [aut, cre], Guy Davenport [aut], Veerle Fack [ths] |
Maintainer: | Herman De Beukelaer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 3.2.3 |
Built: | 2024-10-26 06:14:36 UTC |
Source: | CRAN |
The data may contain genotypes, phenotypes and/or a precomputed distance matrix. All provided data should describe the same individuals, which is verified by comparing the item ids and names.
coreHunterData(genotypes, phenotypes, distances)
genotypes |
Genetic marker data ( |
phenotypes |
Phenotypic trait data ( |
distances |
Precomputed distance matrix ( |
Core Hunter data (chdata) with elements
geno
Genotype data of class chgeno if included.
pheno
Phenotype data of class chpheno if included.
dist
Distance data of class chdist if included.
size
Number of individuals in the dataset.
ids
Unique item identifiers.
names
Item names. Names of individuals to which no explicit name has been assigned are equal to the unique ids.
java
Java version of the data object.
Core Hunter data of class chdata.
genotypes, phenotypes, distances
## Not run:
geno.file <- system.file("extdata", "genotypes.csv", package = "corehunter")
pheno.file <- system.file("extdata", "phenotypes.csv", package = "corehunter")
dist.file <- system.file("extdata", "distances.csv", package = "corehunter")
my.data <- coreHunterData(
  genotypes(file = geno.file, format = "default"),
  phenotypes(file = pheno.file),
  distances(file = dist.file)
)
## End(Not run)
Specify either a symmetric distance matrix or the file from which to read the matrix. See https://www.corehunter.org for documentation and examples of the distance matrix file format used by Core Hunter.
distances(data, file)
data |
Symmetric distance matrix. Unique row and column headers are required,
should be the same and are used as item ids. Can be a |
file |
File from which to read the distance matrix. |
Distance matrix data of class chdist with elements
data
Distance matrix (numeric matrix).
size
Number of individuals in the dataset.
ids
Unique item identifiers.
names
Item names. Names of individuals to which no explicit name has been assigned are equal to the unique ids.
java
Java version of the data object.
file
Normalized path of file from which data was read (if applicable).
# create from distance matrix
m <- matrix(runif(100), nrow = 10, ncol = 10)
diag(m) <- 0
# make symmetric
m[lower.tri(m)] <- t(m)[lower.tri(m)]
# set headers
rownames(m) <- colnames(m) <- paste("i", 1:10, sep = "-")
dist <- distances(m)

# read from file
dist.file <- system.file("extdata", "distances.csv", package = "corehunter")
dist <- distances(file = dist.file)
Evaluate a core collection using the specified objective.
evaluateCore(core, data, objective)
core |
A core collection of class |
data |
Core Hunter data ( |
objective |
Objective function ( |
Value of the core when evaluated with the chosen objective (numeric).
data <- exampleData()
core <- sampleCore(data, objective("EN", "PD"))
evaluateCore(core, data, objective("EN", "PD"))
evaluateCore(core, data, objective("AN", "MR"))
evaluateCore(core, data, objective("EE", "GD"))
evaluateCore(core, data, objective("CV"))
evaluateCore(core, data, objective("HE"))
Data was genotyped using 190 SNP markers and 4 quantitative traits were recorded.
Includes a precomputed distance matrix read from "extdata/distances.csv", genotypes read from "extdata/genotypes-biparental.csv" and phenotypes read from "extdata/phenotypes.csv".
The distance matrix is computed from the genotypes (Modified Rogers' distance).
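As a rough illustration of what a Modified Rogers' distance matrix looks like, the sketch below uses the textbook formula: the Euclidean distance between two allele frequency vectors divided by sqrt(2L), with L the number of markers. The function name `modified_rogers` and the toy frequencies are hypothetical, missing values are not handled, and this is not claimed to reproduce how the bundled "extdata/distances.csv" was generated.

```r
# Hypothetical helper: Modified Rogers' distance between all pairs of rows.
# freqs: individuals x alleles matrix of allele frequencies (no missing values)
# n.markers: number of markers (loci) the allele columns belong to
modified_rogers <- function(freqs, n.markers) {
  # dist() returns Euclidean distances between rows, i.e. sqrt(sum((p - q)^2))
  as.matrix(dist(freqs)) / sqrt(2 * n.markers)
}

# toy data: 3 individuals, 2 markers with 2 alleles each
freqs <- rbind(
  a = c(1.0, 0.0, 1.0, 0.0),
  b = c(0.0, 1.0, 0.0, 1.0),
  c = c(0.5, 0.5, 1.0, 0.0)
)
mr <- modified_rogers(freqs, n.markers = 2)
mr
```

With this scaling, two individuals that are fixed for different alleles at every marker (rows `a` and `b`) end up at distance 1.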
exampleData()
Data was taken from the CIMMYT Research Data Repository (Study Global ID hdl:11529/10199; real data set 5, cycle 0).
Core Hunter data of class chdata
Cerón-Rojas, J. Jesús; Crossa, José; Arief, Vivi N.; Basford, Kaye; Rutkoski, Jessica; Jarquín, Diego; Alvarado, Gregorio; Beyene, Yoseph; Semagn, Kassa; DeLacy, Ian, 2015-06-04, "Application of a Genomics Selection Index to Real and Simulated Data", http://hdl.handle.net/11529/10199 V10
exampleData()
Specify either a data frame or matrix, or a file from which to read the genotypes. See https://www.corehunter.org for documentation and examples of the genotype data file format used by Core Hunter.
genotypes(data, alleles, file, format)
data |
Data frame or matrix containing the genotypes (individuals x markers) depending on the chosen format:
In case a data frame is provided, an optional first column |
alleles |
Allele names per marker ( |
file |
File containing the genotype data. |
format |
Genotype data format, one of |
Genotype data of class chgeno with elements
data
Genotypes. Data frame for default format, numeric matrix for other formats.
size
Number of individuals in the dataset.
ids
Unique item identifiers (character).
names
Item names (character). Names of individuals to which no explicit name has been assigned are equal to the unique ids.
markers
Marker names (character). May contain NA values in case only some or no marker names were specified. Marker names are always included for the default and frequency formats but are optional for the biparental format.
alleles
List of character vectors with allele names per marker. Vectors may contain NA values in case only some or no allele names were specified. For biparental data the two alleles are named "0" and "1", respectively, for all markers. For the default format allele names are inferred from the provided data. Finally, for frequency data allele names are optional and may be specified either in the file or through the alleles argument when creating this type of data from a matrix or data frame.
java
Java version of the data object.
format
Genotype data format used.
file
Normalized path of file from which data was read (if applicable).
## Not run:
# create from data frame or matrix

# default format
geno.data <- data.frame(
  NAME = c("Alice", "Bob", "Carol", "Dave", "Eve"),
  M1.1 = c(1,2,1,2,1),
  M1.2 = c(3,2,2,3,1),
  M2.1 = c("B","C","D","B",NA),
  M2.2 = c("B","A","D","B",NA),
  M3.1 = c("a1","a1","a2","a2","a1"),
  M3.2 = c("a1","a2","a2","a1","a1"),
  M4.1 = c(NA,"+","+","+","-"),
  M4.2 = c(NA,"-","+","-","-"),
  row.names = paste("g", 1:5, sep = "-")
)
geno <- genotypes(geno.data, format = "default")

# biparental (e.g. SNP)
geno.data <- matrix(
  sample(c(0,1,2), replace = TRUE, size = 1000),
  nrow = 10, ncol = 100
)
rownames(geno.data) <- paste("g", 1:10, sep = "-")
colnames(geno.data) <- paste("m", 1:100, sep = "-")
geno <- genotypes(geno.data, format = "biparental")

# frequencies
geno.data <- matrix(
  c(0.0, 0.3, 0.7, 0.5, 0.5, 0.0, 1.0,
    0.4, 0.0, 0.6, 0.1, 0.9, 0.0, 1.0,
    0.3, 0.3, 0.4, 1.0, 0.0, 0.6, 0.4),
  byrow = TRUE, nrow = 3, ncol = 7
)
rownames(geno.data) <- paste("g", 1:3, sep = "-")
colnames(geno.data) <- c("M1", "M1", "M1", "M2", "M2", "M3", "M3")
alleles <- c("M1-a", "M1-b", "M1-c", "M2-a", "M2-b", "M3-a", "M3-b")
geno <- genotypes(geno.data, alleles, format = "frequency")

# read from file

# default format
geno.file <- system.file("extdata", "genotypes.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "default")

# biparental (e.g. SNP)
geno.file <- system.file("extdata", "genotypes-biparental.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "biparental")

# frequencies
geno.file <- system.file("extdata", "genotypes-frequency.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "frequency")
## End(Not run)
Get the allele frequency matrix.
getAlleleFrequencies(data)
data |
Core Hunter data containing genotypes |
allele frequency matrix
Executes an independent stochastic hill-climbing search (random descent) per objective to approximate the optimal solution for each objective, from which a suitable normalization range is inferred based on the Pareto minima/maxima. These normalization searches are executed in parallel.
getNormalizationRanges( data, obj, size = 0.2, always.selected = integer(0), never.selected = integer(0), mode = c("default", "fast"), time = NA, impr.time = NA, steps = NA, impr.steps = NA )
data |
Core Hunter data ( |
obj |
List of objectives ( |
size |
Desired core subset size (numeric). If larger than one the value is used as the absolute core size after rounding. Else it is used as the sampling rate and multiplied with the dataset size to determine the size of the core. The default sampling rate is 0.2. |
always.selected |
vector with indices (integer) or ids (character) of items that should always be selected in the core collection |
never.selected |
vector with indices (integer) or ids (character) of items that should never be selected in the core collection |
mode |
Execution mode ( |
time |
Absolute runtime limit in seconds. Not used by default ( |
impr.time |
Maximum time without improvement in seconds. If no explicit
stop conditions are specified, the maximum time without improvement defaults
to ten or two seconds, when executing Core Hunter in |
steps |
Maximum number of search steps. Not used by default ( |
impr.steps |
Maximum number of steps without improvement. Not used by
default ( |
For an objective that is being maximized, the upper bound is set to the value of the best solution for that objective, while the lower bound is set to the Pareto minimum, i.e. the minimum value obtained when evaluating all optimal solutions (for each single objective) with the considered objective. For an objective that is being minimized, the roles of upper and lower bound are interchanged, and the Pareto maximum is used instead.
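The bound construction described above can be sketched in base R. The evaluation table `evals`, its values and the helper `normalization_ranges` are hypothetical and only illustrate the rule; in practice the optimal solutions come from the normalization searches, and this is not the package's own implementation.

```r
# Hypothetical evaluation table: evals[i, j] is the value of objective j
# evaluated on the best solution found for objective i.
evals <- matrix(
  c(0.30, 0.12,
    0.22, 0.08),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("best.EN", "best.AN"), c("EN", "AN"))
)
# EN is maximized, AN is minimized
maximized <- c(EN = TRUE, AN = FALSE)

normalization_ranges <- function(evals, maximized) {
  ranges <- t(sapply(seq_len(ncol(evals)), function(j) {
    if (maximized[j]) {
      # upper bound: own best value; lower bound: Pareto minimum
      c(lower = min(evals[, j]), upper = evals[j, j])
    } else {
      # lower bound: own best value; upper bound: Pareto maximum
      c(lower = evals[j, j], upper = max(evals[, j]))
    }
  }))
  rownames(ranges) <- colnames(evals)
  ranges
}

ranges <- normalization_ranges(evals, maximized)
ranges

# rescale a raw EN value of 0.26 to [0, 1] using its range
(0.26 - ranges["EN", "lower"]) / (ranges["EN", "upper"] - ranges["EN", "lower"])
```

The result has one row per objective and the columns lower and upper, the same layout as the matrix returned by getNormalizationRanges.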
Because Core Hunter uses stochastic algorithms, repeated runs may produce different results. To eliminate randomness, you may set a random number generation seed using set.seed prior to executing Core Hunter. In addition, when reproducible results are desired, it is advised to use step-based stop conditions instead of the (default) time-based criteria, because runtimes may be affected by external factors, and, therefore, a different number of steps may have been performed in repeated runs when using time-based stop conditions.
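To make the reproducibility advice concrete, here is a toy random descent written in plain R. It is a stand-in for illustration only, not Core Hunter's Java search: the functions `en_value` and `random_descent`, the random distance matrix and the swap move are all assumptions. It shows why a fixed seed combined with a step-based stop condition yields identical results across runs.

```r
# Toy symmetric distance matrix for 30 items (random data, illustration only).
set.seed(1)
n <- 30
d <- as.matrix(dist(matrix(runif(n * 5), nrow = n)))

# Average entry-to-nearest-entry distance of a selection (to be maximized).
en_value <- function(sel, d) {
  sub <- d[sel, sel, drop = FALSE]
  diag(sub) <- Inf  # ignore the distance of an item to itself
  mean(apply(sub, 1, min))
}

# Random descent: swap one selected item for one unselected item and keep
# the swap only if it improves the objective; stop after a fixed step count.
random_descent <- function(d, size, steps) {
  sel <- sample(nrow(d), size)
  best <- en_value(sel, d)
  for (i in seq_len(steps)) {
    cand <- sel
    cand[sample(size, 1)] <- sample(setdiff(seq_len(nrow(d)), sel), 1)
    val <- en_value(cand, d)
    if (val > best) {
      sel <- cand
      best <- val
    }
  }
  list(sel = sort(sel), value = best)
}

# same seed and same number of steps: both runs make identical moves
set.seed(42)
run1 <- random_descent(d, size = 6, steps = 200)
set.seed(42)
run2 <- random_descent(d, size = 6, steps = 200)
identical(run1, run2)
```

With a time limit instead of `steps`, the number of loop iterations would depend on machine load, so two runs with the same seed could stop at different solutions.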
Numeric matrix with one row per objective and two columns:
lower
Lower bound of normalization range.
upper
Upper bound of normalization range.
data <- exampleData()
# maximize entry-to-nearest-entry distance between genotypes and phenotypes (equal weight)
objectives <- list(objective("EN", "MR"), objective("EN", "GD"))
# get normalization ranges for default size (20%)
ranges <- getNormalizationRanges(data, obj = objectives, mode = "fast")
# set normalization ranges and sample core
objectives <- lapply(1:2, function(o){setRange(objectives[[o]], ranges[o,])})
core <- sampleCore(data, obj = objectives)
The following optimization objectives are supported by Core Hunter:
EN
Average entry-to-nearest-entry distance (default). Maximizes the average distance between each selected individual and the closest other selected item in the core. Favors diverse cores in which each individual is sufficiently different from the most similar other selected item (low redundancy). Multiple distance measures are provided to be used with this objective (see below).
AN
Average accession-to-nearest-entry distance. Minimizes the average distance between each individual (from the full dataset) and the closest selected item in the core (which can be the individual itself). Favors representative cores in which all items from the original dataset are represented by similar individuals in the selected subset. Multiple distance measures are provided to be used with this objective (see below).
EE
Average entry-to-entry distance. Maximizes the average distance between each pair of selected individuals in the core. This objective is related to the entry-to-nearest-entry (EN) distance but less effectively avoids redundant, similar individuals in the core. In general, use of EN is preferred. Multiple distance measures are provided to be used with this objective (see below).
SH
Shannon's allelic diversity index. Maximizes the entropy, as used in information theory, of the selected core. Independently takes into account all allele frequencies, regardless of the locus (marker) to which the allele belongs. Requires genotypes.
HE
Expected proportion of heterozygous loci. Maximizes the expected proportion of heterozygous loci in offspring produced from random crossings within the selected core. In contrast to Shannon's index (SH) this objective treats each marker (locus) with equal importance, regardless of the number of possible alleles for that marker. Requires genotypes.
CV
Allele coverage. Maximizes the proportion of alleles observed in the full dataset that are retained in the selected core. Requires genotypes.
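The three genotype-based objectives (SH, HE and CV) can be approximated from an allele frequency matrix as sketched below. The formulas follow the usual textbook definitions and are an assumption about the details, not a copy of Core Hunter's Java code; the function names `shannon`, `expected_het` and `coverage` are hypothetical. The toy frequencies are the ones used in the frequency-format example of genotypes.

```r
# individuals x alleles frequency matrix; marker[j] is the marker of column j
freqs <- matrix(
  c(0.0, 0.3, 0.7, 0.5, 0.5, 0.0, 1.0,
    0.4, 0.0, 0.6, 0.1, 0.9, 0.0, 1.0,
    0.3, 0.3, 0.4, 1.0, 0.0, 0.6, 0.4),
  byrow = TRUE, nrow = 3
)
marker <- c("M1", "M1", "M1", "M2", "M2", "M3", "M3")

# SH: entropy over all alleles, regardless of the marker they belong to
shannon <- function(freqs, marker, sel) {
  p <- colMeans(freqs[sel, , drop = FALSE]) / length(unique(marker))
  p <- p[p > 0]  # 0 * log(0) is treated as 0
  -sum(p * log(p))
}

# HE: per marker 1 - sum of squared allele frequencies, averaged over markers
expected_het <- function(freqs, marker, sel) {
  p <- colMeans(freqs[sel, , drop = FALSE])
  mean(1 - tapply(p^2, marker, sum))
}

# CV: share of alleles observed in the full dataset that occur in the core
coverage <- function(freqs, sel) {
  in_full <- colSums(freqs) > 0
  in_core <- colSums(freqs[sel, , drop = FALSE]) > 0
  sum(in_core & in_full) / sum(in_full)
}

core <- c(1, 2)
shannon(freqs, marker, core)
expected_het(freqs, marker, core)
coverage(freqs, core)
```

In this toy core the first allele of marker M3 is lost (it only occurs in the third individual), so six of the seven alleles are covered.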
The first three objective types (EN, AN and EE) aggregate pairwise distances between individuals. These distances can be computed using various measures:
MR
Modified Rogers distance (default). Requires genotypes.
CE
Cavalli-Sforza and Edwards distance. Requires genotypes.
GD
Gower distance. Requires phenotypes.
PD
Precomputed distances. Uses the precomputed distance matrix of the dataset.
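To see how the distance-based objectives (EN, AN and EE) differ, the following base R sketch evaluates all three on a small hand-made distance matrix. The helper names and the matrix values are hypothetical; the code follows the verbal definitions given above and is not the package's implementation.

```r
# toy symmetric distance matrix for four items
m <- matrix(c(0, 2, 6, 7,
              2, 0, 5, 8,
              6, 5, 0, 3,
              7, 8, 3, 0), nrow = 4, byrow = TRUE)
rownames(m) <- colnames(m) <- paste("i", 1:4, sep = "-")

# EN: mean distance from each selected item to the closest OTHER selected item
entry_to_nearest_entry <- function(d, sel) {
  sub <- d[sel, sel, drop = FALSE]
  diag(sub) <- Inf  # exclude the item itself
  mean(apply(sub, 1, min))
}

# AN: mean distance from every item in the dataset to the closest selected
# item (which can be the item itself, at distance 0)
accession_to_nearest_entry <- function(d, sel) {
  mean(apply(d[, sel, drop = FALSE], 1, min))
}

# EE: mean distance over all pairs of selected items
entry_to_entry <- function(d, sel) {
  sub <- d[sel, sel, drop = FALSE]
  mean(sub[upper.tri(sub)])
}

core <- c("i-1", "i-3")
entry_to_nearest_entry(m, core)      # to be maximized
accession_to_nearest_entry(m, core)  # to be minimized
entry_to_entry(m, core)              # to be maximized
```

Adding the near-duplicate "i-2" to this core lowers EN sharply while EE drops much less, which is the redundancy effect mentioned in the description of EE.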
objective( type = c("EN", "AN", "EE", "SH", "HE", "CV"), measure = c("MR", "CE", "GD", "PD"), weight = 1, range = NULL )
type |
Objective type, one of |
measure |
Distance measure used to compute the distance between two
individuals, one of |
weight |
Weight assigned to the objective when maximizing a weighted index. Defaults to 1.0. |
range |
Normalization range [l,u] of the objective when maximizing a weighted
index. By default the range is not set ( |
Core Hunter objective of class chobj with elements
type
Objective type.
meas
Distance measure (if applicable).
weight
Assigned weight.
range
Normalization range (if specified).
getNormalizationRanges, setRange
objective()
objective(meas = "PD")
objective("EE", "GD")
objective("HE")
objective("EN", "MR", range = c(0.150, 0.300))
objective("AN", "MR", weight = 0.5, range = c(0.150, 0.300))
Specify either a data frame containing the phenotypic trait observations or a file from which to read the data. See https://www.corehunter.org for documentation and examples of the phenotype data format used by Core Hunter.
phenotypes(data, types, min, max, file)
data |
Data frame containing one row per individual and one column per trait.
Unique row and column names are required and used as item and trait ids, respectively.
The data frame may optionally include a first column |
types |
Variable types (optional). Vector of characters, each of length one or two. Ignored when reading from file. The first letter indicates the scale type and should be one of The second letter optionally indicates the variable encoding (in Java) and should
be one of If no explicit variable types are specified these are automatically inferred from
the data frame column types and classes, whenever possible. Columns of type
Boolean encoded nominals ( Ordinal variables of class If explicit types are given for some variables others can still be automatically inferred
by setting their type to |
min |
Minimum values of interval or ratio variables (optional).
Numeric vector. Ignored when reading from file.
If undefined for some variables the respective minimum is inferred from the data.
If the data exceeds the minimum it is also updated accordingly.
For nominal and ordinal variables just put |
max |
Maximum values of interval or ratio variables (optional).
Numeric vector. Ignored when reading from file.
If undefined for some variables the respective maximum is inferred from the data.
If the data exceeds the maximum it is also updated accordingly.
For nominal and ordinal variables just put |
file |
File containing the phenotype data. |
Phenotype data of class chpheno with elements
data
Phenotypes (data frame).
size
Number of individuals in the dataset.
ids
Unique item identifiers.
names
Item names. Names of individuals to which no explicit name has been assigned are equal to the unique ids.
types
Variable types and encodings.
ranges
Variable ranges, when applicable (NA elsewhere).
java
Java version of the data object.
file
Normalized path of file from which the data was read (if applicable).
# create from data frame
pheno.data <- data.frame(
  season = c("winter", "summer", "summer", "winter", "summer"),
  yield = c(34.5, 32.6, 22.1, 54.12, 43.33),
  size = ordered(c("l", "s", "s", "m", "l"), levels = c("s", "m", "l")),
  resistant = c(FALSE, TRUE, TRUE, FALSE, TRUE)
)
pheno <- phenotypes(pheno.data)

# explicit types
pheno <- phenotypes(pheno.data, types = c("N", "R", "O", "NB"))
# treat last column as symmetric binary, auto infer others
pheno <- phenotypes(pheno.data, types = c(NA, NA, NA, "NS"))

# explicit ranges
pheno <- phenotypes(pheno.data, min = c(NA, 20.0, NA, NA), max = c(NA, 60.0, NA, NA))

# read from file
pheno.file <- system.file("extdata", "phenotypes.csv", package = "corehunter")
pheno <- phenotypes(file = pheno.file)
Delegates to read.delim where the separator is inferred from the file extension (CSV or TXT). For CSV files the delimiter is set to "," while for TXT files "\t" is used. Also sets some default argument values as used by Core Hunter.
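The delegation described above can be approximated as follows. The function `read_by_extension` is a hypothetical stand-in written for this sketch, not the body of read.autodelim; in particular, raising an error for other extensions is an assumption. The default argument values are copied from the usage of read.autodelim.

```r
# Hypothetical stand-in: choose the separator from the file extension and
# delegate to read.delim with the defaults listed in the usage section.
read_by_extension <- function(file, ...) {
  ext <- tolower(tools::file_ext(file))
  sep <- switch(ext,
    csv = ",",
    txt = "\t",
    stop("unsupported file extension: ", ext)  # assumption, see lead-in
  )
  read.delim(file, sep = sep, quote = "'\"", row.names = 1, na.strings = "",
             check.names = FALSE, strip.white = TRUE,
             stringsAsFactors = FALSE, ...)
}

# the same small table written once as CSV and once as tab-delimited TXT
csv.file <- tempfile(fileext = ".csv")
writeLines(c("ID,height,colour", "g-1,1.5,red", "g-2,,blue"), csv.file)
txt.file <- tempfile(fileext = ".txt")
writeLines(c("ID\theight\tcolour", "g-1\t1.5\tred", "g-2\t\tblue"), txt.file)

from.csv <- read_by_extension(csv.file)
from.txt <- read_by_extension(txt.file)
from.csv
```

The first column ends up as row names (row.names = 1) and the empty height field is read as NA (na.strings = "").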
read.autodelim( file, quote = "'\"", row.names = 1, na.strings = "", check.names = FALSE, strip.white = TRUE, stringsAsFactors = FALSE, ... )
file |
File path. |
quote |
the set of quoting characters. To disable quoting
altogether, use |
row.names |
a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names. If there is a header and the first row contains one fewer field than
the number of columns, the first column in the input is used for the
row names. Otherwise if Using |
na.strings |
a character vector of strings which are to be
interpreted as |
check.names |
logical. If |
strip.white |
logical. Used only when |
stringsAsFactors |
logical: should character vectors be converted
to factors? Note that this is overridden by |
... |
Further arguments to be passed to |
Data frame.
Sample a core collection from the given data.
sampleCore( data, obj, size = 0.2, always.selected = integer(0), never.selected = integer(0), mode = c("default", "fast"), normalize = TRUE, time = NA, impr.time = NA, steps = NA, impr.steps = NA, indices = FALSE, verbose = FALSE )
data |
Core Hunter data ( |
obj |
Objective or list of objectives ( |
size |
Desired core subset size (numeric). If larger than one the value is used as the absolute core size after rounding. Else it is used as the sampling rate and multiplied with the dataset size to determine the size of the core. The default sampling rate is 0.2. |
always.selected |
vector with indices (integer) or ids (character) of items that should always be selected in the core collection |
never.selected |
vector with indices (integer) or ids (character) of items that should never be selected in the core collection |
mode |
Execution mode ( |
normalize |
If Normalization requires an independent preliminary search per objective (fast stochastic
hill-climber, executed in parallel for all objectives). The same stop conditions, as
specified for the main search, are also applied to each normalization search. In
Normalization ranges can also be precomputed (see |
time |
Absolute runtime limit in seconds. Not used by default ( |
impr.time |
Maximum time without improvement in seconds. If no explicit
stop conditions are specified, the maximum time without improvement defaults
to ten or two seconds, when executing Core Hunter in |
steps |
Maximum number of search steps. Not used by default ( |
impr.steps |
Maximum number of steps without improvement. Not used by
default ( |
indices |
If |
verbose |
If |
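The rule for the size argument can be written down in a few lines. This is a literal reading of the argument description, with a hypothetical helper `resolve_core_size` and a hypothetical dataset of 200 individuals; how the package rounds a fractional size internally is an assumption.

```r
# Hypothetical helper: translate the size argument into an absolute core size.
# size: absolute size (> 1) or sampling rate (otherwise); n: dataset size
resolve_core_size <- function(size, n) {
  if (size > 1) {
    round(size)      # absolute core size after rounding
  } else {
    round(size * n)  # sampling rate times dataset size (rounding assumed)
  }
}

resolve_core_size(0.2, n = 200)   # default sampling rate
resolve_core_size(25, n = 200)    # absolute size
resolve_core_size(25.4, n = 200)  # absolute size, rounded
```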
Because Core Hunter uses stochastic algorithms, repeated runs may produce different results. To eliminate randomness, you may set a random number generation seed using set.seed prior to executing Core Hunter. In addition, when reproducible results are desired, it is advised to use step-based stop conditions instead of the (default) time-based criteria, because runtimes may be affected by external factors, and, therefore, a different number of steps may have been performed in repeated runs when using time-based stop conditions.
Core subset (chcore). It has an element sel which is a character or numeric vector containing the sorted ids or indices, respectively, of the selected individuals (see argument indices). In addition the result has one or more elements that indicate the value of each objective function that was included in the optimization.
coreHunterData, objective, getNormalizationRanges
data <- exampleData()

# default size, maximize entry-to-nearest-entry Modified Rogers distance
obj <- objective("EN", "MR")
core <- sampleCore(data, obj)
# fast mode
core <- sampleCore(data, obj, mode = "f")
# absolute size
core <- sampleCore(data, obj, size = 25)
# relative size
core <- sampleCore(data, obj, size = 0.1)

# other objective: minimize accession-to-nearest-entry precomputed distance
core <- sampleCore(data, obj = objective(type = "AN", measure = "PD"))
# multiple objectives (equal weight)
core <- sampleCore(data, obj = list(
  objective("EN", "PD"),
  objective("AN", "GD")
))
# multiple objectives (custom weight)
core <- sampleCore(data, obj = list(
  objective("EN", "PD", weight = 0.3),
  objective("AN", "GD", weight = 0.7)
))

# custom stop conditions
core <- sampleCore(data, obj, time = 5, impr.time = 2)
core <- sampleCore(data, obj, steps = 300)

# print progress messages
core <- sampleCore(data, obj, verbose = TRUE)
See argument range of objective for details.
setRange(obj, range)
obj |
Core Hunter objective of class |
range |
Normalization range [l,u].
See argument |
Objective including normalization range.
If the given data does not match any of these three classes it is returned unchanged.
wrapData(data)
data |
of class |
Core Hunter data of class chdata