Title: | Burns Statistics Miscellaneous |
---|---|
Description: | Script search, corner, genetic optimization, permutation tests, write expect test. |
Authors: | Pat Burns |
Maintainer: | Pat Burns <[email protected]> |
License: | Unlimited |
Version: | 1.1 |
Built: | 2024-12-23 06:42:09 UTC |
Source: | CRAN |
Writes either the C declaration of items in a list, or initializes them with the contents of the list.
Cfrag.list(x, file = NULL, item.num = c(3, 10, 5), indent = c("\t", "\t\t"), declaration.only = FALSE, long = FALSE, append = FALSE)
Cfrag.list(x, file = NULL, item.num = c(3, 10, 5), indent = c("\t", "\t\t"), declaration.only = FALSE, long = FALSE, append = FALSE)
x |
required. A list with names. |
file |
either |
item.num |
length three vector giving the number of items per line for doubles, integers and characters. |
indent |
length two vector giving the amount to indent declarations and the items in the initialization. |
declaration.only |
logical flag.
If |
long |
logical flag.
If |
append |
logical flag.
If |
if file
is a non-empty character string, then the name of the file
that is written.
Otherwise, a character vector of the declarations – each element representing
a different line.
if file
is a non-empty character string, then the file is created,
overwritten or appended.
The type to declare is dependent on the storage mode of the component of
x
.
You may need to coerce components to get them to be declared the correct type.
Rcpp now probably makes most uses of this function obsolete.
.C
, storage.mode
, as.double
,
as.integer
, as.character
, cat
.
test.list <- list(adoub=as.double(rnorm(20)), anint=as.integer(92:109), achar=c("aaa", "bbbb", "ccccc")) Cfrag.list(test.list, file="") ## Not run: Cfrag.list(test.list, file="test.c") Cfrag.list(test.list[1], file="test.c", dec=TRUE) Cfrag.list(test.list[-1], file="test.c", dec=FALSE, append=TRUE) ## End(Not run)
test.list <- list(adoub=as.double(rnorm(20)), anint=as.integer(92:109), achar=c("aaa", "bbbb", "ccccc")) Cfrag.list(test.list, file="") ## Not run: Cfrag.list(test.list, file="test.c") Cfrag.list(test.list[1], file="test.c", dec=TRUE) Cfrag.list(test.list[-1], file="test.c", dec=FALSE, append=TRUE) ## End(Not run)
Returns an array (or data frame) of the same number of dimensions as the input, but generally with smaller dimensions.
corner(x, corner = "tlffff", n = 6)
corner(x, corner = "tlffff", n = 6)
x |
an array (which includes matrices) or data frame. |
corner |
a single character string which codes the particular corner of the array that is desired. The first character is either "t" (top) or "b" (bottom). The second character is either "l" (left) or "r" (right). Remaining characters should be either "f" (front) or "b" (back). |
n |
a vector of positive integers which generally should have
length at least that of the |
an array similar to the input x
, but with smaller dimensions in general.
corner(freeny.x) corner(freeny.x, "br", c(2,3)) # with three-dimensional arrays: corner(iris3) corner(array(1:1000, rep(10, 3)), "brf", 2:4)
corner(freeny.x) corner(freeny.x, "br", c(2,3)) # with three-dimensional arrays: corner(iris3) corner(array(1:1000, rep(10, 3)), "brf", 2:4)
Approximately minimizes the value of a function using a simple heuristic optimizer that uses a combination of genetic and simulated annealing optimization.
genopt(fun, population, lower = -Inf, upper = Inf, scale = dcontrol["eps"], add.args = NULL, control = genopt.control(...), ...)
genopt(fun, population, lower = -Inf, upper = Inf, scale = dcontrol["eps"], add.args = NULL, control = genopt.control(...), ...)
fun |
function that returns a numerical value. |
population |
a matrix or a list. If a matrix, then the rows correspond to the parameters and the columns are different parameter vectors. If a list, then it should have a component named "population" and
optionally a component named "objective" – in particular, it can
be the result of a call to |
lower |
vector giving the lower bound for parameter values. This is replicated to be as long as the number of parameters. |
upper |
vector giving the upper bound for parameter values. This is replicated to be as long as the number of parameters. |
scale |
vector of scales to use when doing local search with a solution. This is replicated to be as long as the number of parameters. |
add.args |
list of additional arguments to |
control |
an object like the output of |
... |
arguments for |
There is a summary
method for class genopt
which
shows the call, a summary of the set of objectives found, and the
best solution (set of parameters).
a list of class genopt
with the following components:
population |
a matrix of the same size as the input population matrix, but generally with different values in it. |
objective |
a vector with length equal to the number of columns of |
funevals |
the number of function evaluations performed.
If |
random.seed |
the random seed at the start of the call – given so that you can reproduce the computations. |
call |
an image of the call that created this object. |
The original version of this function appeared in "S Poetry".
genopt.control
, summary.genopt
.
# two parameters, population size 5 go1 <- genopt(function(x, other) sum(x, other), population=matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), trace=FALSE) summary(go1) go2 <- genopt(function(x, other) sum(x, other), population=go1, lower=0, add.arg=list(other=3), trace=FALSE)
# two parameters, population size 5 go1 <- genopt(function(x, other) sum(x, other), population=matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), trace=FALSE) summary(go1) go2 <- genopt(function(x, other) sum(x, other), population=go1, lower=0, add.arg=list(other=3), trace=FALSE)
Returns a list suitable as the control
argument
of the genopt
function.
genopt.control(births = 100, random.n = 0, jitters.n = 3, trace = TRUE, eps = 0.1, prob = 0.4, scale.min = 1e-12, maxeval = Inf)
genopt.control(births = 100, random.n = 0, jitters.n = 3, trace = TRUE, eps = 0.1, prob = 0.4, scale.min = 1e-12, maxeval = Inf)
births |
the number of times two parents are combined to create a new solution. |
random.n |
the number of random solutions created (and evaluated) before the genetic phase starts. |
jitters.n |
the number of times a newly successful solution is changed (in the simulated annealing phase) in an attempt to find a nearby better one. |
trace |
logical value: should progress be printed? |
eps |
the default value for the scale of the jittering in simulated annealing. |
prob |
in the selection of parameters between two parents, the probability of each coming from the first parent. |
scale.min |
the minimum value allowed in the |
maxeval |
the maximum number of function evaluations allowed.
This takes previous function calls into account if |
a list with components:
icontrol |
vector of the control parameters that are logically integer (or logical). |
dcontrol |
vector of the control parameters that are logically real-valued. |
goc1 <- genopt.control(random.n=200, births=1000, trace=FALSE) go1 <- genopt(function(x, other) sum(x, other), matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), control=goc1) goc2 <- genopt.control(random.n=0, births=1000, trace=FALSE) go2 <- genopt(function(x, other) sum(x, other), go1, lower=0, add.arg=list(other=3), control=goc2)
goc1 <- genopt.control(random.n=200, births=1000, trace=FALSE) go1 <- genopt(function(x, other) sum(x, other), matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), control=goc1) goc2 <- genopt.control(random.n=0, births=1000, trace=FALSE) go2 <- genopt(function(x, other) sum(x, other), go1, lower=0, add.arg=list(other=3), control=goc2)
Creates groups where the groups each have as close to the same number of members as possible.
ntile(x, ngroups, na.rm = FALSE, result = "list", reverse = FALSE, checkBleed = TRUE)
ntile(x, ngroups, na.rm = FALSE, result = "list", reverse = FALSE, checkBleed = TRUE)
x |
a numeric vector. |
ngroups |
a positive integer giving the number of groups to create. |
na.rm |
logical value: if |
result |
a character string specifying the form of the resulting value.
This must be (an abbreviation of) one of: |
reverse |
logical value: if |
checkBleed |
logical value: if |
the form of the value depends on the result
argument.
The "list"
result has the values of the input x
grouped
into ngroups
components.
The "numeric"
result is a vector of integers from 1
to ngroups
indicating which group the corresponding element
of x
is in.
The "factor"
result is an ordered factor version of the
"numeric"
result.
A more primitive version of this function appeared in a blog post called "Miles of iles" on the Portfolio Probe website. A bit of comparison with the alternative functions in See Also may be found there.
Pat Burns
ntile(setNames(state.area, state.name), 10) ntile(Loblolly$height, 5, result="factor", reverse=TRUE) ntile(c(-10:10, rep(0, 7)), 4)
ntile(setNames(state.area, state.name), 10) ntile(Loblolly$height, 5, result="factor", reverse=TRUE) ntile(c(-10:10, rep(0, 7)), 4)
Performs a random permutation test on the relationship between two discrete variables, or by using a function.
permutation.test.discrete(x, y = NULL, scores, alternative = "greater", trials = 1000) permutation.test.fun(x, y = NULL, fun = function(x, y) sum(x * y), alternative = "greater", trials = 1000)
permutation.test.discrete(x, y = NULL, scores, alternative = "greater", trials = 1000) permutation.test.fun(x, y = NULL, fun = function(x, y) sum(x * y), alternative = "greater", trials = 1000)
x |
either a two-column matrix or data frame, or a vector.
When this has two columns, |
y |
either a zero-length object, or a vector the same length as |
scores |
a numeric matrix providing the scores for each combination of
the unique values in |
fun |
a function that takes two arguments and returns a single numeric value. |
alternative |
a character string that partially matches either |
trials |
the number of random permutations to be performed. |
an object of class permtstBurSt
which is a list with the
following components:
original.score |
the score (or function value) produced by the original data. |
perm.scores |
a vector of the scores (or function values) from the random permutations. |
stats |
a numeric vector that has the number of observations in the data, the number of random permutations done, the number of permutations that produced a score at least as extreme as the original, and the p-value for the test. |
alternative |
either |
random.seed |
the random seed at the start of the call. |
call |
an image of the call that was used. |
The object .Random.seed
is either created or updated.
There are print
and plot
methods for this class of object.
The print method merely describes the object and shows the p-value of the
test (rounded, by default, to 4 digits).
A simple version of the p-value is the number of random permutations that are at least as extreme as the original divided by the total number of random permutations. The value computed, which is more correct, has 1 added to both numerator and denominator. In general the difference is of no consequence. However, there are cases where it does matter, for example when independent p-values are combined.
These functions are related to "Permuting Super Bowl Theory" which can be found in the working papers section of http://www.burns-stat.com. The paper explains permutation tests via a discussion of the Super Bowl indicator of the stock market.
winner <- c('N', 'N', 'A', 'N', 'A', 'N') market <- c('+', '-', '-', '+', '+', '+') smat <- diag(2) dimnames(smat) <- list(c('N', 'A'), c('+', '-')) pt1 <- permutation.test.discrete(winner, market, smat) print(pt1) plot(pt1) pt2 <- permutation.test.fun(ToothGrowth[, -2], fun=cor) print(pt2) plot(pt2) smat2 <- matrix(c(-3, -.5, 3, -1, 1, 0, 0, 1, -1, 3, -.5, -3), 3, 4, dimnames=list(c('Up', 'Neut', 'Down'), c('Q1', 'Q2', 'Q3', 'Q4'))) my.results <- data.frame(results=sample(c('Up', 'Neut', 'Down'), 100, replace=TRUE), quartile=sample( c('Q1', 'Q2', 'Q3', 'Q4'), 100, replace=TRUE)) permutation.test.discrete(my.results[, c("results", "quartile")], score=smat2)
winner <- c('N', 'N', 'A', 'N', 'A', 'N') market <- c('+', '-', '-', '+', '+', '+') smat <- diag(2) dimnames(smat) <- list(c('N', 'A'), c('+', '-')) pt1 <- permutation.test.discrete(winner, market, smat) print(pt1) plot(pt1) pt2 <- permutation.test.fun(ToothGrowth[, -2], fun=cor) print(pt2) plot(pt2) smat2 <- matrix(c(-3, -.5, 3, -1, 1, 0, 0, 1, -1, 3, -.5, -3), 3, 4, dimnames=list(c('Up', 'Neut', 'Down'), c('Q1', 'Q2', 'Q3', 'Q4'))) my.results <- data.frame(results=sample(c('Up', 'Neut', 'Down'), 100, replace=TRUE), quartile=sample( c('Q1', 'Q2', 'Q3', 'Q4'), 100, replace=TRUE)) permutation.test.discrete(my.results[, c("results", "quartile")], score=smat2)
Shows the distribution of the scores of the random permutations relative to the score from the actual data.
## S3 method for class 'permtstBurSt' plot(x, col = c("black", "red"), width = 10, uniqlim = 10, main = "", xlab = "Scores", ...)
## S3 method for class 'permtstBurSt' plot(x, col = c("black", "red"), width = 10, uniqlim = 10, main = "", xlab = "Scores", ...)
x |
an object of class
|
col |
the colors to use in the plot – mainly for the bar-type plot when there are only a few scores in the permutation distribution. The first color is the one used for the insignificant part of the distribution, the second color is for the significant part of the distribution. The second color is used to identify the original score when a histogram is used. |
width |
the width of the bars. |
uniqlim |
the value that determines which type of plot to use.
If the number of unique values in the scores from the permutations
is bigger than |
main |
character string giving the main title for the plot. |
xlab |
character string giving the label for the x-axis. |
... |
additional graphics parameters may be given. |
a plot is produced.
Two styles of plot are possible.
A histogram is more appropriate when there are more than a few
values in the permutation distribution.
When there are only a few distinct values,
then a bar-type plot is more informative.
The uniqlim
argument controls the definition of "a few".
These functions are related to "Permuting Super Bowl Theory" which can be found in the working papers section of http://www.burns-stat.com. The paper explains permutation tests via a discussion of the Super Bowl indicator of the stock market.
## Not run: pt1 <- permutation.test.discrete(winner, market, smat) plot(pt1, col=c("blue", "gold")) ## End(Not run)
## Not run: pt1 <- permutation.test.discrete(winner, market, smat) plot(pt1, col=c("blue", "gold")) ## End(Not run)
Returns a list of text matches in the scripts in a directory.
scriptSearch(pattern, path = ".", subdirs = TRUE, suffix = "\\.[rR]$", commentsIncluded = FALSE, ..., verbose = FALSE)
scriptSearch(pattern, path = ".", subdirs = TRUE, suffix = "\\.[rR]$", commentsIncluded = FALSE, ..., verbose = FALSE)
pattern |
character string containing a regular expression
(this is the argument of the same name for |
path |
a character string giving the (relative) path of the directory holding the R scripts. |
subdirs |
logical value: if |
suffix |
a character string restricting the files to be searched. |
commentsIncluded |
logical value: if |
... |
additional arguments to |
verbose |
logical value: if |
The default suffix
argument restricts the search to R scripts.
An R script is considered to be a file that ends in dot-R, where the
"R" can be lowercase or uppercase.
a list: the names are equal to the file names containing matches, and each component is a character vector of the whitespace-trimmed lines in the file that match the pattern.
Using commentsIncluded=FALSE
does not preclude a match being
declared when it is in a comment not starting the line.
Pat Burns
## Not run: scriptSearch("list.files") scriptSearch("garch", "~/../burns-stat3/webpages/blog") # search for an assignment # ' *' means zero or more spaces scriptSearch("specialObject *<- ", "~/myScriptDirectory", subdirs=FALSE) # search COBOL files scriptSearch("blah", suffix="\\.cbl$") ## End(Not run)
## Not run: scriptSearch("list.files") scriptSearch("garch", "~/../burns-stat3/webpages/blog") # search for an assignment # ' *' means zero or more spaces scriptSearch("specialObject *<- ", "~/myScriptDirectory", subdirs=FALSE) # search COBOL files scriptSearch("blah", suffix="\\.cbl$") ## End(Not run)
The call, best solution and summary of objectives in the final population.
## S3 method for class 'genopt' summary(object, ...)
## S3 method for class 'genopt' summary(object, ...)
object |
the result of a call to |
... |
currently unused. |
a list with components:
call |
image of the call to |
summary.objectives |
summary of the objectives of the solutions in the final population. |
best.solution |
the vector of parameters with the best solution in the population. |
# two parameters, population size 5 go1 <- genopt(function(x, other) sum(x, other), population=matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), trace=FALSE) summary(go1)
# two parameters, population size 5 go1 <- genopt(function(x, other) sum(x, other), population=matrix(rexp(10), nrow=2, ncol=5), lower=0, add.arg=list(other=3), trace=FALSE) summary(go1)
Automatically writes the text of a testthat::expect_equal
test of an expression.
writeExpectTest(expr, filename = "", ...)
writeExpectTest(expr, filename = "", ...)
expr |
an expression to be tested. |
filename |
where should the results go to? The empty string means that it goes to the console. |
... |
additional arguments to |
NULL
text is written to the console or to a file.
Some functions are hard to test, like scriptSearch
.
It seems like this one would be too, but in fact it is self-testing.
If the test it writes doesn't pass in the same environment, then it
must have done something wrong.
writeExpectTest(head(1:10)) writeExpectTest(head(cars))
writeExpectTest(head(1:10)) writeExpectTest(head(cars))