Title: | Calculate Relevance and Significance Measures |
---|---|
Description: | Calculates relevance and significance values for simple models and for many types of regression models. These are introduced in 'Stahel, Werner A.' (2021) "Measuring Significance and Relevance instead of p-values." <https://stat.ethz.ch/~stahel/relevance/stahel-relevance2103.pdf>. These notions are also applied to replication studies, as described in the manuscript 'Stahel, Werner A.' (2022) "'Replicability': Terminology, Measuring Success, and Strategy" available in the documentation. |
Authors: | Werner A. Stahel |
Maintainer: | Werner A. Stahel <[email protected]> |
License: | GPL-2 |
Version: | 2.1 |
Built: | 2024-10-28 07:01:00 UTC |
Source: | CRAN |
The DESCRIPTION file:
Package: | relevance |
Type: | Package |
Title: | Calculate Relevance and Significance Measures |
Version: | 2.1 |
Date: | 2024-01-24 |
Author: | Werner A. Stahel |
Maintainer: | Werner A. Stahel <[email protected]> |
Depends: | R (>= 3.5.0) |
Imports: | stats, utils, graphics |
Suggests: | MASS, survival, knitr |
VignetteBuilder: | knitr |
Description: | Calculates relevance and significance values for simple models and for many types of regression models. These are introduced in 'Stahel, Werner A.' (2021) "Measuring Significance and Relevance instead of p-values." <https://stat.ethz.ch/~stahel/relevance/stahel-relevance2103.pdf>. These notions are also applied to replication studies, as described in the manuscript 'Stahel, Werner A.' (2022) "'Replicability': Terminology, Measuring Success, and Strategy" available in the documentation. |
License: | GPL-2 |
NeedsCompilation: | no |
Packaged: | 2024-01-25 16:36:07 UTC; stahel |
Repository: | CRAN |
Date/Publication: | 2024-01-25 17:00:02 UTC |
Index of help topics:
asinp               arc sine Transformation
confintF            Confidence Interval for the Non-Central F and Chisquare Distribution
correlation         Correlation with Relevance and Significance Measures
d.blast             Blasting for a tunnel
d.everest           Data of an 'anchoring' experiment in psychology
d.negposChoice      Data of an 'anchoring' experiment in psychology
d.osc15             Data from the OSC15 replication study
d.osc15Onesample    Data from the OSC15 replication study, one sample tests
drop1Wald           Drop Single Terms of a Model and Calculate Respective Wald Tests
dropNA              drop or replace NA values
dropdata            Drop Observations from a Data.frame
formatNA            Print NA values by a Desired Code
getcoeftable        Extract Components of a Fit
inference           Calculate Confidence Intervals and Relevance and Significance Values
last                Last Elements of a Vector or of a Matrix
logst               Started Logarithmic Transformation
ovarian             ovarian
plconfint           Plot Confidence Intervals
plot.inference      Plot Inference Results
print.inference     Print Tables with Inference Measures
relevance-package   Calculate Relevance and Significance Measures
relevance.options   Options for the relevance Package
replication         Inference for Replication Studies
rlvClass            Relevance Class
rplClass            Reproducibility Class
shortenstring       Shorten Strings
showd               Show a Part of a Data.frame
sumNA               Count NAs
termeffects         All Coefficients of a Model Fit
termtable           Statistics for Linear Models, Including Relevance Statistics
twosamples          Relevance and Significance for One or Two Samples
Further information is available in the following vignettes:
relevance-descr |
'Calculate Relevance and Significance Measures' (source, pdf) |
Relevance is a measure that expresses the (scientific) relevance of an effect. The simplest case is a single sample of supposedly normally distributed observations, where interest lies in the expectation, estimated by the mean of the observations. There is a threshold for the expectation, below which an effect is judged too small to be of interest.
The estimated relevance ‘Rle’ is then simply the estimated effect divided by the threshold. If it is larger than 1, the effect is thus judged relevant. The two other values that characterize the relevance are the limits of the confidence interval for the true value of the relevance, called the secured relevance ‘Rls’ and the potential relevance ‘Rlp’. If Rls > 1, then one might say that the effect is “significantly relevant”.
Another useful measure, meant to replace the p-value, is the “significance” ‘Sg0’. In the simple case, it divides the test statistic by the critical value of the (t-) test. Thus, the statistical test of the null hypothesis of zero expectation is significant if ‘Sg0’ is larger than one, Sg0 > 1.
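The measures described above can be sketched in base R for a single sample. This is only an illustration under stated assumptions: the helper `one_sample_measures()` and the value of `threshold` are hypothetical, and the effect is kept on the raw scale of the data, whereas the package works with a standardized effect.

```r
# Sketch: relevance and significance for a single sample (base R only).
# `one_sample_measures` and `threshold` are hypothetical names for illustration.
one_sample_measures <- function(x, threshold, testlevel = 0.05) {
  n   <- length(x)
  est <- mean(x)                             # estimated effect
  se  <- sd(x) / sqrt(n)                     # standard error of the mean
  q   <- qt(1 - testlevel / 2, df = n - 1)   # critical value of the t test
  ci  <- est + c(-1, 1) * q * se             # confidence interval for the effect
  c(Rle = est / threshold,                   # estimated relevance
    Rls = ci[1] / threshold,                 # secured relevance
    Rlp = ci[2] / threshold,                 # potential relevance
    Sg0 = est / (q * se))                    # test statistic / critical value
}

x <- c(1.2, 0.8, 1.9, 1.4, 0.6, 1.1, 1.7, 1.3)
res <- one_sample_measures(x, threshold = 0.5)
print(round(res, 3))
```

With a positive estimate, Sg0 > 1 holds exactly when the lower confidence limit, and hence Rls, is positive.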
These measures are also calculated for the comparison of two groups, for proportions, and most importantly for regression models. For models with linear predictors, relevances are obtained for standardized coefficients as well as for the effect of dropping terms and the effect on prediction.
The most important functions are
twosamples()
: calculates the measures for two paired or unpaired samples or a simple mean. This function calls inference().
inference()
: calculates the confidence interval and significance based on an estimate and a standard error, and adds relevance for a standardized effect.
termtable()
: deals with fits of regression models with a linear predictor. It calculates confidence intervals and significances for the coefficients of terms with a single degree of freedom. It includes the effect of dropping each term (based on the drop1 function) and the respective significance and relevance measures.
termeffects()
: calculates the relevances for the coefficients related to each term. These differ from the entries of termtable only for terms with more than one degree of freedom.
Werner A. Stahel
Maintainer: Werner A. Stahel <[email protected]>
Stahel, Werner A. (2021). New relevance and significance measures to replace p-values. PLOS ONE 16, e0252991, doi: 10.1371/journal.pone.0252991
Package regr, available from https://regdevelop.r-forge.r-project.org
data(swiss)
rr <- lm(Fertility ~ . , data = swiss)
termtable(rr)
Calculates the arc sine of the square root of x/100, rescaled to lie in the unit interval.
This transformation is useful for analyzing percentages or proportions
of any kind.
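A minimal base R sketch of such a transformation follows. It assumes that the rescaling divides by asin(1) = pi/2; the helper names `asinp_sketch()` and `asinp_inverse_sketch()` are hypothetical and are not the package's functions.

```r
# Sketch of the arc sine transformation for percentages.
# Assumption: the rescaling to the unit interval divides by asin(1) = pi/2.
asinp_sketch <- function(x) asin(sqrt(x / 100)) / asin(1)

# Matching inverse: back from the unit interval to percentages
asinp_inverse_sketch <- function(y) 100 * sin(y * asin(1))^2

p <- c(0, 1, 50, 90, 100)
y <- asinp_sketch(p)
print(round(y, 4))
print(asinp_inverse_sketch(y))   # recovers p up to rounding error
```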
asinp(x)
x |
vector of data values |
vector of transformed values
This very simple function is provided in order to simplify
formulas. It has an attribute "inverse"
that contains
the inverse function, see example.
Werner A. Stahel, ETH Zurich
asinp(seq(0,100,10))
( y <- asinp(c(1,50,90,95,99)) )
attr(asinp, "inverse")(y)
Confidence Interval for the Non-Central F and Chisquare Distribution
confintF(f, df1, df2, testlevel = 0.05)
f |
observed F value(s) |
df1 |
degrees of freedom for the numerator of the F distribution |
df2 |
degrees of freedom for the denominator of the F distribution |
testlevel |
level of the (two-sided) test that determines the confidence interval, 1 - confidence level |
The confidence interval is calculated by solving the two implicit equations pf(f, df1, df2, x) = testlevel/2 and pf(f, df1, df2, x) = 1 - testlevel/2 for the non-centrality x. For f>100, the usual f +- standard error interval is used as a rather crude approximation.
A confidence interval for the non-centrality of the Chisquare distribution is obtained by setting df2 to Inf (the default) and f=x2/df1 if x2 is the observed Chisquare value.
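The root search described above can be sketched with base R's `pf()` and `uniroot()`. The helper `ncp_interval()` and its search bound are assumptions made for illustration; the sketch is not the package's implementation and need not reproduce the scale of the values returned by confintF.

```r
# Sketch: confidence limits for the non-centrality parameter of an F
# distribution, found by solving pf(f, df1, df2, ncp) = target for ncp.
# ncp_interval() is a hypothetical helper, not the package's confintF().
ncp_interval <- function(f, df1, df2, testlevel = 0.05) {
  solve_ncp <- function(target) {
    g <- function(ncp) pf(f, df1, df2, ncp = ncp) - target
    if (g(0) <= 0) return(0)          # pf decreases in ncp: limit is 0
    upper <- 20 * df1 * f + 100       # generous search bound (assumption)
    uniroot(g, c(0, upper), tol = 1e-10)$root
  }
  c(lower = solve_ncp(1 - testlevel / 2),
    upper = solve_ncp(testlevel / 2))
}

ci <- ncp_interval(5, 3, 200)
print(ci)
print(ncp_interval(1, 5, 20))   # small f: the lower limit is 0
```

Because the distribution function decreases as the non-centrality grows, the lower limit is set to 0 whenever even the central distribution does not reach the upper target probability.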
vector of lower and upper limit of the confidence interval,
or, if any of the arguments has length >1
, matrix containing
the intervals as rows.
Werner A. Stahel
confintF(5, 3, 200) ## [1] 2.107 31.95
confintF(1:5, 5, 20) ## lower limit is 0 for the first 3 f values
Inference for a correlation coefficient: Collect quantities, including Relevance and Significance measures
correlation(x, y = NULL, method = c("pearson", "spearman"), hypothesis = 0, testlevel=getOption("testlevel"), rlv.threshold=getOption("rlv.threshold"), ...)
x |
data for the first variable, or matrix or data.frame containing both variables |
y |
data for the second variable |
hypothesis |
the null effect to be tested, and anchor for the relevance |
method |
type of correlation, either "pearson" or "spearman" |
testlevel |
level for the test, also determining the confidence level |
rlv.threshold |
Relevance threshold, or a vector of thresholds
from which the element |
... |
further arguments, ignored |
an object of class
'inference'
, a
vector with components
effect
: correlation, transformed with Fisher's z transformation
ciLow, ciUp
: confidence interval for the effect
Rle, Rls, Rlp
: relevance measures: estimated, secured, potential
Sig0
: significance measure for the test of 0 effect
Sigth
: significance measure for the test of effect == relevance threshold
p.value
: p value for test against 0
In addition, it has attributes
method
: type of correlation
effectname
: label for the effect
hypothesis
: the null effect
n
: number(s) of observations
estimate
: estimated correlation
conf.int
: confidence interval on correlation scale
statistic
: test statistic
data
: data.frame containing the two variables
rlv.threshold
: relevance threshold
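How these components relate to each other can be sketched in base R for the Pearson case. The sketch rests on assumptions that are not taken from the package: the standard error 1/sqrt(n - 3) on Fisher's z scale, normal quantiles, and an illustrative threshold of 0.1. The helper `fisher_z_inference()` is hypothetical.

```r
# Sketch: inference for a Pearson correlation on Fisher's z scale.
# Assumptions: se = 1/sqrt(n - 3), normal quantiles, threshold 0.1 on the z scale.
fisher_z_inference <- function(x, y, testlevel = 0.05, rlv.threshold = 0.1) {
  n  <- length(x)
  r  <- cor(x, y)
  z  <- atanh(r)                         # Fisher's z transformation
  se <- 1 / sqrt(n - 3)
  q  <- qnorm(1 - testlevel / 2)
  ci <- z + c(-1, 1) * q * se
  c(effect = z, ciLow = ci[1], ciUp = ci[2],
    Rle = z / rlv.threshold,             # estimated relevance
    Rls = ci[1] / rlv.threshold,         # secured relevance
    Rlp = ci[2] / rlv.threshold,         # potential relevance
    Sig0 = z / (q * se),                 # test statistic / critical value
    estimate = r)
}

res <- fisher_z_inference(iris[1:50, 1], iris[1:50, 2])
print(round(res, 3))
# Confidence interval transformed back to the correlation scale
print(round(tanh(res[c("ciLow", "ciUp")]), 3))
```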
Werner A. Stahel
see those in relevance-package
.
correlation(iris[1:50,1:2])
Blasting causes tremor in buildings, which can lead to damage. This dataset shows the relation between tremor and distance and charge of blasting.
data("d.blast")
A data frame with 388 observations on the following 7 variables.
date
date in Date format
location
Code for location of the building,
loc1
to loc8
device
Number of measuring device, 1 to 4
distance
Distance between blasting and location of measurement
charge
Charge of blast
tremor
Tremor energy (target variable)
The charge of the blasting should be controlled in order to avoid tremors that exceed a threshold. This dataset can be used to establish the suitable rule: For a given distance, how large can charge be in order to avoid exceedance of the threshold?
Basler and Hoffmann AG, Zurich
data(d.blast)
summary(lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast))
Are answers to questions influenced by providing partial information?
Students were asked to guesstimate the height of Mount Everest. One group was 'anchored' by telling them that it was more than 2000 feet, the other group was told that it was less than 45,500 feet. The hypothesis was that respondents would be influenced by their 'anchor,' such that the first group would produce smaller numbers than the second. The true height is 29,029 feet.
The data is taken from the 'many labs' replication study (see 'source'). The first 20 values from PSU university are used here.
data("d.everest")
A data frame with 20 observations on the following 2 variables.
y
numeric: guesstimates of the height
g
factor with levels low, high: anchoring group
Klein RA, Ratliff KA, Vianello M et al. (2014). Investigating variation in replicability: A "many labs" replication project. Social Psychology. 2014; 45(3):142-152. https://doi.org/10.1027/1864-9335/a000178
data(d.everest)
(rr <- twosamples(log(y)~g, data=d.everest, var.equal=TRUE))
print(rr, show="classical")
pltwosamples(log(y)~g, data=d.everest)
Is a choice influenced by the formulation of the options?
Here is the question: Confronted with a new contagious disease, the government has a choice between action A that would save 200 out of 600 people or action B which would save all 600 with probability 1/3. This was the 'positive' description. The negative one was that either (A) 400 would die or (B) all 600 would die with probability 2/3.
The dataset encompasses the results for Penn State (US) and Tilburg (NL) universities.
data("d.negposChoice")
A data frame with 4 observations on the following 4 variables.
uni
character: university
negpos
character: formulation of the options
A
number of students choosing option A
B
number of students choosing option B
Klein RA, Ratliff KA, Vianello M et al. (2014). Investigating variation in replicability: A "many labs" replication project. Social Psychology. 2014; 45(3):142-152. https://doi.org/10.1027/1864-9335/a000178
data(d.negposChoice)
d1 <- d.negposChoice[d.negposChoice$uni=="PSU",-1]
(r1 <- twosamples(table=d1[,-1]))
d2 <- d.negposChoice[d.negposChoice$uni=="Tilburg",-1]
r2 <- twosamples(table=d2[,-1])
The data of the famous replication study of the Open Science Collaboration published in 2015
data("d.osc15")
d.osc15: The data frame of OSC15, with 100 observations on 149 variables, of which only the most important are described here.
For a description of all variables, see the repository
https://osf.io/jrxtm/
Study.Num
Identification number of the study
EffSize.O, EffSize.R
effect size as defined by OSC15, original paper and replication, respectively
Tst.O, Tst.R
test statistic, original and replication
N.O, N.R
number of observations, original and replication
Data repository https://osf.io/jrxtm/
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science 349, 943-952
data(d.osc15)
## plot effect sizes of replication against original
## row 9 has an erroneous EffSize.R, and there are 4 missing effect sizes
dd <- na.omit(d.osc15[-9,c("EffSize.O","EffSize.R")])
## change sign for negative original effects
dd[dd$EffSize.O<0,] <- -dd[dd$EffSize.O<0,]
plot(dd)
abline(h=0)
A small subset of the data of the famous replication study of the Open Science Collaboration published in 2015, comprising the one sample and paired sample tests, used to illustrate how the success of the replications is determined as defined by Stahel (2022).
data("d.osc15Onesample")
d.osc15
:
row.names
identification number of the study
teststatistico, teststatisticr
test statistic, original paper and replication, respectively
no, nr
number of observations, original and replication
effecto, effectr
effect size as defined by OSC15, original and replication
Data repository https://osf.io/jrxtm/
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science 349, 943-952
data(d.osc15Onesample)
plot(effectr~effecto, data=d.osc15Onesample, xlim=c(0,3.5),ylim=c(0,2.5), xaxs="i", yaxs="i")
abline(0,1)
## Compare confidence intervals between original paper and replication
to <- structure(d.osc15Onesample[,c("effecto","teststatistico","no")], names=c("effect","teststatistic","n"))
tr <- structure(d.osc15Onesample[,c("effectr","teststatisticr","nr")], names=c("effect","teststatistic","n"))
( rr <- replication(to, tr, rlv.threshold=0.1) )
plconfint(rr, refline=c(0,0.1))
plconfint(attr(rr, "estimate"), refline=c(0,0.1))
drop1Wald
calculates tests for single term deletions based on the
covariance matrix of estimated coefficients instead of re-fitting a
reduced model. This helps in cases where re-fitting is not feasible,
inappropriate or costly.
drop1Wald(object, scope=NULL, scale = NULL, test = NULL, k = 2, ...)
object |
a fitted model. |
scope |
a formula giving the terms to be considered for dropping. If 'NULL', 'drop.scope(object)' is obtained |
scale |
an estimate of the residual mean square to be used in computing Cp. Ignored if '0' or 'NULL'. |
test |
see |
k |
the penalty constant in AIC / Cp. |
... |
further arguments, ignored |
The test statistics and Cp and AIC values are calculated on the basis
of the estimated coefficients and their (unscaled) covariance matrix
as provided by the fit object.
The function may be used for all model fitting objects that contain
these two components as $coefficients
and $cov.unscaled
.
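The idea of a Wald test for a single term can be sketched in base R for an ordinary linear model. The helper `wald_drop_term()` is hypothetical and is not the package's drop1Wald(); it uses the scaled covariance matrix from `vcov()`. For an `lm` fit, the resulting F value agrees with the one that `drop1()` obtains by re-fitting.

```r
# Sketch: Wald F test for dropping one term of an lm fit, using only the
# estimated coefficients and their covariance matrix (no re-fitting).
# wald_drop_term() is a hypothetical helper, not the package's drop1Wald().
wald_drop_term <- function(fit, term) {
  assign <- attr(model.matrix(fit), "assign")
  idx <- which(assign == match(term, attr(terms(fit), "term.labels")))
  b <- coef(fit)[idx]                       # coefficients of the term
  V <- vcov(fit)[idx, idx, drop = FALSE]    # their covariance matrix
  Fval <- drop(t(b) %*% solve(V, b)) / length(idx)
  c(F = Fval, df = length(idx),
    p.value = pf(Fval, length(idx), df.residual(fit), lower.tail = FALSE))
}

fit <- lm(Sepal.Length ~ Species + Sepal.Width, data = iris)
print(wald_drop_term(fit, "Species"))
print(drop1(fit, test = "F"))   # F value for Species agrees for lm fits
```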
An object of class 'anova' summarizing the differences in fit between the models.
drop1Wald is used for models of class 'lm' or 'lmrob' for preparing
a termtable
.
Werner A. Stahel
data(d.blast)
r.blast <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
drop1(r.blast)
drop1Wald(r.blast)
## Example from example(glm)
dd <- data.frame(treatment = gl(3,3), outcome = gl(3,1,9), counts = c(18,17,15,20,10,20,25,13,12))
r.glm <- glm(counts ~ outcome + treatment, data = dd, family = poisson())
drop1(r.glm, test="Chisq")
drop1Wald(r.glm)
Allows for dropping observations (rows) determined by row names or factor levels from a data.frame or matrix.
dropdata(data, rowid = NULL, incol = "row.names", colid = NULL)
data |
a data.frame or matrix |
rowid |
vector of character strings identifying the rows to be dropped |
incol |
name or index of the column used to identify the observations (rows) |
colid |
vector of character strings identifying the columns to be dropped |
The data.frame or matrix without the dropped observations and/or variables. Attributes are passed on.
Ordinary subsetting by [...,...] drops attributes. Furthermore, the convenient way to drop rows or columns by giving negative indices to [...,...] cannot be used with names of rows or columns.
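The limitation of negative indices can be seen in a short base R illustration, which uses only ordinary subsetting and none of the package's functions:

```r
# Base R: negative indices work with positions, but not with names
dd <- data.frame(rbind(a = 1:3, b = 4:6, c = 7:9, d = 10:12))

by_position <- dd[-2, ]                                   # drops row "b" by position
by_name <- dd[!(rownames(dd) %in% "b"), , drop = FALSE]   # drops row "b" by name
print(by_name)

# dd[-"b", ] fails: unary minus is not defined for character strings
failed <- inherits(try(dd[-"b", ], silent = TRUE), "try-error")
print(failed)
```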
Werner A. Stahel, ETH Zurich
dd <- data.frame(rbind(a=1:3,b=4:6,c=7:9,d=10:12))
dropdata(dd,"b")
dropdata(dd, col="X3")
d1 <- dropdata(dd,"d")
d2 <- dropdata(d1,"b")
naresid(attr(d2,"na.action"),as.matrix(d2))
dropdata(letters, 3:5)
dropNA
returns the vector 'x', without elements that are NA or NaN
or, if 'inf' is TRUE, equal to Inf or -Inf.
replaceNA
replaces these values by values from the second argument
dropNA(x, inf = TRUE)
replaceNA(x, na, inf = TRUE)
x |
vector from which the non-real values should be dropped or replaced |
na |
replacement or vector from which the replacing values are taken. |
inf |
logical: should 'Inf' and '-Inf' be considered "non-real"? |
For dropNA
: Vector containing the 'real' values
of 'x' only
For replaceNA
: Vector with 'non-real' values replaced by
the respective elements of na
.
The differences to 'na.omit(x)' are: 'Inf' and '-Inf' are also dropped, unless 'inf==FALSE', and no attribute 'na.action' is appended.
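The behaviour described above can be sketched in base R. The helpers `drop_nonreal()` and `replace_nonreal()` are hypothetical stand-ins written for illustration; they are not the package's code.

```r
# Sketch of the described behaviour (hypothetical helper names).
drop_nonreal <- function(x, inf = TRUE) {
  keep <- if (inf) is.finite(x) else !is.na(x)   # is.na() is TRUE for NaN too
  x[keep]
}
replace_nonreal <- function(x, na, inf = TRUE) {
  bad <- if (inf) !is.finite(x) else is.na(x)
  na <- rep_len(na, length(x))    # recycle a single replacement value
  x[bad] <- na[bad]               # take the respective elements of `na`
  x
}

dd <- c(1, NA, 0/0, 4, -1/0, 6)
print(drop_nonreal(dd))
print(drop_nonreal(dd, inf = FALSE))
print(attributes(na.omit(dd)))    # na.omit() appends an "na.action" attribute
print(replace_nonreal(dd, 100 + 1:6))
```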
Werner A. Stahel
dd <- c(1, NA, 0/0, 4, -1/0, 6)
dropNA(dd)
na.omit(dd)
replaceNA(dd, 99)
replaceNA(dd, 100+1:6)
Recodes the NA entries in output by a desired code like " ."
formatNA(x, na.print = " .", digits = getOption("digits"), ...)
x |
object to be printed, usually a numeric vector or data.frame |
na.print |
code to be used for |
digits |
number of digits for formatting numeric values |
... |
other arguments to |
The na.encode
argument of print
only applies to
character objects. formatNA
does the same for numeric arguments.
Should mimic the value of format
Werner A. Stahel
formatNA(c(1,NA,3))
dd <- data.frame(X=c(1,NA,3), Y=c(4,5, NA), g=factor(c("a",NA,"b")))
(rr <- formatNA(dd, na.print="???"))
str(rr)
Retrieve the table of coefficients and standard errors, or the scale parameter, or the factors needed for standardizing coefficients from diverse model fitting results
getcoeftable(object)
getscalepar(object)
getcoeffactor(object, standardize = TRUE)
object |
an R object resulting from a model fitting function |
standardize |
logical: should a scaling factor for
the response variable be determined (calling |
Object regrModelClasses
contains the names of the
classes for which the result should work.
For other model classes, the function is not tested and may fail.
For getcoeftable
:
Matrix with at least two columns: the estimated coefficients (first column) and the standard errors (second column).
For getscalepar
: scale parameter.
For getcoeffactor
: vector of multiplicative factors,
with attributes
scale
, fitclass
and family
or dist
according to object
.
Werner A. Stahel
rr <- lm(Fertility ~ . , data = swiss)
getcoeftable(rr) # identical to coef(summary(rr)) or also summary(rr)$coefficients
getscalepar(rr)
if(requireNamespace("survival", quietly=TRUE)) {
  data(ovarian) ## , package="survival"
  rs <- survival::survreg(survival::Surv(futime, fustat) ~ ecog.ps + rx, data = ovarian, dist = "weibull")
  getcoeftable(rs)
  getcoeffactor(rs)
}
Calculates confidence intervals and relevance and significance values given estimates, standard errors and, for relevance, additional quantities.
inference(object = NULL, estimate = NULL, teststatistic = NULL, se = NA, n = NULL, df = NULL, stcoef = TRUE, rlv = TRUE, rlv.threshold = getOption("rlv.threshold"), testlevel = getOption("testlevel"), ...)
object |
A data.frame containing, as its variables,
the arguments
... or a model fit object |
estimate |
estimate(s) of the parameter(s) |
teststatistic |
test statistic(s) |
se |
standard error(s) of the estimate(s) |
n |
number(s) of observations |
df |
degrees of freedom of the residuals |
stcoef |
standardized coefficients.
If |
rlv |
logical: Should relevances be calculated? |
rlv.threshold |
Relevance threshold(s). May be a simple number for simple inference, or a vector containing the elements
|
testlevel |
1 - confidence level |
... |
further arguments, passed to
|
The estimates divided by standard errors are assumed to be
t-distributed with df
degrees of freedom.
For df==Inf
, this is the standard normal distribution.
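Under this t-distribution assumption, the confidence interval, p value and significance follow directly from the estimates, standard errors and `df`. The base R sketch below reproduces them for an `lm` fit; the column names mirror those listed for the result of inference(), but the code is an illustration and not the package's implementation.

```r
# Sketch: confidence interval, p value and significance from estimate, se and df
fit <- lm(Fertility ~ ., data = swiss)
tab <- coef(summary(fit))
est <- tab[, "Estimate"]
se  <- tab[, "Std. Error"]
df  <- df.residual(fit)

testlevel <- 0.05
q     <- qt(1 - testlevel / 2, df)   # critical value; equals qnorm() for df = Inf
tstat <- est / se

res <- data.frame(effect = est, se = se,
                  ciLow = est - q * se, ciUp = est + q * se,
                  teststatistic = tstat,
                  p.value = 2 * pt(-abs(tstat), df),
                  Sig0 = tstat / q)  # test statistic divided by critical value
print(round(res, 4))
```

An absolute value of Sig0 above 1 corresponds to a p value below the test level.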
A data.frame of class "inference"
, with the variables
effect , se
|
estimated effect(s), often coefficients, and their standard errors |
ciLow , ciUp
|
lower and upper limit of the confidence interval |
teststatistic |
t-test statistic |
p.value |
p value |
Sig0 |
significance value, i.e., test statistic divided by
critical value, which in turn is the |
ciLow , ciUp
|
confidence interval for |
If rlv
is TRUE
,
stcoef |
standardized coefficient |
st.Low , st.Up
|
confidence interval for |
Rle |
estimated relevance of |
Rls |
secured relevance, lower end of confidence interval
for the relevance of |
Rlp |
potential relevance, upper end of confidence interval ... |
Rls.symbol |
symbols for the secured relevance |
Rlvclass |
relevance class |
Werner A. Stahel
Werner A. Stahel (2021). New relevance and significance measures to replace p-values. PLOS ONE 16, e0252991, doi: 10.1371/journal.pone.0252991
twosamples, termtable, termeffects
data(d.blast)
rr <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
inference(rr)
Selects or drops the last element or the last n
elements of a
vector or the last n
rows or ncol
columns of a matrix
last(data, n = NULL, ncol=NULL, drop=is.matrix(data))
data |
vector or matrix or data.frame from which to select or drop |
n |
if >0, |
ncol |
if |
drop |
if only one row or column of a matrix (or one column of a data.frame) is selected or left over, should the result be a vector or a row or column matrix (or one variable data.frame) |
The selected elements of the vector or matrix or data.frame
This is a very simple function. It is defined mainly for selecting from the results of other functions without storing them.
Werner Stahel
x <- runif(rpois(1,10))
last(sort(x), 3)
last(sort(x), -5)
##
df <- data.frame(X=c(2,5,3,8), F=LETTERS[1:4], G=c(TRUE,FALSE,FALSE,TRUE))
last(df,3,-2)
Transforms the data by a log10 transformation, modifying small and zero observations such that the transformation yields finite values.
logst(data, calib=data, threshold=NULL, mult = 1)
data |
a vector or matrix of data, which is to be transformed |
calib |
a vector or matrix of data used to calibrate the
transformation(s),
i.e., to determine the constant |
threshold |
constant c that determines the transformation, possibly a vector with a value for each variable. |
mult |
a tuning constant affecting the transformation of small values, see Details |
Small values are determined by the threshold c. If not given by the argument threshold, then it is determined by the quartiles q1 and q3 of the non-zero data as those smaller than c = q1^(1+r) / q3^r, where r is set by the argument mult.
The rationale is that for lognormal data, this constant identifies
2 percent of the data as small.
Beyond this limit, the transformation continues linear with the
derivative of the log curve at this point. See code for the formula.
The function chooses log10 rather than natural logs because they can be backtransformed relatively easily in the mind.
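A started-log transformation of this kind can be sketched in base R. The helper `started_log10()` is hypothetical, and the threshold rule c = q1^(1 + mult) / q3^mult is an assumption of this sketch; for lognormal data with mult = 1 it lies about two standard deviations below the median on the log scale, in line with the 2 percent rationale. The package's own code remains the reference for the exact formula.

```r
# Sketch of a started-log transformation (hypothetical helper, not logst()).
# Assumption: threshold c = q1^(1 + mult) / q3^mult from the quartiles of the
# positive calibration data.
started_log10 <- function(x, calib = x, threshold = NULL, mult = 1) {
  if (is.null(threshold)) {
    q <- quantile(calib[calib > 0], c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    threshold <- q[1]^(1 + mult) / q[2]^mult
  }
  small <- !is.na(x) & x < threshold
  out <- log10(pmax(x, threshold))   # ordinary log10 at and above the threshold
  # below the threshold: straight line with the slope of log10 at the threshold
  out[small] <- log10(threshold) + (x[small] - threshold) / (threshold * log(10))
  structure(out, threshold = threshold)
}

x <- c(0, 0.05, 0.5, 1, 2, 5, 10, 20, 50)
y <- started_log10(x)
print(round(y, 3))
print(attr(y, "threshold"))
```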
the transformed data. The value c needed for the transformation is
returned as attr(.,"threshold")
.
The name of the function alludes to Tukey's idea of "started logs".
Werner A. Stahel, ETH Zurich
dd <- c(seq(0,1,0.1),5*10^rnorm(100,0,0.2))
dd <- sort(dd)
r.dl <- logst(dd)
plot(dd, r.dl, type="l")
abline(v=attr(r.dl,"threshold"),lty=2)
copy of ovarian from package 'survival'. Will disappear
data("ovarian")
A data frame with 26 observations on the following 6 variables.
futime
a numeric vector
fustat
a numeric vector
age
a numeric vector
resid.ds
a numeric vector
rx
a numeric vector
ecog.ps
a numeric vector
This copy is here since the package was rejected because the checking procedure did not find it in the package
data(ovarian)
summary(ovarian)
Plot confidence or relevance interval(s) for several samples and for the comparison of two samples, also useful for replications and original studies
plconfint(x, y = NULL, select=NULL, overlap = NULL, pos = NULL, xlim = NULL, refline = 0, add = FALSE, bty = "L", col = NULL, plpars = list(lwd=c(2,3,1,4,2), posdiff=0.35, markheight=c(1, 0.6, 0.6), extend=NA, reflinecol="gray70"), label = TRUE, label2 = NULL, xlab="", ...)
pltwosamples(x, ...)
## Default S3 method:
pltwosamples(x, y = NULL, overlap = TRUE, ...)
## S3 method for class 'formula'
pltwosamples(formula, data = NULL, ...)
x |
For
For |
y |
data for a second confidence interval (for |
select |
selects samples, effects, or studies |
overlap |
logical: should shortened intervals be shown to show significance of differences? see Details |
pos |
positions of the bars in vertical direction |
xlim |
limits for the horizontal axis. |
refline |
|
add |
logical: should the plotted elements be added to an existing plot? |
bty |
type of 'box' around the plot, see |
col |
color to be used for the confidence intervals, usually a vector of colors if used. |
plpars |
graphical options, see Details |
label , label2
|
labels for intervals (or interval pairs)
to be displayed on the left and right hand margin, respectively.
If |
xlab |
label for horizontal axis |
formula , data
|
formula and data for the |
... |
further arguments to the call of |
Columns 4 and 5 of x are typically used to indicate an "overlap interval", which allows for a graphical assessment of the significance of the test for zero difference(s), akin to the "notches" in box plots: The difference between a pair of groups is significant if their overlap intervals do not overlap.
For equal standard errors of the groups, the standard error of the difference between two of them is larger by the factor sqrt(2). Therefore, the intervals should be shortened by this factor, or multiplied by 1/sqrt(2), which is the default for overlapfactor.
If only two groups are to be shown, the factor is adjusted to unequal standard errors, and accurate quantiles of a t distribution are used.
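The reasoning for the factor 1/sqrt(2) can be checked numerically. The sketch below assumes equal standard errors and normal quantiles; `check_pair()` is a hypothetical helper and is not part of the package.

```r
# Sketch: with equal standard errors, intervals shortened by 1/sqrt(2) fail to
# overlap exactly when the difference is significant (normal quantiles assumed).
check_pair <- function(est1, est2, se, testlevel = 0.05) {
  q    <- qnorm(1 - testlevel / 2)
  half <- q * se / sqrt(2)                        # shortened half-width
  no_overlap  <- abs(est1 - est2) > 2 * half      # overlap intervals are disjoint
  significant <- abs(est1 - est2) / (se * sqrt(2)) > q   # test of zero difference
  c(no_overlap = no_overlap, significant = significant)
}

print(check_pair(0, 2.5, se = 1))   # intervals overlap, difference not significant
print(check_pair(0, 3.0, se = 1))   # intervals disjoint, difference significant
```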
The graphical options are:
lwd
: line widths for: [1] the interval, [2] middle mark, [3] end marks, [4] overlap interval marks, [5] vertical line marking the relevance threshold
markheight
: determines the length of the middle mark, the end marks and the marks for the overlap interval as a multiplier of the default length
extend
: extension of the vertical axis beyond the range
reflinecol
: color to be used for the vertical lines at relevances 0 and 1
none
Werner A. Stahel
## --- regression
data(swiss)
rr <- lm(Fertility ~ . , data = swiss)
rt <- termtable(rr)
plot(rt)
## --- termeffects
data(d.blast)
rlm <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
rte <- termeffects(rlm)
plot(rte, single=TRUE)
## --- replication
data(d.osc15Onesample)
td <- d.osc15Onesample
tdo <- structure(td[,c(1,2,6)], names=c("effect", "n", "teststatistic"))
tdr <- structure(td[,c(3,4,7)], names=c("effect", "n", "teststatistic"))
rr <- replication(tdo,tdr)
plconfint(attr(rr, "estimate"), refline=c(0,1))
Plot confidence or relevance interval(s) for one or several items
## S3 method for class 'inference'
plot(x, pos = NULL, overlap = FALSE, refline = c(0,1,-1),
     xlab = "relevance", ...)

## S3 method for class 'termeffects'
plot(x, pos = NULL, single=FALSE, overlap = TRUE, termeffects.gap = 0.2,
     refline = c(0, 1, -1), xlim=NULL, ylim=NULL, xlab = "relevance",
     mar=NA, labellength=getOption("labellength"), ...)
x |
a vector or matrix of class |
pos |
positions of the bars in vertical direction |
overlap |
logical: should shortened intervals be shown to show significance of differences? see Details |
refline |
values for vertical reference lines |
single |
logical: should terms with a single degree of freedom be plotted? |
termeffects.gap |
gap between blocks corresponding to terms |
xlim , ylim
|
limits of plotting area, as usual |
xlab |
label for horizontal axis |
mar |
plot margins. If |
labellength |
maximum number of characters for label strings |
... |
further arguments to the call of |
The overlap interval allows for a graphical assessment
of the significance of the test for zero difference(s),
akin to the notches in box plots:
The difference between a pair of groups is significant if their
overlap intervals do not overlap.
For equal standard errors of the groups, the standard error of the
difference between two of them is larger by the factor sqrt(2).
Therefore, the intervals should be shortened by this factor, that is,
multiplied by 1/sqrt(2), which is the default for overlapfactor.
If only two groups are to be shown, the factor is adjusted to unequal
standard errors.
The graphical options are:
lwd: line widths for: [1] the interval, [2] middle mark, [3] end marks, [4] overlap interval marks, [5] vertical line marking the relevance threshold
markheight: determines the length of the middle mark, the end marks and the marks for the overlap interval as a multiplier of the default length
extend: extension of the vertical axis beyond the range
framecol: color to be used for the framing lines: axis and vertical lines at relevances 0 and 1
none
plot.inference displays termtable objects, too,
since they inherit from class inference.
Werner A. Stahel
## --- regression
data(swiss)
rr <- lm(Fertility ~ . , data = swiss)
rt <- termtable(rr)
plot(rt)

## --- termeffects
data(d.blast)
rlm <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
rte <- termeffects(rlm)
plot(rte, single=TRUE)
Print methods for objects of class "inference", "termtable", "termeffects",
or "printInference".
## S3 method for class 'inference'
print(x, show = getOption("show.inference"), print=TRUE,
      digits = getOption("digits.reduced"), transpose.ok = TRUE,
      legend = NULL, na.print = getOption("na.print"), ...)

## S3 method for class 'termtable'
print(x, show = getOption("show.inference"), ...)

## S3 method for class 'termeffects'
print(x, show = getOption("show.inference"), transpose.ok = TRUE,
      single = FALSE, print = TRUE, warn = TRUE, ...)

## S3 method for class 'printInference'
print(x, ...)
x |
object to be printed |
show |
determines items (columns) to be shown |
digits |
number of significant digits to be printed |
transpose.ok |
logical: May a single column be shown as a row? |
single |
logical: Should components with a single coefficient be printed? |
legend |
logical: should the legend(s) for the symbols
characterizing p-values and relevances be printed?
Defaults to |
na.print |
string by which |
print |
logical: if |
warn |
logical: Should the warning be issued if
|
... |
further arguments, passed to |
The value, if assigned to rr, say, can be printed by using
print.printInference, writing print(rr), which is just
what happens internally unless print=FALSE is used.
This allows for editing the result before printing it, see Examples.
printInference objects can be a vector, a data.frame or a
matrix, or a list of such items.
Each item can have an attribute head of mode character that is
printed by cat before the item, and analogously a tail attribute
that is printed after it.
A kind of formatted version of x, with class printInference.
For print.inference, it will be
a character vector or a data.frame with attributes
head and tail if applicable.
For print.termeffects, it will be a list of such elements,
with its own head and tail.
It is invisibly returned.
Werner A. Stahel
twosamples, termtable, termeffects, inference.
data(d.blast)
r.blast <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
rt <- termtable(r.blast)

## print() : first default, then "classical" :
rt
print(rt, show="classical")

class(te <- termeffects(r.blast)) # "termeffects"
rr <- print(te, print=FALSE)
attr(rr, "head") <- sub("lm", "Linear Regression", attr(rr, "head"))
class(rr) # "printInference"
rr # <==> print(rr)
str(rr)
List of options used in the relevance package to select items and formats for printing inference elements
relevance.options
rlv.symbols
p.symbols
The format is:
List of 22
 $ digits.reduced : 3
 $ testlevel : 0.05
 $ rlv.threshold :
     stand   rel  prop  corr  coef  drop  pred
      0.10  0.10  0.10  0.10  0.10  0.10  0.05
 $ termtable : TRUE
 $ show.confint : TRUE
 $ show.doc : TRUE
 $ show.inference : "relevance"
 $ show.simple.relevance : "Rle" "Rlp" "Rls" "Rls.symbol"
 $ show.simple.test : "Sig0" "p.symbol"
 $ show.simple.classical : "statistic" "p.value" "p.symbol"
 $ show.term.relevance : "df" "R2.x" "coefRlp" "coefRls" ...
 $ show.term.test : "df" "ciLow" "ciUp" "R2.x" ...
 $ show.term.classical : "statistic" "df" "ciLow" "ciUp" ...
 $ show.termeff.relevance: "coef" "coefRls.symbol"
 $ show.termeff.test : "coef" "p.symbol"
 $ show.termeff.classical: "coef" "p.symbol"
 $ show.symbollegend : TRUE
 $ na.print : "."
 $ p.symbols : List, see below
 $ rlv.symbols : List, see below

rlv.symbols
List
 $ symbol : " " "." "+" "++" "+++"
 $ cutpoint: -Inf 0 1 2 5 Inf

p.symbols
List
 $ symbol : "***" "**" "*" "." " "
 $ cutpoint: 0 0.001 0.01 0.05 0.1 1
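How a symbol/cutpoint pair of this kind turns a number into a symbol can be sketched with base R's cut(). The helper to_symbol is hypothetical, and treating the intervals as closed on the right is an assumption; the package may handle values that fall exactly on a cutpoint differently.

```r
# Symbol tables copied from the listing above
p_cut   <- c(0, 0.001, 0.01, 0.05, 0.1, 1)
p_sym   <- c("***", "**", "*", ".", " ")
rlv_cut <- c(-Inf, 0, 1, 2, 5, Inf)
rlv_sym <- c(" ", ".", "+", "++", "+++")

# Hypothetical helper: look up the symbol of the interval containing x
# (intervals are (lower, upper]; the lowest one also includes its lower end)
to_symbol <- function(x, cutpoint, symbol) {
  as.character(cut(x, breaks = cutpoint, labels = symbol, include.lowest = TRUE))
}

to_symbol(c(0.0005, 0.03, 0.5), p_cut, p_sym)    # "***" "*"  " "
to_symbol(c(-0.2, 1.5, 7), rlv_cut, rlv_sym)     # " "   "+"  "+++"
```

Each list has one more cutpoint than symbols because the cutpoints delimit the intervals to which the symbols are attached.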
relevance.options
options(relevance.options) ## restores the package's default options
Calculate inference for a replication study and for its comparison with the original
replication(original, replication, testlevel=getOption("testlevel"),
            rlv.threshold=getOption("rlv.threshold") )
original |
list of class |
replication |
the same, for the replication study;
if empty or |
testlevel |
level of statistical tests |
rlv.threshold |
threshold of relevance; if this is a vector, the first element will be used. |
A list of class inference and replication
containing the results of the comparison between the studies
and, as an attribute, the results for the replication.
Werner A. Stahel
Werner A. Stahel (2020). Measuring Significance and Relevance instead of p-values. Submitted; available in the documentation.
data(d.osc15Onesample)
tx <- structure(d.osc15Onesample[,c("effecto","teststatistico","no")],
                names=c("effect","teststatistic","n"))
ty <- structure(d.osc15Onesample[,c("effectr","teststatisticr","nr")],
                names=c("effect","teststatistic","n"))
replication(tx, ty, rlv.threshold=0.1)
Find the class of relevance on the basis of the confidence interval and the relevance threshold
rlvClass(effect, ci=NULL, relevance=NA)
effect |
either a list of class |
ci |
confidence interval for |
relevance |
relevance threshold |
Character string: the relevance class, either
"Rlv" if the effect is statistically proven to be larger than the threshold,
"Amb" if the confidence interval contains the threshold,
"Ngl" if the interval only covers values lower than the threshold, but contains 0, and
"Ctr" if the interval only contains negative values.
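The four rules can be written down as a small decision function. This is a sketch of the rules as stated here, not the code of rlvClass: the function name classify_relevance and its arguments are hypothetical, and an interval lying strictly between 0 and the threshold is assigned to "Ngl" by assumption, since the description does not cover that case.

```r
# Hypothetical helper: classify a confidence interval [ci_low, ci_up]
# relative to a positive relevance threshold
classify_relevance <- function(ci_low, ci_up, threshold) {
  if (ci_low > threshold) return("Rlv")   # whole interval above the threshold
  if (ci_up < 0)          return("Ctr")   # whole interval below zero
  if (ci_up < threshold)  return("Ngl")   # whole interval below the threshold
  "Amb"                                   # interval contains the threshold
}

classify_relevance( 0.7,  3.9, threshold = 0.4)   # "Rlv"
classify_relevance( 0.7,  3.9, threshold = 1)     # "Amb"
classify_relevance(-0.5,  0.3, threshold = 1)     # "Ngl"
classify_relevance(-2.0, -0.5, threshold = 1)     # "Ctr"
```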
Werner A. Stahel
Werner A. Stahel (2021). New relevance and significance measures to replace p-values. PLOS ONE 16, e0252991, doi: 10.1371/journal.pone.0252991
rlvClass(2.3, 1.6, 0.4) ## "Rlv"
rlvClass(2.3, 1.6, 1) ## "Sig"
Find the classes of relevance and of reproducibility.
rplClass(rlvclassd, rlvclassr, rler=NULL)
rlvclassd |
relevance class of the difference between replication and original study |
rlvclassr |
relevance class of the replication's effect estimate |
rler |
estimated relevance of the replication |
Character string: the replication outcome class
Werner A. Stahel
Werner A. Stahel (2020). Measuring Significance and Relevance instead of p-values. Submitted
data(d.osc15Onesample)
tx <- structure(d.osc15Onesample[,c("effecto","teststatistico","no")],
                names=c("effect","teststatistic","n"))
ty <- structure(d.osc15Onesample[,c("effectr","teststatisticr","nr")],
                names=c("effect","teststatistic","n"))
rplClass(tx, ty)
Strings are shortened if they are longer than n
shortenstring(x, n = 50, endstring = "..", endchars = NULL)
x |
a string or a vector of strings |
n |
maximal character length |
endstring |
string(s) to be appended to the shortened strings |
endchars |
number of last characters to be shown at the end of
the abbreviated string. By default, it adjusts to |
Abbreviated string(s)
Werner A. Stahel
shortenstring("abcdefghiklmnop", 8)
shortenstring(c("aaaaaaaaaaaaaaaaaaaaaa","bbbbc",
                "This text is certainly too long, don't you think?"),c(8,3,20))
Shows a part of the data.frame which allows for grasping the nature of the data. The function is typically used to make sure that the data is what was desired and to grasp the nature of the variables in the phase of getting acquainted with the data.
showd(data, first = 3, nrow. = 4, ncol. = NULL, digits=getOption("digits"))
data |
a data.frame, a matrix, or a vector |
first |
the first |
nrow. |
a selection of |
ncol. |
number of columns (variables) to be shown. The first and
last columns will also be included. If |
digits |
number of significant digits used in formatting numbers |
returns invisibly the character vector containing the formatted data
Werner A. Stahel, ETH Zurich
showd(iris)

data(d.blast)
names(d.blast)
## only show 3 columns, including the first and last
showd(d.blast, ncol=3)

showd(cbind(1:100))
Count the missing or non-finite values for each column of a matrix or data.frame
sumNA(object, inf = TRUE)
object |
a vector, matrix, or data.frame |
inf |
if TRUE, Inf and NaN values are counted along with NAs |
numerical vector containing the missing value counts for each column
This is a simple shortcut for apply(is.na(object),2,sum)
or apply(!is.finite(object),2,sum)
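The difference between the two expressions shows up as soon as the data contain Inf. A minimal base R sketch follows; the data frame is converted with as.matrix first, which is a choice made here because is.finite has no method for data frames.

```r
t.d <- data.frame(V1 = c(1, 2, NA, 4),
                  V2 = c(11, 12, 13, Inf),
                  V3 = c(21, NA, 23, Inf))
m <- as.matrix(t.d)

# NA (and NaN) only
n_na <- apply(is.na(m), 2, sum)

# NA, NaN and +/-Inf
n_nonfinite <- apply(!is.finite(m), 2, sum)

rbind(n_na, n_nonfinite)
```

Column V2 contains no NA but one Inf, so it is counted only in the second row.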
Werner A. Stahel, ETH Zurich
t.d <- data.frame(V1=c(1,2,NA,4), V2=c(11,12,13,Inf), V3=c(21,NA,23,Inf))
sumNA(t.d)
A list of all coefficients of a model fit, possibly with respective statistics
termeffects(object, se = 2, df = df.residual(object), rlv = TRUE,
            rlv.threshold = getOption("rlv.threshold"), ...)
object |
a model fit, produced, e.g., by a call to |
se |
logical: Should inference statistics be generated? |
df |
degrees of freedom for t-test |
rlv |
logical: Should relevances be calculated? |
rlv.threshold |
Relevance thresholds, see |
... |
further arguments, passed to |
a list with a component for each term in the model formula.
Each component is a termtable for the coefficients
corresponding to the term.
Werner A. Stahel
dummy.coef, inference, termtable
data(d.blast)
r.blast <- lm(log10(tremor)~location+log10(distance)+log10(charge), data=d.blast)
termeffects(r.blast)
Calculate a table of statistics for (multiple) regression models with a linear predictor
termtable(object, summary = summary(object), testtype = NULL,
          r2x = TRUE, rlv = TRUE, rlv.threshold = getOption("rlv.threshold"),
          testlevel = getOption("testlevel"), ...)

relevance.modelclasses
object |
result of a model fitting function like |
summary |
result of |
testtype |
type of test to be applied for dropping each term in
turn. If |
r2x |
logical: should the collinearity measures “ |
rlv |
logical: Should relevances be calculated? |
rlv.threshold |
Relevance thresholds, vector containing the elements
|
testlevel |
1 - confidence level |
... |
further arguments, ignored |
relevance.modelclasses collects the names of classes of model
fitting results that can be handled by termtable.
If testtype is not specified, it is determined by the class of
object and its attribute family as follows:
"F": or t for objects of class lm, lmrob and glm with families quasibinomial and quasipoisson,
"Chi-squared": for other glms and survreg
data.frame with columns
coef: coefficients for terms with a single degree of freedom
df: degrees of freedom
se: standard error of coef
statistic: value of the test statistic
p.value, p.symbol: p value and symbol for it
Sig0: significance value for the test of coef==0
ciLow, ciUp: confidence interval for coef
stcoef: standardized coefficient (standardized using the standard deviation of the 'error' term, sigma, instead of the response's standard deviation)
st.Low, st.Up: confidence interval for stcoef
R2.x: collinearity measure (1 - 1/vif, where vif is the variance inflation factor)
coefRle: estimated relevance of coef
coefRls: secured relevance, lower end of confidence interval for the relevance of coef
coefRlp: potential relevance, the upper end of the confidence interval.
dropRle, dropRls, dropRlp: analogous values for drop effect
predRle, predRls, predRlp: analogous values for prediction effect
In addition, it has attributes
testtype: as determined by the argument testtype or the class and attributes of object.
fitclass: class and attributes of object.
family, dist: more specifications if applicable
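The relation between the collinearity measure in column R2.x and the variance inflation factor can be verified in base R. The sketch below uses the swiss data from the examples and picks the predictor Education for illustration; it does not call termtable, so whether the package computes R2.x in exactly this way is an assumption.

```r
# Predictors of the model Fertility ~ . in the swiss data
X <- swiss[, c("Agriculture", "Examination", "Education",
               "Catholic", "Infant.Mortality")]

# R^2 from regressing one predictor on all the others
r2 <- summary(lm(Education ~ ., data = X))$r.squared

# Variance inflation factor, and the same value obtained from the
# inverse of the predictors' correlation matrix
vif     <- 1 / (1 - r2)
vif_alt <- solve(cor(X))["Education", "Education"]

all.equal(vif, vif_alt)        # the two computations agree
all.equal(1 - 1 / vif, r2)     # collinearity measure recovered from vif
```

A value of R2.x near 1 therefore corresponds to a large variance inflation factor, that is, to a predictor that is nearly a linear combination of the others.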
Werner A. Stahel
Werner A. Stahel (2020). Measuring Significance and Relevance instead of p-values. Submitted
getcoeftable; for printing options, print.inference
data(swiss)
rr <- lm(Fertility ~ . , data = swiss)
rt <- termtable(rr)
rt
Inference for a difference between two independent samples or for a single sample: Collect quantities for inference, including Relevance and Significance measures
twosamples(x, ...)
onesample(x, ...)

## Default S3 method:
twosamples(x, y = NULL, paired = FALSE, table = NULL, hypothesis = 0,
           var.equal = TRUE, testlevel=getOption("testlevel"),
           log = NULL, standardize = NULL,
           rlv.threshold=getOption("rlv.threshold"), ...)

## S3 method for class 'formula'
twosamples(x, data = NULL, subset, na.action, log = NULL, ...)

## S3 method for class 'table'
twosamples(x, ...)
x |
a formula or the data for the first or the single sample |
y |
data for the second sample |
table |
A |
paired |
logical: In case |
hypothesis |
the null effect to be tested, and anchor for the relevance |
var.equal |
logical: In case of two samples, should the variances be assumed equal? Only applies for quantitative data. |
testlevel |
level for the test, also determining the confidence level |
log |
logical...: Is the target variable on log scale? – or character: either "log" or "log10" (or "logst"). If so, no standardization is applied to it. By default, the function examines the formula to check whether the left hand side of the formula contains a log transformation. |
standardize |
logical: Should the effect be standardized (for quantitative data)? |
rlv.threshold |
Relevance threshold, or a vector of thresholds
from which the element |
For the formula method:
formula |
formula of the form y~x giving the target y and condition x variables. For a one-sample situation, use y~1. |
data |
data from which the variables are obtained |
subset , na.action
|
subset and na.action to be applied to
|
... |
further arguments, ignored |
Argument log: If log10 (or logst from package plgraphics) is used,
rescaling is done (by log(10)) to obtain the correct relevance.
Therefore, log needs to be set appropriately in this case.
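The rescaling by log(10) is the usual change of base between log10 and the natural logarithm. A small base R check follows, with made-up sample values: a difference of means on the log10 scale, multiplied by log(10), equals the difference on the natural log scale.

```r
# Two hypothetical positive samples
a <- c(12, 15, 20, 31)
b <- c(8, 9, 14, 16)

d_log10 <- mean(log10(a)) - mean(log10(b))   # effect on the log10 scale
d_ln    <- mean(log(a))   - mean(log(b))     # effect on the natural log scale

all.equal(d_log10 * log(10), d_ln)           # TRUE
```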
an object of class 'inference', a vector with elements
effect: for quantitative data: estimated difference between expectations of the two samples, or mean in case of a single sample.
For binary data: log odds (for one sample or paired samples) or log odds ratio (for two samples)
se: standard error of effect
teststatistic: test statistic
p.value: p value for test against 0
Sig0: significance measure for the test of 0 effect
ciLow, ciUp: confidence interval for the effect
Rle, Rls, Rlp: relevance measures: estimated, secured, potential
Sigth: significance measure for test of effect == relevance threshold
In addition to the columns/components, it has attributes
type: type of relevance: simple
method: problem and inference method
effectname: label for the effect
hypothesis: the null effect
n: number(s) of observations
estimate: estimated parameter, with standard error or confidence interval, if applicable; in the case of 2 independent samples: their means
teststatistic: test statistic
V: single observation variance
df: degrees of freedom for the t distribution
data: if paired, vector of differences; if single sample, vector of data; if two independent samples, list containing the two samples
rlv.threshold: relevance threshold
onesample and twosamples are identical.
twosamples.table(x,...) just calls twosamples.default(table=x, ...).
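For orientation, the quantities listed above can be approximated by hand in base R for two independent samples with equal variances. The sketch rests on two assumptions that are not spelled out on this page: that the significance measure is the test statistic divided by its critical value, and that the relevance measures are the standardized effect and its confidence limits divided by the threshold (the "stand" entry of rlv.threshold, 0.1). It does not call twosamples, and the package's results may differ in sign convention or detail.

```r
# Two independent samples from the built-in 'sleep' data
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]
testlevel <- 0.05
tt <- t.test(y, x, var.equal = TRUE, conf.level = 1 - testlevel)

# Assumed definition: significance = test statistic / critical value
q    <- qt(1 - testlevel / 2, df = unname(tt$parameter))
sig0 <- unname(tt$statistic) / q

# Assumed definition: relevance = standardized effect / threshold
n1 <- length(x); n2 <- length(y)
s_pooled  <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
threshold <- 0.1                   # the "stand" entry of rlv.threshold
effect    <- mean(y) - mean(x)
rle <- effect / s_pooled / threshold
rls <- tt$conf.int[1] / s_pooled / threshold
rlp <- tt$conf.int[2] / s_pooled / threshold

round(c(Sig0 = sig0, Rle = rle, Rls = rls, Rlp = rlp), 2)
```

Under these definitions, an absolute significance above 1 is equivalent to a p value below the test level, and a secured relevance Rls above 1 means that the whole confidence interval lies above the threshold.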
Werner A. Stahel
see those in relevance-package.
t.test, binom.test, fisher.test, mcnemar.test
data(sleep)
t.test(sleep[sleep$group == 1, "extra"], sleep[sleep$group == 2, "extra"])
twosamples(sleep[sleep$group == 1, "extra"], sleep[sleep$group == 2, "extra"])

## Two-sample test, wilcox.test example, Hollander & Wolfe (1973), 69f.
## Permeability constants of the human chorioamnion (a placental membrane)
## at term and between 12 to 26 weeks gestational age
d.permeabililty <-
  data.frame(perm = c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46,
                      1.15, 0.88, 0.90, 0.74, 1.21),
             atterm = rep(1:0, c(10,5)) )
t.test(perm~atterm, data=d.permeabililty)
twosamples(perm~atterm, data=d.permeabililty)

## one sample
onesample(sleep[sleep$group == 2, "extra"])

## plot two samples
pltwosamples(extra ~ group, data=sleep)