Title: | Parallel Use of Statistical Packages in Teaching |
---|---|
Description: | When teaching statistics, it can often be desirable to uncouple the content from specific software packages. To ease such efforts, the Rosetta Stats website (<https://rosettastats.com>) allows comparing analyses in different packages. This package is the companion to the Rosetta Stats website, aiming to provide functions that produce output that is similar to output from other statistical packages, thereby facilitating 'software-agnostic' teaching of statistics. |
Authors: | Gjalt-Jorn Peters [aut, cre] , Peter Verboon [aut, ctb] , Ron Pat-El [ctb] , Melissa Gordon Wolf [ctb] |
Maintainer: | Gjalt-Jorn Peters <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.3.12 |
Built: | 2024-12-04 07:30:40 UTC |
Source: | CRAN |
Builds a model for moderated mediation analysis using SEM
buildModMedSemModel( xvar, mvars, yvar, xmmod = NULL, mymod = NULL, cmvars = NULL, cyvars = NULL )
xvar |
independent variable (predictor) |
mvars |
vector of names of mediators |
yvar |
dependent variable |
xmmod |
moderator of a path(s) |
mymod |
moderator of b path(s) |
cmvars |
covariates for predicting the mediators |
cyvars |
covariates for predicting the dependent variable |
lavaan model to be used in moderatedMediationSem
model <- buildModMedSemModel(xvar = "procJustice", mvars = c("cynicism"), yvar = "CPB", xmmod = "insecure", mymod = "gender", cmvars = c("age"))
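For readers unfamiliar with lavaan model syntax, the following is a minimal sketch of what a model string for a simple mediation (one predictor, one mediator, one outcome) can look like. The helper name simple_mediation_syntax and the path labels a1, b1 and cp are hypothetical; the string actually returned by buildModMedSemModel, which also handles moderators and covariates, may well be structured differently.

```r
# Hypothetical helper (not part of rosetta): builds lavaan syntax
# for a plain x -> m -> y mediation without moderators or covariates.
simple_mediation_syntax <- function(xvar, mvar, yvar) {
  paste0(
    mvar, " ~ a1*", xvar, "\n",                  # a path: predictor -> mediator
    yvar, " ~ b1*", mvar, " + cp*", xvar, "\n",  # b path plus direct effect
    "ind := a1*b1\n"                             # indirect effect as defined parameter
  )
}

syntax <- simple_mediation_syntax("procJustice", "cynicism", "CPB")
cat(syntax)
```

The labels in front of each predictor are what allow the indirect effect to be defined as a product of paths.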
The cat0 function is to cat what paste0 is to paste; it simply makes concatenating many strings without a separator easier.
cat0(..., sep = "")
... |
The character vector(s) to print; passed to cat. |
sep |
The separator to pass to cat ("" by default). |
Nothing (invisible NULL, like cat).
cat0("The first variable is '", names(mtcars)[1], "'.");
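The relation between cat0 and cat can be sketched in a few lines of base R. The name cat0_sketch is hypothetical and this is only an illustration of the idea, not the package's source code.

```r
# Sketch: cat() with an empty default separator, mirroring how
# paste0() relates to paste().
cat0_sketch <- function(..., sep = "") {
  cat(..., sep = sep)
}

cat0_sketch("The first variable is '", names(mtcars)[1], "'.\n")
```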
This function is vectorized.
confIntSD(x, n = NULL, conf.level = 0.95)
x |
Either a standard deviation, in which case |
n |
The sample size is |
conf.level |
The confidence level |
A vector or matrix.
rosetta::confIntSD(mtcars$mpg); rosetta::confIntSD(c(6, 7), c(32, 32));
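As background, a confidence interval for a standard deviation is commonly based on the chi-square distribution. The sketch below implements that textbook interval in a vectorized way; it assumes normally distributed data, the name sd_ci_sketch is hypothetical, and whether confIntSD uses exactly this method is an assumption.

```r
# Sketch of the textbook chi-square interval for a standard deviation.
# sd_value and n may be vectors of equal length.
sd_ci_sketch <- function(sd_value, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  df <- n - 1
  cbind(
    ci.lo = sqrt(df * sd_value^2 / qchisq(1 - alpha / 2, df)),
    ci.hi = sqrt(df * sd_value^2 / qchisq(alpha / 2, df))
  )
}

# Two standard deviations, each from a sample of 32
sd_ci_sketch(c(6, 7), c(32, 32))
```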
The data are about the attitudes of employees of an organisation that is in the middle of a reorganization. The model predicts that feelings of procedural injustice may lead to cynicism and less trust in the management. This relation may be stronger among employees who are insecure about their job continuation. Cynicism may lead to contra-productive behaviour (CPB). However, strong personal norms may prevent CPB. Cynicism is expected to increase with age, and men may be more inclined towards CPB than women.
cpbExample
A data frame with 320 rows and 8 variables:
gender participant
age participant
procedural justice
trust in management
cynicism about the management
contra-productive behaviour
insecure about job continuation
personal norms about CPB
This function produces a cross table, computes Chi Square, and computes the point estimate and confidence interval for Cramer's V.
crossTab(x, y = NULL, conf.level = 0.95, digits = 2, pValueDigits = 3, ...)
## S3 method for class 'crossTab'
print(x, digits = x$input$digits, pValueDigits = x$input$pValueDigits, ...)
## S3 method for class 'crossTab'
pander(x, digits = x$input$digits, pValueDigits = x$input$pValueDigits, ...)
x |
Either a crosstable to analyse, or one of two vectors to use to generate that crosstable. The vector should be a factor (i.e. a categorical variable identified as such by the 'factor' class). |
y |
If x is a crosstable, y can (and should) be empty. If x is a vector, y must also be a vector. |
conf.level |
Level of confidence for the confidence interval. |
digits |
Minimum number of digits after the decimal point to show in the result. |
pValueDigits |
Minimum number of digits after the decimal point to show in the Chi Square p value in the result. |
... |
Extra arguments to |
The results of ufs::confIntV(), but also prints the cross table and the chi square test results.
crossTab(infert$education, infert$induced, samples=50);
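To make the three ingredients (cross table, chi-square test, Cramer's V) concrete, here is a base R sketch that computes the point estimate by hand. It does not compute the confidence interval, and it is not necessarily how crossTab or ufs::confIntV() arrive at their results.

```r
# Build the cross table from two variables in the built-in infert dataset
tab <- table(infert$education, infert$induced)

# Pearson chi-square without continuity correction; warnings about small
# expected counts are suppressed for this illustration
chi <- suppressWarnings(chisq.test(tab, correct = FALSE))

# Cramer's V = sqrt(chi^2 / (N * (min(rows, columns) - 1)))
n <- sum(tab)
cramers_v <- sqrt(unname(chi$statistic) / (n * (min(dim(tab)) - 1)))

print(tab)
print(chi)
cramers_v
```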
This function provides a number of descriptives about your data, similar to what SPSS's DESCRIPTIVES (often called with DESCR) does.
descr(
  x,
  items = names(x),
  varLabels = NULL,
  mean = TRUE, meanCI = TRUE, median = TRUE, mode = TRUE,
  var = TRUE, sd = TRUE, se = FALSE,
  min = TRUE, max = TRUE, q1 = FALSE, q3 = FALSE, IQR = FALSE,
  skewness = TRUE, kurtosis = TRUE, dip = TRUE,
  totalN = TRUE, missingN = TRUE, validN = TRUE,
  histogram = FALSE, boxplot = FALSE,
  digits = 2,
  errorOnFactor = FALSE, convertFactor = FALSE,
  maxModes = 1, maxPlotCols = 4,
  t = FALSE,
  headingLevel = 3,
  conf.level = 0.95,
  quantileType = 2
)
rosettaDescr_partial(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
## S3 method for class 'rosettaDescr'
knit_print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
## S3 method for class 'rosettaDescr'
print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  maxPlotCols = attr(x, "maxPlotCols"),
  headingLevel = attr(x, "headingLevel"),
  forceKnitrOutput = FALSE,
  ...
)
x |
The object to print (i.e. as produced by |
items |
Optionally, if |
varLabels |
Optionally, a named vector with 'pretty labels' to show
for the variables. This has to be a vector of the same length as |
mean , meanCI , median , mode
|
Whether to compute the mean, its
confidence interval, the median, and/or the mode (all logical, so |
var , sd , se
|
Whether to compute the variance, standard deviation, and
standard error (all logical, so |
min , max , q1 , q3 , IQR
|
Whether to compute the minimum, maximum, first and
third quartile, and inter-quartile range (all logical, so |
skewness , kurtosis , dip
|
Whether to compute the skewness, kurtosis and
dip test (all logical, so |
totalN , missingN , validN
|
Whether to show the total sample size, the
number of missing values, and the number of valid (i.e. non-missing) values
(all logical, so |
histogram , boxplot
|
Whether to show a histogram and/or boxplot |
digits |
The number of digits to round the results to when showing them. |
errorOnFactor , convertFactor
|
If |
maxModes |
Maximum number of modes to display: displays "multi" if more than this number of modes is found. |
maxPlotCols |
The maximum number of columns when plotting multiple histograms and/or boxplots. |
t |
Whether to transpose the dataframes when printing them to the screen (this is easier for users relying on screen readers). Note: this functionality has not yet been implemented! |
headingLevel |
The number of hashes to print in front of the headings when printing while knitting |
conf.level |
Confidence of confidence interval around the mean in the central tendency measures. |
quantileType |
The type of quantiles to be used to compute the
interquartile range (IQR). See |
show |
A vector of elements to show in the results, based on the
arguments that activate/deactivate the descriptives (from |
echoPartial |
Whether to show the executed code in the R Markdown
partial ( |
partialFile |
This can be used to specify a custom partial file. The
file will have object |
quiet |
Passed on to |
... |
Any additional arguments are passed to the default print method
by the print method, and to |
forceKnitrOutput |
Force knitr output. |
Note that R (of course) has many similar functions, such as summary and psych::describe() in the excellent psych package.
The Hartigans' Dip Test may be unfamiliar to users; it is a measure of uni- vs. multimodality, computed by the dip.test() function from the {diptest} package. Depending on the sample size, values over .025 can be seen as mildly indicative of multimodality, while values over .05 probably warrant closer inspection (the p-value can be obtained using that dip.test() function from {diptest}; also see Table 1 of Hartigan & Hartigan (1985) for an indication as to critical values).
A list of dataframes with the requested values.
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters [email protected]
Hartigan, J. A.; Hartigan, P. M. The Dip Test of Unimodality. Ann. Statist. 13 (1985), no. 1, 70–84. doi:10.1214/aos/1176346577. https://projecteuclid.org/euclid.aos/1176346577.
summary, psych::describe()
### Simplest example with default settings
descr(mtcars$mpg);
### Also requesting a histogram and boxplot
descr(mtcars$mpg, histogram=TRUE, boxplot=TRUE);
### To show the output as Rmd Partial in the viewer
rosetta::rosettaDescr_partial( rosetta::descr( mtcars$mpg ) );
### Multiple variables, including one factor
rosetta::rosettaDescr_partial( rosetta::descr( iris ) );
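Some of the reported quantities are easy to reproduce in base R, which can help when comparing output across packages. The sketch below computes the three sample-size counts and the quartiles with quantile type 2 (the default value of quantileType in the usage above); the added missing value is only there for illustration, and the exact computations inside descr are not shown here.

```r
x <- c(mtcars$mpg, NA)   # add one missing value for illustration

totalN   <- length(x)        # all observations, including missing ones
missingN <- sum(is.na(x))    # number of missing values
validN   <- totalN - missingN

# Quartiles and interquartile range using quantile type 2
quartiles <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE,
                      type = 2, names = FALSE)
iqr <- IQR(x, na.rm = TRUE, type = 2)

c(totalN = totalN, missingN = missingN, validN = validN,
  q1 = quartiles[1], q3 = quartiles[2], IQR = iqr)
```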
Descriptives with confidence intervals
descriptiveCIs( data, items = NULL, itemLabels = NULL, conf.level = 0.95, digits = 2 )
## S3 method for class 'rosettaDescriptiveCIs'
print(x, digits = attr(x, "digits"), forceKnitrOutput = FALSE, ...)
data |
The data frame holding the data, or a vector. |
items |
If supplying a data frame as |
itemLabels |
Optionally, labels to use for the items (optionally, named,
with the names corresponding to the |
conf.level |
The confidence level of the confidence intervals. |
digits |
The number of digits to round the output to. |
x |
The object to print (i.e. the object returned by |
forceKnitrOutput |
Whether to force |
... |
Any additional arguments are passed on to |
A data frame with class rosettaDescriptiveCIs prepended to allow printing neatly while knitting to Markdown.
descriptiveCIs(mtcars);
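For reference, the usual t-based confidence interval for a mean can be computed by hand as sketched below. The helper name mean_ci_sketch is hypothetical, and the assumption that descriptiveCIs reports this same interval for the mean is not confirmed by the text above.

```r
# Sketch: t-based confidence interval for the mean of one variable
mean_ci_sketch <- function(x, conf.level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  half_width <- qt(1 - (1 - conf.level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(lo = mean(x) - half_width, mean = mean(x), hi = mean(x) + half_width)
}

# One row per variable, columns lo / mean / hi
ci_table <- t(sapply(mtcars[, c("mpg", "hp", "wt")], mean_ci_sketch))
round(ci_table, 2)
```

The bounds agree with those reported by t.test() for the same variable, which is a convenient cross-check.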
The dlvPlot function produces a dot-violin-line plot, and dlvTheme is the default theme.
dlvTheme(base_size = 11, base_family = "", ...)
dlvPlot(
  dat,
  x = NULL,
  y,
  z = NULL,
  conf.level = 0.95,
  jitter = "FALSE",
  binnedDots = TRUE,
  binwidth = NULL,
  error = "lines",
  dotsize = "density",
  singleColor = "black",
  comparisonColors = rosetta::opts$get("dlvPlotCompCols"),
  densityDotBaseSize = 3,
  normalDotBaseSize = 1,
  violinAlpha = 0.2,
  dotAlpha = 0.4,
  lineAlpha = 1,
  connectingLineAlpha = 1,
  meanDotSize = 5,
  posDodge = 0.2,
  errorType = "both",
  outputFile = NULL,
  outputWidth = 10,
  outputHeight = 10,
  ggsaveParams = list(units = "cm", dpi = 300, type = "cairo")
)
## S3 method for class 'dlvPlot'
print(x, ...)
base_size , base_family , ...
|
Passed on to the ggplot theme_grey() function. |
dat |
The dataframe containing x, y and z. |
x |
Character value with the name of the predictor ('independent') variable, must refer to a categorical variable (i.e. a factor). |
y |
Character value with the name of the criterion ('dependent') variable, must refer to a continuous variable (i.e. a numeric vector). |
z |
Character value with the name of the moderator variable, must refer to a categorical variable (i.e. a factor). |
conf.level |
Confidence of confidence intervals. |
jitter |
Logical value (i.e. TRUE or FALSE) whether or not to jitter individual datapoints. Note that jitter cannot be combined with posDodge (see below). |
binnedDots |
Logical value indicating whether to use binning to display the dots. Overrides jitter and dotsize. |
binwidth |
Numeric value indicating how broadly to bin (larger values mean more binning, i.e. combining more dots into one big dot). |
error |
Character value: "none", "lines" or "whiskers"; indicates whether to show the confidence interval as lines with (whiskers) or without (lines) horizontal whiskers or not at all (none) |
dotsize |
Character value: "density" or "normal"; when "density", the size of each dot corresponds to the density of the distribution at that point. |
singleColor |
The color to use when drawing one or more univariate
distributions (i.e. when no |
comparisonColors |
The colors to use when a |
densityDotBaseSize |
Numeric value indicating base size of dots when their size corresponds to the density (bigger = larger dots). |
normalDotBaseSize |
Numeric value indicating base size of dots when their size is fixed (bigger = larger dots). |
violinAlpha |
Numeric value indicating alpha value of violin layer (0 = completely transparent, 1 = completely opaque). |
dotAlpha |
Numeric value indicating alpha value of dot layer (0 = completely transparent, 1 = completely opaque). |
lineAlpha |
Numeric value indicating alpha value of the confidence interval line layer (0 = completely transparent, 1 = completely opaque). |
connectingLineAlpha |
Numeric value indicating alpha value of the layer with the lines connecting the means (0 = completely transparent, 1 = completely opaque). |
meanDotSize |
Numeric value indicating the size of the dot used to indicate the mean in the line layer. |
posDodge |
Numeric value indicating the distance to dodge positions (0 for complete overlap). |
errorType |
If the error is shown using lines, this argument indicates
whether the errorbars should show the confidence interval
( |
outputFile |
A file to which to save the plot. |
outputWidth , outputHeight
|
Width and height of saved plot (specified in
centimeters by default, see |
ggsaveParams |
Parameters to pass to ggsave when saving the plot. |
This function creates Dot Violin Line plots. One image says more than a thousand words; I suggest you run the example :-)
The behavior of this function depends on the arguments.
If no x and z are provided and y is a character value, dlvPlot produces a univariate plot for the numerical y variable.
If no x and z are provided, and y is a character vector, dlvPlot produces multiple univariate plots, with variable names determining the categories on the x-axis and with the numerical y variables on the y-axis.
If both x and y are a character value, and no z is provided, dlvPlot produces a bivariate plot where factor x determines the categories on the x-axis with numerical variable y on the y-axis (roughly a line plot with a single line).
Finally, if x, y and z are each a character value, dlvPlot produces a multivariate plot where factor x determines the categories on the x-axis, factor z determines the different lines, and the numerical y variable is on the y-axis.
An object is returned with the following elements:
dat.raw |
Raw datafile provided when calling dlvPlot |
dat |
Transformed (long) datafile dlvPlot uses |
descr |
Dataframe with extracted descriptives used to plot the mean and confidence intervals |
yRange |
The range of the Y variable used to construct the plot |
plot |
The plot itself |
### Note: the 'not run' is simply because running takes a lot of time,
### but these examples are all safe to run!
## Not run:
### Create simple dataset
dat <- data.frame(x1 = factor(rep(c(0,1), 20)), x2 = factor(c(rep(0, 20), rep(1, 20))), y=rep(c(4,5), 20) + rnorm(40));
### Generate a simple dlvPlot of y
dlvPlot(dat, y='y');
### Now add a predictor
dlvPlot(dat, x='x1', y='y');
### And finally also a moderator:
dlvPlot(dat, x='x1', y='y', z='x2');
### The number of datapoints might be a bit clearer if we jitter
dlvPlot(dat, x='x1', y='y', z='x2', jitter=TRUE);
### Although just dodging the density-sized dots might work better
dlvPlot(dat, x='x1', y='y', z='x2', posDodge=.3);
## End(Not run)
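The line layer of such a plot is driven by per-group means and confidence intervals (the descr element of the returned object). The sketch below shows one way to compute such a table in base R for the bivariate case; the helper name group_ci, the column names and the use of a t-based interval are assumptions, not the package's actual implementation.

```r
set.seed(1)  # make the simulated data reproducible
dat <- data.frame(x1 = factor(rep(c(0, 1), 20)),
                  y  = rep(c(4, 5), 20) + rnorm(40))

# Mean and t-based confidence interval for one group of values
group_ci <- function(values, conf.level = 0.95) {
  n <- length(values)
  half <- qt(1 - (1 - conf.level) / 2, df = n - 1) * sd(values) / sqrt(n)
  c(mean = mean(values), ci.lo = mean(values) - half, ci.hi = mean(values) + half)
}

# One row per level of x1
descr_sketch <- do.call(rbind, lapply(split(dat$y, dat$x1), group_ci))
descr_sketch
```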
These functions are among the many R functions that enable users to assess variable descriptives. They have been developed to mimic SPSS' 'EXAMINE' syntax command ('Explore' in the menu) as closely as possible, to ease the transition for new R users and to facilitate teaching courses where both programs are taught alongside each other.
examine( ..., stem = TRUE, plots = TRUE, extremeValues = 5, qqCI = TRUE, conf.level = 0.95 )
## S3 method for class 'examine'
print(x, ...)
## S3 method for class 'examine'
pander( x, headerPrefix = "", headerStyle = "**", secondaryHeaderPrefix = "", secondaryHeaderStyle = "*", ... )
examineBy( ..., by = NULL, stem = TRUE, plots = TRUE, extremeValues = 5, qqCI = TRUE, conf.level = 0.95 )
## S3 method for class 'examineBy'
print(x, ...)
## S3 method for class 'examineBy'
pander( x, headerPrefix = "", headerStyle = "**", secondaryHeaderPrefix = "", secondaryHeaderStyle = "*", tertairyHeaderPrefix = "--> ", tertairyHeaderStyle = "", separator = paste0("\n\n", repStr("-", 10), "\n\n"), ... )
... |
The first argument is a list of variables to provide descriptives for. Because these are the first arguments, the other arguments must be named explicitly so R does not confuse them for something that should be part of the dots. |
stem |
Whether to display a stem and leaf plot. |
plots |
Whether to display the plots generated by the
|
extremeValues |
How many extreme values to show at either end (the highest and lowest values). When set to FALSE (or 0), no extreme values are shown. |
qqCI |
Whether to display confidence intervals in the QQ-plot. |
conf.level |
The level of confidence of the confidence interval. |
x |
The object to print or pander. |
headerPrefix , secondaryHeaderPrefix , tertairyHeaderPrefix
|
Prefixes for the primary, secondary, and tertiary headers |
headerStyle , secondaryHeaderStyle , tertairyHeaderStyle
|
Characters to surround the primary, secondary, and tertiary headers with |
by |
A variable by which to split the dataset before calling
|
separator |
Separator for the result blocks. |
This function basically just calls the descr function, optionally supplemented with calls to stem, ufs::dataShape().
A list that is displayed when printed.
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters [email protected]
### Look at the miles per gallon descriptives:
rosetta::examine(mtcars$mpg, stem=FALSE, plots=FALSE);
### Separate for the different number of cylinders:
rosetta::examineBy( mtcars$mpg, by=mtcars$cyl, stem=FALSE, plots=FALSE, extremeValues=FALSE );
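The extremeValues argument asks for the highest and lowest values at either end of the distribution. A base R sketch of that idea, including a split by a grouping variable as in examineBy, could look as follows; the helper name extreme_values_sketch is hypothetical and the package may present these values differently.

```r
# Sketch: the n lowest and n highest values of a numeric vector
extreme_values_sketch <- function(x, n = 5) {
  sorted <- sort(x)   # sort() drops missing values by default
  list(lowest = head(sorted, n), highest = rev(tail(sorted, n)))
}

# For all cars
extreme_values_sketch(mtcars$mpg)

# Separately for each number of cylinders
lapply(split(mtcars$mpg, mtcars$cyl), extreme_values_sketch, n = 3)
```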
Basic functions to make working with R easier for SPSS users: getData and getDat provide an easy way to load SPSS datafiles, and exportToSPSS to write to a datafile and syntax file that SPSS can import; filterBy and useAll allow easy temporary filtering of rows from the dataframe; mediaan and modus compute the median and mode of ordinal or numeric data.
exportToSPSS( dat, savfile = NULL, datafile = NULL, codefile = NULL, fileEncoding = "UTF-8", newLinesInString = " |n| " )
filterBy( dat, expression, replaceOriginalDataframe = TRUE, envir = parent.frame() )
getData( filename = NULL, file = NULL, errorMessage = "[defaultErrorMessage]", applyRioLabels = TRUE, use.value.labels = FALSE, to.data.frame = TRUE, stringsAsFactors = FALSE, silent = FALSE, ... )
getDat(..., dfName = "dat", backup = TRUE)
mediaan(vector)
modus(vector)
useAll(dat, replaceFilteredDataframe = TRUE)
dat |
Dataframe to process: for filterBy, dataframe to filter rows from; for useAll, dataframe to restore ('unfilter'). |
savfile |
The name of the SPSS format .sav file (alternative for writing a datafile and a codefile). |
datafile |
The name of the data file, a comma separated values file that can be read into SPSS by using the code file. |
codefile |
The name of the code file, the SPSS syntax file that can be used to import the data file. |
fileEncoding |
The encoding to use to write the files. |
newLinesInString |
A string to replace newlines with (SPSS has problems reading newlines). |
expression |
Logical expression determining which rows to keep and which to drop. Can be either a logical vector or a string which is then evaluated. If it's a string, it's evaluated using 'with' to evaluate the expression using the variable names. |
replaceOriginalDataframe |
Whether to also replace the original dataframe in the parent environment. Very messy, but for maximum compatibility with the 'SPSS way of doing things', by default, this is true. After all, people who care about the messiness/inappropriateness of this function wouldn't be using it in the first place :-) |
envir |
The environment where to create the 'backup' of the unfiltered dataframe, for when useAll is called and the filter is deactivated again. |
filename , file
|
It is possible to specify a path and filename to load
here. If not specified, the default R file selection dialogue is shown.
|
errorMessage |
The error message that is shown if the file does not exist or does not have the right extension; "[defaultErrorMessage]" is replaced with a default error message (and can be included in longer messages). |
applyRioLabels |
Whether to apply the labels supplied by Rio. This will make variables that have value labels into factors. |
use.value.labels |
Only useful when reading from SPSS files: whether to read variables with value labels as factors (TRUE) or numeric vectors (FALSE). |
to.data.frame |
Only useful when reading from SPSS files: whether to return a dataframe or not. |
stringsAsFactors |
Whether to read strings as strings (FALSE) or factors (TRUE). |
silent |
Whether to suppress potentially useful information. |
... |
Additional options, passed on to the function used to import the data (which depends on the extension of the file). |
dfName |
The name of the dataframe to create in the parent environment. |
backup |
Whether to backup an object with name |
vector |
For mediaan and modus, the vector for which to find the median or mode. |
replaceFilteredDataframe |
Whether to replace the filtered dataframe passed in the 'dat' argument (see replaceOriginalDataframe). |
getData returns the imported dataframe, with the filename from which it was read stored in the 'filename' attribute.
getDat is a simple wrapper for getData() which creates a dataframe in the parent environment, by default with the name 'dat'. Therefore, calling getDat() in the console will allow the user to select a file, and the data from the file will then be read and be available as 'dat'. If an object with dfName (i.e. 'dat' by default) already exists, it will be backed up with a warning. getDat() therefore returns nothing.
mediaan returns the median, or, in the case of a factor where the median is in between two categories, both categories.
modus returns the mode.
getData() currently can't read from LibreOffice or OpenOffice files. There doesn't seem to be a platform-independent package that allows this. Non-CRAN package ROpenOffice from OmegaHat should be able to do the trick, but fails to install (manual download and installation using http://www.omegahat.org produces "ERROR: dependency 'Rcompression' is not available for package 'ROpenOffice'" - and manual download and installation of RCompression produces "Please define LIB_ZLIB; ERROR: configuration failed for package 'Rcompression'"). If you have any suggestions, please let me know!
## Not run:
### Open a dialogue to read an SPSS file
getData();
## End(Not run)
### Get a median and a mode
mediaan(c(1,2,2,3,4,4,5,6,6,6,7));
modus(c(1,2,2,3,4,4,5,6,6,6,7));
### Create an example dataframe
(exampleDat <- data.frame(x=rep(8, 8), y=rep(c(0,1), each=4)));
### Filter it, replacing the original dataframe
(filterBy(exampleDat, "y=0"));
### Restore the old dataframe
(useAll(exampleDat));
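For comparison, the median, the mode and a row filter can also be obtained with plain base R, as sketched below. The helper name mode_sketch is hypothetical; the filter mirrors the description of evaluating a string expression using 'with', but unlike filterBy it neither replaces the original dataframe nor keeps a backup for useAll.

```r
values <- c(1, 2, 2, 3, 4, 4, 5, 6, 6, 6, 7)

# Median of numeric data
median_value <- median(values)

# Mode: the value(s) with the highest frequency
mode_sketch <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}
mode_value <- mode_sketch(values)

# Filtering rows with a logical expression supplied as a string
exampleDat <- data.frame(x = rep(8, 8), y = rep(c(0, 1), each = 4))
keep <- with(exampleDat, eval(parse(text = "y == 0")))
filtered <- exampleDat[keep, ]

median_value
mode_value
filtered
```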
This is a wrapper for the psych functions psych::pca() and psych::fa() to produce output that is similar to the output produced by jamovi.
factorAnalysis(
  data,
  nfactors,
  items = names(data),
  rotate = "oblimin",
  covar = FALSE,
  na.rm = TRUE,
  kaiser = 1,
  loadings = TRUE,
  summary = FALSE,
  correlations = FALSE,
  modelFit = FALSE,
  eigenValues = FALSE,
  screePlot = FALSE,
  residuals = FALSE,
  itemLabels = items,
  colorLoadings = FALSE,
  fm = "minres",
  digits = 2,
  headingLevel = 3,
  ...
)
principalComponentAnalysis(
  data,
  items,
  nfactors,
  rotate = "oblimin",
  covar = FALSE,
  na.rm = TRUE,
  kaiser = 1,
  loadings = TRUE,
  summary = FALSE,
  correlations = FALSE,
  eigenValues = FALSE,
  screePlot = FALSE,
  residuals = FALSE,
  itemLabels = items,
  colorLoadings = FALSE,
  digits = 2,
  headingLevel = 3,
  ...
)
rosettaDataReduction_partial(
  x,
  digits = x$input$digits,
  headingLevel = x$input$headingLevel,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
## S3 method for class 'rosettaDataReduction'
knit_print(
  x,
  digits = x$input$digits,
  headingLevel = x$input$headingLevel,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
## S3 method for class 'rosettaDataReduction'
print(
  x,
  digits = x$input$digits,
  headingLevel = x$input$headingLevel,
  forceKnitrOutput = FALSE,
  ...
)
data |
The data frame that contains the |
nfactors |
The number of factors to extract, or ' |
items |
The items to analyse; if not specified, all variables
in |
rotate |
Which rotation to use; see |
covar |
Whether to analyse the correlation matrix ( |
na.rm |
Whether to first remove all cases with missing values. |
kaiser |
The minimum eigenvalue when applying the Kaiser criterion (see
|
loadings |
Whether to display the component or factor loadings. |
summary |
Whether to display the factor or component summary. |
correlations |
Whether to display the correlations between factors or components. |
modelFit |
Whether to display the model fit (only for EFA). |
eigenValues |
Whether to display the eigenvalues. |
screePlot |
Whether to display the scree plot. |
residuals |
Whether to display the matrix with residuals. |
itemLabels |
Optionally, labels to use for the items (optionally, named,
with the names corresponding to the |
colorLoadings |
Whether, when producing an Rmd partial (i.e. when
calling the command while knitting) to colour the cells using
|
fm |
The method to use for the factor analysis: ' |
digits |
The number of digits to round to. |
headingLevel |
The number of hashes to print in front of the headings when printing while knitting |
... |
Any additional arguments are passed to |
x |
The object to print. |
echoPartial |
Whether to show the executed code in the R Markdown
partial ( |
partialFile |
This can be used to specify a custom partial file. The
file will have object |
quiet |
Passed on to |
forceKnitrOutput |
Force knitr output. |
The code in these functions uses parts of the code in jamovi, written by Jonathon Love and Ravi Selker.
An object containing the object resulting from the call to the psych functions, plus some extracted information that will be printed.
### Load example dataset
data("pp15", package="rosetta");

### Get variable names with expected
### effects of a high dose of MDMA
items <-
  grep(
    "highDose_AttBeliefs_",
    names(pp15),
    value=TRUE
  );

### Do a factor analysis
rosetta::factorAnalysis(
  data = pp15,
  items = items,
  nfactors = "eigen",
  scree = TRUE
);

if (FALSE) {
  ### To get more output, show the
  ### output as Rmd Partial in the viewer,
  ### and color/size the factor loadings
  rosetta::rosettaDataReduction_partial(
    rosetta::factorAnalysis(
      data = pp15,
      items = items,
      nfactors = "eigen",
      summary = TRUE,
      correlations = TRUE,
      colorLoadings = TRUE
    )
  );
}
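To make the Kaiser criterion concrete, the following base R sketch counts the eigenvalues of a correlation matrix that exceed a threshold of 1. It uses `mtcars` so that it runs without the package; the selection in `exampleItems` is hypothetical, and it is an assumption that this mirrors what happens when `nfactors = "eigen"` is combined with `kaiser = 1`, since the extraction itself is delegated to the psych functions.

```r
### Illustrative only: apply the Kaiser criterion by hand
### (assumes the correlation matrix is analysed; 'exampleItems'
### is a hypothetical selection of variables)
exampleItems <- c("mpg", "disp", "hp", "drat", "wt", "qsec");

### Eigenvalues of the correlation matrix
eigenValues <- eigen(cor(mtcars[, exampleItems]))$values;

### Number of components with an eigenvalue above the threshold
kaiserThreshold <- 1;
nRetained <- sum(eigenValues > kaiserThreshold);

print(round(eigenValues, 2));
print(nRetained);
```

The eigenvalues of a correlation matrix sum to the number of variables, which is why 1 (the average eigenvalue) is the conventional threshold.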
Factor Analysis
factorAnalysisjmv( data, items, nFactorMethod = "eigen", nFactors = 1, minEigen = 1, extraction = "minres", rotation = "oblimin", colorLoadings = TRUE, screePlot = FALSE, eigen = FALSE, factorCor = FALSE, factorSummary = FALSE, modelFit = FALSE )
data |
the data as a data frame |
items |
a vector of strings naming the variables of interest in
|
nFactorMethod |
. |
nFactors |
. |
minEigen |
. |
extraction |
. |
rotation |
. |
colorLoadings |
. |
screePlot |
. |
eigen |
. |
factorCor |
. |
factorSummary |
. |
modelFit |
. |
A results object containing:
results$loadings |
a html |
results$factorStats$factorSummary |
a table |
results$factorStats$factorCor |
a table |
results$modelFit$fit |
a table |
results$eigen$initEigen |
a table |
results$eigen$screePlot |
an image |
This function is meant as a user-friendly wrapper to approximate the way analysis of variance is done in SPSS.
fanova( data, y, between = NULL, covar = NULL, withinReference = 1, betweenReference = NULL, withinNames = NULL, plot = FALSE, levene = FALSE, digits = 2, contrast = NULL ) ## S3 method for class 'fanova' print(x, digits = x$input$digits, ...)
data |
The dataset containing the variables to analyse. |
y |
The dependent variable. For oneway anova, factorial anova, or
ancova, this is the name of a variable in dataframe |
between |
A vector with the variable name(s) of the between-subjects factor(s). |
covar |
A vector with the variable name(s) of the covariate(s). |
withinReference |
Number of reference category (variable) for within subjects treatment contrast (dummy). |
betweenReference |
Name of reference category for between subject factor in RM anova. |
withinNames |
Names of within subjects categories (dependent variables). |
plot |
Whether to produce a plot. Note that a plot is only produced for oneway and twoway anova and oneway repeated measures designs: if covariates or more than two between-subjects factors are specified, no plot is produced. For twoway anova designs, the second predictor is plotted as moderator (and the first predictor is plotted on the x axis). |
levene |
Whether to show Levene's test for equality of variances (using
|
digits |
Number of digits (actually: decimals) to use when printing results. The p-value is printed with one extra digit. |
contrast |
This functionality has been implemented for repeated measures only. |
x |
The object to print (i.e. as produced by |
... |
Any additional arguments are ignored. |
This wrapper uses oneway, lm, and lmer in combination with car's Anova function to conduct the analysis of variance.
Mainly, this function prints its results, but it also returns them in an object containing three lists:
input |
The arguments specified when calling the function |
intermediate |
Intermediate objects and values |
output |
The results such as the plot. |
Gjalt-Jorn Peters and Peter Verboon
Maintainer: Gjalt-Jorn Peters [email protected]
regr and logRegr for similar functions for linear and logistic regression, and oneway, lm, lmer and Anova for the functions used behind the scenes.
### Oneway anova with a plot
fanova(dat=mtcars, y='mpg', between='cyl', plot=TRUE);

### Factorial anova
fanova(dat=mtcars, y='mpg', between=c('vs', 'am'), plot=TRUE);

### Ancova
fanova(dat=mtcars, y='mpg', between=c('vs', 'am'), covar='hp');

### Don't run these examples to not take too much time during testing
### for CRAN
## Not run:
### Repeated measures anova; first generate datafile
dat <- mtcars[, c('am', 'drat', 'wt')];
names(dat) <- c('factor', 't0_dependentVar' ,'t1_dependentVar');
dat$factor <- factor(dat$factor);

### Then do the repeated measures anova
fanova(dat, y=c('t0_dependentVar' ,'t1_dependentVar'),
       between='factor', plot=TRUE);

## End(Not run)
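For comparison, the sketch below runs a oneway anova and an ancova with `stats::lm()` and `stats::anova()` only. It is not the wrapper's implementation: `anova()` reports sequential (Type I) sums of squares, whereas `car::Anova()` defaults to Type II, so results can differ once covariates or multiple factors are involved.

```r
### Illustrative only: oneway anova and ancova with stats::lm()
dat <- mtcars;
dat$cyl <- factor(dat$cyl);

### Oneway anova: mpg by number of cylinders
onewayFit <- lm(mpg ~ cyl, data = dat);
onewayTable <- anova(onewayFit);
print(onewayTable);

### Ancova: enter horsepower as covariate before the factor
ancovaFit <- lm(mpg ~ hp + cyl, data = dat);
ancovaTable <- anova(ancovaFit);
print(ancovaTable);
```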
Pretty formatting of p values
formatPvalue(values, digits = 3, spaces = TRUE, includeP = TRUE)
values |
The p-values to format. |
digits |
The number of digits to round to. Numbers smaller than this number will be shown as <.001 or <.0001 etc. |
spaces |
Whether to include spaces between symbols, operators, and digits. |
includeP |
Whether to include the 'p' and '='-symbol in the results (the '<' symbol is always included). |
A formatted P value, roughly according to APA style guidelines. This means that the noZero function is used to remove the zero preceding the decimal point, and p values that would round to zero given the requested number of digits are shown as e.g. p<.001.
formatCI(), formatR(), noZero()
formatPvalue(cor.test(mtcars$mpg, mtcars$disp)$p.value);
formatPvalue(cor.test(mtcars$drat, mtcars$qsec)$p.value);
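The formatting rule described above (drop the leading zero, and report values below the rounding threshold as "less than") can be sketched in base R as follows. The helper `apaP` is hypothetical and only approximates the behaviour; it is not the package's implementation.

```r
### Illustrative only: a hypothetical helper that approximates
### APA-style p-value formatting (not the package's implementation)
apaP <- function(p, digits = 3) {
  threshold <- 10^(-digits);
  ### Strip the zero preceding the decimal point
  stripZero <- function(x) sub("^0\\.", ".", x);
  ifelse(
    p < threshold,
    paste0("p < ", stripZero(format(threshold, scientific = FALSE))),
    paste0("p = ", stripZero(formatC(p, digits = digits, format = "f")))
  );
}

apaP(c(0.0312, 0.00001));
### "p = .031" "p < .001"
```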
Pretty formatting of correlation coefficients
formatR(r, digits = 2)
r |
The Pearson correlation to format. |
digits |
The number of digits to round to. |
The formatted correlation.
noZero(), formatCI(), formatPvalue()
formatR(cor(mtcars$mpg, mtcars$disp));
Function to show frequencies in a manner similar to what SPSS' "FREQUENCIES" command does. Note that frequency is an alias for freq.
freq( vector, digits = 1, nsmall = 1, transposed = FALSE, round = 1, plot = FALSE, plotTheme = ggplot2::theme_bw() ) ## S3 method for class 'freq' print( x, digits = x$input$digits, nsmall = x$input$nsmall, transposed = x$input$transposed, ... ) ## S3 method for class 'freq' pander(x, ...) frequencies( ..., digits = 1, nsmall = 1, transposed = FALSE, round = 1, plot = FALSE, plotTheme = ggplot2::theme_bw() ) ## S3 method for class 'frequencies' print(x, ...) ## S3 method for class 'frequencies' pander(x, prefix = "###", ...)
vector |
A vector of values to compute frequencies for. |
digits |
Minimum number of significant digits to show in result. |
nsmall |
Minimum number of digits after the decimal point to show in the result. |
transposed |
Whether to transpose the results when printing them (this can be useful for blind users). |
round |
Number of digits to round the results to (can be used in conjunction with digits to determine format of results). |
plot |
If true, a histogram is shown of the variable. |
plotTheme |
The ggplot2 theme to use. |
x |
The |
... |
For |
prefix |
The prefix to use when printing |
An object with several elements, the most notable of which is:
dat |
A dataframe with the frequencies |
For frequencies, these objects are in a list of their own.
### Create factor vector
ourFactor <- factor(mtcars$gear, levels=c(3,4,5),
                    labels=c("three", "four", "five"));

### Add some missing values
factorWithMissings <- ourFactor;
factorWithMissings[10] <- factorWithMissings[20] <- NA;

### Show frequencies
freq(ourFactor);
freq(factorWithMissings);

### ... Or for all of them at once
frequencies(ourFactor, factorWithMissings);
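To show what an SPSS-like frequency table contains, here is a base R sketch that builds counts, percentages of the total, valid percentages, and cumulative percentages for a factor with missing values. The column names are hypothetical and need not match those in the object returned by freq.

```r
### Illustrative only: an SPSS-like frequency table in base R
### (column names are hypothetical)
ourFactor <- factor(mtcars$gear, levels = c(3, 4, 5),
                    labels = c("three", "four", "five"));
factorWithMissings <- ourFactor;
factorWithMissings[c(10, 20)] <- NA;

### Counts of the valid (non-missing) values
counts <- table(factorWithMissings, useNA = "no");

freqTable <- data.frame(
  Frequencies = as.vector(counts),
  PercTotal   = 100 * as.vector(counts) / length(factorWithMissings),
  PercValid   = 100 * as.vector(counts) / sum(counts),
  Cumulative  = 100 * cumsum(as.vector(counts)) / sum(counts),
  row.names   = names(counts)
);
print(round(freqTable, 1));
```

The total percentage uses all cases as denominator, while the valid and cumulative percentages use only the non-missing cases.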
Frequencies
freqjmv(data, vector)
data |
. |
vector |
. |
A results object containing:
results$table |
a table |
Tables can be converted to data frames with asDF or as.data.frame. For example:
results$table$asDF
as.data.frame(results$table)
Analyze moderated mediation model using SEM
gemm( data = NULL, xvar, mvars, yvar, xmmod = NULL, mymod = NULL, cmvars = NULL, cyvars = NULL, estMethod = "bootstrap", nboot = 1000 )
data |
data frame |
xvar |
predictor variable, must be either numerical or dichotomous |
mvars |
vector of names of mediator variables |
yvar |
dependent variable, must be numerical |
xmmod |
moderator of effect predictor on mediators, must be either numerical or dichotomous |
mymod |
moderator of effect mediators on dependent variable, must be either numerical or dichotomous |
cmvars |
covariates for mediators |
cyvars |
covariates for dependent variable |
estMethod |
method for estimating the standard errors; bootstrap is the default |
nboot |
number of bootstrap samples |
gemm object
## Not run:
data("cpbExample")
res <- gemm(dat = cpbExample, xvar="procJustice", mvars= c("cynicism","trust"),
            yvar = "CPB", nboot=500)
print(res)

## End(Not run)
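To give a feel for what bootstrapping an indirect effect involves, the sketch below computes a percentile bootstrap interval for a simple product-of-coefficients indirect effect using two regressions in base R. This is explicitly not the SEM approach used by gemm: there are no moderators or covariates, and the variable roles in `mtcars` are arbitrary choices made for illustration.

```r
### Illustrative only: percentile bootstrap of a simple indirect
### effect (x -> m -> y) using two regressions; this is not the
### SEM approach of the package, and variable roles are arbitrary
set.seed(42);

indirectEffect <- function(d) {
  a <- coef(lm(hp ~ wt, data = d))["wt"];        ### a path: x -> m
  b <- coef(lm(mpg ~ hp + wt, data = d))["hp"];  ### b path: m -> y, given x
  unname(a * b);
}

nboot <- 500;
bootEstimates <- replicate(
  nboot,
  indirectEffect(mtcars[sample(nrow(mtcars), replace = TRUE), ])
);

pointEstimate <- indirectEffect(mtcars);
bootCI <- quantile(bootEstimates, probs = c(0.025, 0.975));
print(pointEstimate);
print(bootCI);
```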
This function provides a simple interface to create a ggplot2::ggplot() bar chart.
ggBarChart(vector, plotTheme = ggplot2::theme_bw(), ...)
vector |
The vector to display in the bar chart. |
plotTheme |
The theme to apply. |
... |
Any additional arguments are passed to |
A ggplot2::ggplot() plot is returned.
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters [email protected]
rosetta::ggBarChart(mtcars$cyl);
This function provides a simple interface to create a ggplot box plot, organising different box plots by levels of a factor if desired, and showing row numbers of outliers.
ggBoxplot( dat, y = NULL, x = NULL, labelOutliers = TRUE, outlierColor = "red", theme = ggplot2::theme_bw(), ... )
dat |
Either a vector of values (to display in the box plot) or a dataframe containing variables to display in the box plot. |
y |
If |
x |
If |
labelOutliers |
Whether or not to label outliers. |
outlierColor |
If labeling outliers, this is the color to use. |
theme |
The theme to use for the box plot. |
... |
Any additional arguments will be passed to
|
This function is based on Jason Aizkalns' answer to a question on Stack Overflow (see https://stackoverflow.com/questions/33524669/labeling-outliers-of-boxplots-in-r).
A ggplot plot is returned.
Jason Aizkalns; implemented in this package (and tweaked a bit) by Gjalt-Jorn Peters.
Maintainer: Gjalt-Jorn Peters [email protected]
### A box plot for miles per gallon in the mtcars dataset:
ggBoxplot(mtcars$mpg);

### And separate for each level of 'cyl' (number of cylinders):
ggBoxplot(mtcars, y='mpg', x='cyl');
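The row numbers that get attached to outliers can be found without plotting. The sketch below uses `grDevices::boxplot.stats()`, which applies the usual rule of 1.5 times the interquartile range beyond the hinges; that ggBoxplot uses exactly this rule is an assumption, and `hp` is used because it is a variable in `mtcars` with a value beyond the upper fence.

```r
### Illustrative only: find the row numbers of box plot outliers
### using the default 1.5 * IQR rule of grDevices::boxplot.stats()
hpStats <- grDevices::boxplot.stats(mtcars$hp);
outlierRows <- which(mtcars$hp %in% hpStats$out);

### Row numbers and row names of the outlying cases
print(outlierRows);
print(rownames(mtcars)[outlierRows]);
```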
This function creates a qq-plot with a confidence interval.
ggqq( x, distribution = "norm", ..., ci = TRUE, line.estimate = NULL, conf.level = 0.95, sampleSizeOverride = NULL, observedOnX = TRUE, scaleExpected = TRUE, theoryLab = "Theoretical quantiles", observeLab = "Observed quantiles", theme = ggplot2::theme_bw() )
x |
A vector containing the values to plot. |
distribution |
The distribution to use (a 'd' and 'q' are prepended, and
the resulting functions are used, e.g. |
... |
Any additional arguments are passed to the quantile function
(e.g. |
ci |
Whether to show the confidence interval. |
line.estimate |
Whether to show the line showing the match with the specified distribution (e.g. the normal distribution). |
conf.level |
The confidence level of the confidence interval around the estimate for the specified distribution. |
sampleSizeOverride |
It can be desirable to get the confidence
intervals for a different sample size (when the sample size is very large,
for example, such as when this plot is generated by the function
|
observedOnX |
Whether to plot the observed values (if |
scaleExpected |
Whether to scale the expected values to match the scale of the variable. This option is provided to be able to mimic SPSS' Q-Q plots. |
theoryLab |
The label for the theoretically expected values (on the Y axis by default). |
observeLab |
The label for the observed values (on the X axis by default). |
theme |
The theme to use. |
This is strongly based on the answer by user Floo0 to a Stack Overflow question (see https://stackoverflow.com/questions/4357031/qqnorm-and-qqline-in-ggplot2/27191036#27191036), also posted at GitHub (see https://gist.github.com/rentrop/d39a8406ad8af2a1066c). That code is in turn based on the qqPlot() function from the car package.
A ggplot plot is returned.
John Fox and Floo0; implemented in this package (and tweaked a bit) by Gjalt-Jorn Peters.
Maintainer: Gjalt-Jorn Peters [email protected]
ggqq(mtcars$mpg);
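The points of a QQ plot can be computed by hand, which also shows how a distribution name such as "norm" maps onto a quantile function by prepending 'q'. The sketch below is base R only; the rescaling step is an assumption about what `scaleExpected` refers to, and the confidence band is not reproduced.

```r
### Illustrative only: compute the points of a normal QQ plot by hand
distribution <- "norm";
qFunction <- get(paste0("q", distribution), mode = "function");

x <- mtcars$mpg;
observed <- sort(x);

### Theoretical quantiles at the usual plotting positions
theoretical <- qFunction(stats::ppoints(length(x)));

### Rescale the expected values to the scale of the variable
### (an assumption about what 'scaleExpected' refers to)
expectedScaled <- mean(x) + sd(x) * theoretical;

qqPoints <- data.frame(observed = observed, expected = expectedScaled);
print(head(qqPoints));
```

Passing the two columns of `qqPoints` to `plot()` and adding `abline(0, 1)` gives a basic version of the plot, with points close to the line indicating agreement with the specified distribution.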
This function provides a simple interface to create a ggplot2::ggplot() scatter plot.
ggScatterPlot( x, y, jitter = TRUE, size = 3, alpha = 0.66, shape = 16, color = "black", fill = "black", stroke = 1, plotTheme = ggplot2::theme_bw(), ... )
x, y |
The vectors to display in the scatter plot. Alternatively,
|
jitter |
Whether to jitter the points ( |
size, alpha, shape, color, fill, stroke |
Quick way to set the aesthetics. |
plotTheme |
The theme to apply. |
... |
Any additional arguments are passed to |
A ggplot2::ggplot() plot is returned.
rosetta::ggScatterPlot(mtcars$hp, mtcars$mpg);
Simple function to create a histogram
histogram( vector, bins = NULL, theme = ggplot2::theme_bw(), xLabel = NULL, yLabel = "Count" )
vector |
A variable or vector. |
bins |
The number of bins; when 0, either the number of unique
values in |
theme |
The ggplot2 theme to use. |
xLabel, yLabel |
Labels for x and y axes; variable name is used for x axis if no label is specified. |
A ggplot2 plot.
rosetta::histogram(mtcars$mpg);
This function is meant as a user-friendly wrapper to approximate the way logistic regression is done in SPSS.
logRegr( formula, data = NULL, conf.level = 0.95, digits = 2, predictGroupValue = NULL, comparisonGroupValue = NULL, pvalueDigits = 3, crossTabs = TRUE, oddsRatios = TRUE, plot = FALSE, collinearity = FALSE, env = parent.frame(), predictionColor = rosetta::opts$get("viridis3")[3], predictionAlpha = 0.5, predictionSize = 2, dataColor = rosetta::opts$get("viridis3")[1], dataAlpha = 0.33, dataSize = 2, observedMeansColor = rosetta::opts$get("viridis3")[2], binObservedMeans = 7, observedMeansSize = 2, observedMeansWidth = NULL, observedMeansAlpha = 0.5, theme = ggplot2::theme_bw(), headingLevel = 3 ) rosettaLogRegr_partial( x, digits = x$input$digits, pvalueDigits = x$input$pvalueDigits, headingLevel = x$input$headingLevel, echoPartial = FALSE, partialFile = NULL, quiet = TRUE, ... ) ## S3 method for class 'rosettaLogRegr' knit_print( x, digits = x$input$digits, headingLevel = x$input$headingLevel, pvalueDigits = x$input$pvalueDigits, echoPartial = FALSE, partialFile = NULL, quiet = TRUE, ... ) ## S3 method for class 'rosettaLogRegr' print( x, digits = x$input$digits, pvalueDigits = x$input$pvalueDigits, headingLevel = x$input$headingLevel, forceKnitrOutput = FALSE, ... )
formula |
The formula, specified in the same way as for
|
data |
Optionally, a dataset containing the variables in the formula
(if not specified, the variables must exist in the environment specified in
|
conf.level |
The confidence level for the confidence intervals. |
digits |
The number of digits used when printing the results. |
predictGroupValue, comparisonGroupValue |
Can optionally be used to set the value to predict and the value to compare with. |
pvalueDigits |
The number of digits used when printing the p-values. |
crossTabs |
Whether to show cross tabulations of the correct predictions for the null model and the tested model, as well as the percentage of correct predictions. |
oddsRatios |
Whether to also present the regression coefficients
as odds ratios (i.e. simply after a call to |
plot |
Whether to display the plot. |
collinearity |
Whether to show collinearity diagnostics. |
env |
If no dataframe is specified in |
predictionColor, dataColor, observedMeansColor |
The color of, respectively, the line and confidence interval showing the prediction; the points representing the observed data points; and the means based on the observed data. |
predictionAlpha, dataAlpha, observedMeansAlpha |
The alpha of, respectively, the confidence interval of the prediction; the points representing the observed data points; and the means based on the observed data (set to 0 to hide an element). |
predictionSize, dataSize, observedMeansSize |
The size of, respectively, the line of the prediction; the points representing the observed data points; and the means based on the observed data (set to 0 to hide an element). |
binObservedMeans |
Whether to bin the observed means; either FALSE or a single numeric value specifying the number of bins. |
observedMeansWidth |
The width of the lines of the observed means. If
not specified (i.e. |
theme |
The theme used to display the plot. |
headingLevel |
The number of hashes to print in front of the headings |
x |
The object to print (i.e. as produced by |
echoPartial |
Whether to show the executed code in the R Markdown
partial ( |
partialFile |
This can be used to specify a custom partial file. The
file will have object |
quiet |
Passed on to |
... |
Any additional arguments are passed to the default print method
by the print method, and to |
forceKnitrOutput |
Force knitr output. |
Mainly, this function prints its results, but it also returns them in an object containing three lists:
input |
The arguments specified when calling the function |
intermediate |
Intermediate objects and values |
output |
The results, such as the plot, the cross tables, and the coefficients. |
Ron Pat-El & Gjalt-Jorn Peters (both while at the Open University of the Netherlands)
Maintainer: Gjalt-Jorn Peters [email protected]
regr and fanova for similar functions for linear regression and analysis of variance, and stats::glm() for the regular interface for logistic regression.
### Simplest way to call logRegr
rosetta::logRegr(data=mtcars, formula = vs ~ mpg);

### Also ordering a plot
rosetta::logRegr(
  data=mtcars,
  formula = vs ~ mpg,
  plot=TRUE
);

### Only use five bins
rosetta::logRegr(
  data=mtcars,
  formula = vs ~ mpg,
  plot=TRUE,
  binObservedMeans=5
);

## Not run:
### Mimic output that would be obtained
### when calling from an R Markdown file
rosetta::rosettaLogRegr_partial(
  rosetta::logRegr(
    data=mtcars,
    formula = vs ~ mpg,
    plot=TRUE
  )
);

## End(Not run)
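For reference, the same model can be fitted through the regular `stats::glm()` interface. The sketch below exponentiates the coefficients to obtain odds ratios and builds a cross tabulation of observed against predicted group membership. The Wald intervals from `confint.default()` and the .5 classification cutoff are assumptions made for this illustration and may differ from what logRegr reports.

```r
### Illustrative only: logistic regression with odds ratios in base R
fit <- glm(vs ~ mpg, data = mtcars, family = binomial(link = "logit"));

### Coefficients as odds ratios, with Wald confidence intervals
oddsRatios <- exp(cbind(OR = coef(fit),
                        confint.default(fit, level = 0.95)));
print(round(oddsRatios, 2));

### Cross tabulation of observed versus predicted group membership,
### classifying cases with a predicted probability above .5 as 1
predicted <- factor(as.numeric(fitted(fit) > 0.5), levels = 0:1);
crossTab <- table(observed = mtcars$vs, predicted = predicted);
print(crossTab);

### Percentage of correct predictions
percentCorrect <- 100 * sum(diag(crossTab)) / sum(crossTab);
print(percentCorrect);
```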
The meanDiff function compares the means between two groups. It computes Cohen's d, the unbiased estimate of Cohen's d (Hedges' g), and performs a t-test. It also shows the achieved power, and, more usefully, the power to detect small, medium, and large effects.
meanDiff( x, y = NULL, paired = FALSE, r.prepost = NULL, var.equal = "test", conf.level = 0.95, plot = FALSE, digits = 2, envir = parent.frame() ) ## S3 method for class 'meanDiff' print(x, digits = x$digits, powerDigits = x$digits + 2, ...) ## S3 method for class 'meanDiff' pander(x, digits = x$digits, powerDigits = x$digits + 2, ...)
x |
Dichotomous factor: variable 1; can also be a formula of the form y ~ x, where x must be a factor with two levels (i.e. dichotomous). |
y |
Numeric vector: variable 2; can be empty if x is a formula. |
paired |
Boolean; are x & y independent or dependent? Note that if x & y are dependent, they need to have the same length. |
r.prepost |
Correlation between the pre- and post-test in the case of a paired samples t-test. This is required to compute Cohen's d using the formula on page 29 of Borenstein et al. (2009). If NULL, the correlation is simply computed from the provided scores. Note that this correlation will then be lower if there is an effect, which will lead to an underestimate of the within-groups variance, and therefore of the standard error of Cohen's d, and therefore to confidence intervals that are too narrow (too liberal). Also, when using this data to compute the within-groups correlation, random variations will impact that correlation, which means that confidence intervals may in practice deviate from the null hypothesis significance testing p-value in either direction (i.e. the p-value may indicate a significant association while the confidence interval contains 0, or the other way around). Therefore, if the test-retest correlation of the relevant measure is known, please provide this here to enable computation of accurate confidence intervals. |
var.equal |
String; only relevant if x & y are independent; can be "test" (default; test whether x & y have different variances), "no" (assume x & y have different variances; see the Warning below!), or "yes" (assume x & y have the same variance) |
conf.level |
Confidence of confidence intervals you want. |
plot |
Whether to print a dlvPlot. |
digits |
With what precision you want the results to print. |
envir |
The environment where to search for the variables (useful when calling meanDiff from a function where the vectors are defined in that function's environment). |
powerDigits |
With what precision you want the power to print. |
... |
Additional arguments are passed on to the |
This function uses the formulae from Borenstein, Hedges, Higgins & Rothstein (2009) (pages 25-32).
An object is returned with the following elements:
variables |
Input variables |
groups |
Levels of the x variable, the dichotomous factor |
ci.confidence |
Confidence of confidence intervals |
digits |
Number of digits for output |
x |
Values of dependent variable in first group |
y |
Values of dependent variable in second group |
type |
Type of t-test (independent or dependent, equal variances or not) |
n |
Sample sizes of the two groups |
mean |
Means of the two groups |
sd |
Standard deviations of the two groups |
objects |
Objects used; the t-test and optionally the test for equal variances |
variance |
Variance of the difference score |
meanDiff |
Difference between the means |
meanDiff.d |
Cohen's d |
meanDiff.d.var |
Variance of Cohen's d |
meanDiff.d.se |
Standard error of Cohen's d |
meanDiff.J |
Correction for Cohen's d to get to the unbiased Hedges g |
power |
Achieved power with current effect size and sample size |
power.small |
Power to detect small effects with current sample size |
power.medium |
Power to detect medium effects with current sample size |
power.large |
Power to detect large effects with current sample size |
meanDiff.g |
Hedges' g |
meanDiff.g.var |
Variance of Hedges' g |
meanDiff.g.se |
Standard error of Hedges' g |
ci.usedZ |
Z value used to compute confidence intervals |
meanDiff.d.ci.lower |
Lower bound of confidence interval around Cohen's d |
meanDiff.d.ci.upper |
Upper bound of confidence interval around Cohen's d |
meanDiff.g.ci.lower |
Lower bound of confidence interval around Hedges' g |
meanDiff.g.ci.upper |
Upper bound of confidence interval around Hedges' g |
meanDiff.ci.lower |
Lower bound of confidence interval around raw mean difference |
meanDiff.ci.upper |
Upper bound of confidence interval around raw mean difference |
t |
Student t value for Null Hypothesis Significance Testing |
df |
Degrees of freedom for t value |
p |
p-value corresponding to t value |
Note that when different variances are assumed for the t-test (i.e. the null-hypothesis test), the values of Cohen's d are still based on the assumption that the variance is equal. In this case, the confidence interval might, for example, not contain zero even though the NHST has a non-significant p-value (the reverse can probably happen, too).
Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2011). Introduction to meta-analysis. John Wiley & Sons.
### Create simple dataset
dat <- PlantGrowth[1:20,];

### Remove third level from group factor
dat$group <- factor(dat$group);

### Compute mean difference and show it
meanDiff(dat$weight ~ dat$group);

### Look at second treatment
dat <- rbind(PlantGrowth[1:10,], PlantGrowth[21:30,]);

### Remove third level from group factor
dat$group <- factor(dat$group);

### Compute mean difference and show it
meanDiff(x=dat$group, y=dat$weight);
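The effect size computations for two independent groups can be followed step by step in base R. The sketch below uses the standard formulae for Cohen's d, its variance, the correction factor J, and Hedges' g as given by Borenstein et al.; the direction of the difference (first group minus second) and the normal-based confidence interval are assumptions, and the paired case is not covered.

```r
### Illustrative only: Cohen's d and Hedges' g for two independent
### groups, following the formulae in Borenstein et al.
dat <- PlantGrowth[1:20, ];
dat$group <- factor(dat$group);

y1 <- dat$weight[dat$group == "ctrl"];
y2 <- dat$weight[dat$group == "trt1"];
n1 <- length(y1);
n2 <- length(y2);

### Pooled within-groups standard deviation
sdPooled <- sqrt(((n1 - 1) * var(y1) + (n2 - 1) * var(y2)) /
                 (n1 + n2 - 2));

### Cohen's d, its variance, and its standard error
d <- (mean(y1) - mean(y2)) / sdPooled;
dVar <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2));
dSE <- sqrt(dVar);

### Correction factor J, and the resulting Hedges' g
J <- 1 - 3 / (4 * (n1 + n2 - 2) - 1);
g <- J * d;
gVar <- J^2 * dVar;

### Confidence interval around d based on the normal distribution
dCI <- d + c(-1, 1) * qnorm(0.975) * dSE;

print(round(c(d = d, g = g, lower = dCI[1], upper = dCI[2]), 2));
```

Because J is always below 1, Hedges' g is slightly closer to zero than Cohen's d, with the difference shrinking as the sample size grows.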
The meanDiff.multi function compares many means for many groups. It presents the results in a dataframe summarizing all relevant information, and produces a plot showing the confidence intervals for the effect sizes for each predictor (i.e. dichotomous variable). Like meanDiff, it computes Cohen's d, the unbiased estimate of Cohen's d (Hedges' g), and performs a t-test. It also shows the achieved power, and, more usefully, the power to detect small, medium, and large effects.
meanDiff.multi( dat, y, x = NULL, var.equal = "yes", conf.level = 0.95, digits = 2, orientation = "vertical", zeroLineColor = "grey", zeroLineSize = 1.2, envir = parent.frame() ) ## S3 method for class 'meanDiff.multi' print(x, digits = x$digits, powerDigits = x$digits + 2, ...)
dat |
The dataframe containing the variables involved in the mean tests. |
y |
Character vector containing the list of interval variables to include in the tests. |
x |
Character vector containing the list of the dichotomous variables to include in the tests. If x is empty, paired samples t-tests will be conducted. |
var.equal |
String; only relevant if x & y are independent; can be "test" (test whether x & y have different variances), "no" (assume x & y have different variances; see the Warning below!), or "yes" (default; assume x & y have the same variance) |
conf.level |
Confidence level of the confidence intervals. |
digits |
With what precision you want the results to print. |
orientation |
Whether to plot the effect size confidence intervals vertically (like a forest plot, the default) or horizontally. |
zeroLineColor |
Color of the horizontal line at an effect size of 0 (set to 'white' to not display the line; also adjust the size to 0 then). |
zeroLineSize |
Size of the horizontal line at an effect size of 0 (set to 0 to not display the line; also adjust the color to 'white' then). |
envir |
The environment where to search for the variables (useful when calling meanDiff from a function where the vectors are defined in that function's environment). |
powerDigits |
With what precision you want the power to print. |
... |
Additional arguments are passed on to the |
This function uses the meanDiff function, which uses the formulae from Borenstein, Hedges, Higgins & Rothstein (2009) (pages 25-32).
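As a rough illustration of the effect sizes mentioned above, the following base R sketch computes Cohen's d with a pooled standard deviation and then applies the usual small-sample correction to obtain Hedges' g, following the textbook formulae. It does not call meanDiff and is not the package's own code; the helper name cohensDSketch is hypothetical.

```r
### Hypothetical helper (not part of rosetta): textbook Cohen's d and
### Hedges' g for two independent groups
cohensDSketch <- function(y, group) {
  group <- factor(group);
  stopifnot(nlevels(group) == 2);
  groups <- split(y, group);
  n <- sapply(groups, length);
  m <- sapply(groups, mean);
  v <- sapply(groups, var);
  ### Pooled standard deviation (assumes equal variances)
  df <- sum(n) - 2;
  sPooled <- sqrt(sum((n - 1) * v) / df);
  ### Cohen's d: second group minus first group
  d <- as.numeric(m[2] - m[1]) / sPooled;
  ### Small-sample correction factor J
  J <- 1 - 3 / (4 * df - 1);
  return(c(d = d, g = J * d));
}

### Same data as in the meanDiff example: control versus first treatment
dat <- PlantGrowth[1:20, ];
dat$group <- factor(dat$group);
es <- cohensDSketch(dat$weight, dat$group);
print(es);
```

Because the correction factor J is always below 1, g is slightly closer to zero than d; the difference shrinks as the sample size grows.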
An object is returned with the following elements:
results.raw |
Objects returned by the calls to meanDiff. |
plots |
For every comparison, a plot with the datapoints, means, and confidence intervals in the two groups. |
results.compiled |
Dataframe with the most important results from each comparison. |
plots.compiled |
For every dichotomous (x) variable, a plot with the confidence interval for the effect size of each dependent (y) variable. |
input |
The arguments with which the function was called. |
Note that when different variances are assumed for the t-test (i.e. the null-hypothesis test), the values of Cohen's d are still based on the assumption that the variance is equal. In this case, the confidence interval might, for example, not contain zero even though the NHST has a non-significant p-value (the reverse can probably happen, too).
Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons.
### Create simple dataset
dat <- data.frame(x1 = factor(rep(c(0,1), 20)),
                  x2 = factor(c(rep(0, 20), rep(1, 20))),
                  y=rep(c(4,5), 20) + rnorm(40));
### Compute mean difference and show it
meanDiff.multi(dat, x=c('x1', 'x2'), y='y', var.equal="yes");
These functions allow easily computing means and sums. Note that if you
attach rosetta
to the search path,
means(
  ...,
  data = NULL,
  requiredValidValues = 0,
  returnIfInvalid = NA,
  silent = FALSE
)

sums(
  ...,
  data = NULL,
  requiredValidValues = 0,
  returnIfInvalid = NA,
  silent = FALSE
)
... |
The dataframe or vectors for which to compute the means or sums.
When passing a dataframe as unnamed argument (i.e. in the "dots", |
data |
If a dataframe is passed as |
requiredValidValues |
The number (if larger than 1) or proportion (if between 0 and 1) of values that have to be valid (i.e. nonmissing) before the mean or sum is returned. |
returnIfInvalid |
Which value to return for rows not meeting the
criterion specified in |
silent |
Whether to suppress messages. |
The means or sums.
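The effect of requiredValidValues and returnIfInvalid can be sketched in base R as follows. This is only an illustration of the logic described above, not the package's implementation: the helper rowMeansIfValid is hypothetical, and it assumes that values below 1 are treated as proportions and rounded up to a whole number of required values.

```r
### Hypothetical helper (not part of rosetta): row means that are only
### returned when enough values in the row are nonmissing
rowMeansIfValid <- function(df, requiredValidValues = 0, returnIfInvalid = NA) {
  ### Number of valid (nonmissing) values in each row
  nValid <- rowSums(!is.na(df));
  ### Assumption: values below 1 are proportions of the number of columns
  if (requiredValidValues < 1) {
    required <- ceiling(requiredValidValues * ncol(df));
  } else {
    required <- requiredValidValues;
  }
  res <- rowMeans(df, na.rm = TRUE);
  res[nValid < required] <- returnIfInvalid;
  return(res);
}

dat <- data.frame(a = c(1, NA, 3),
                  b = c(2, NA, NA),
                  c = c(3, 4, NA));
### Require at least two valid values per row
byCount <- rowMeansIfValid(dat, requiredValidValues = 2);
### Require at least half of the values to be valid
byProportion <- rowMeansIfValid(dat, requiredValidValues = 0.5);
print(byCount);
print(byProportion);
```

In this toy dataset only the first row has two or more valid values, so the other two rows receive returnIfInvalid under both specifications.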
rosetta::means(mtcars$mpg, mtcars$disp, mtcars$wt);
rosetta::means(data=mtcars, 'mpg', 'disp', 'wt');
rosetta::sums(mtcars$mpg, mtcars$disp, mtcars$wt);
rosetta::sums(data=mtcars, 'mpg', 'disp', 'wt');
The oneway function wraps a number of analysis of variance functions into one convenient interface that is similar to the oneway anova command in SPSS.
oneway(
  y,
  x,
  posthoc = NULL,
  means = FALSE,
  fullDescribe = FALSE,
  levene = FALSE,
  plot = FALSE,
  digits = 2,
  omegasq = TRUE,
  etasq = TRUE,
  corrections = FALSE,
  pvalueDigits = 3,
  t = FALSE,
  conf.level = 0.95,
  posthocLetters = FALSE,
  posthocLetterAlpha = 0.05,
  overrideVarNames = NULL,
  silent = FALSE
)

## S3 method for class 'oneway'
print(
  x,
  digits = x$input$digits,
  pvalueDigits = x$input$pvalueDigits,
  na.print = "",
  ...
)

## S3 method for class 'oneway'
pander(
  x,
  digits = x$input$digits,
  pvalueDigits = x$input$pvalueDigits,
  headerStyle = "**",
  na.print = "",
  ...
)
y |
y has to be a numeric vector. |
x |
x has to be a vector that either is a factor or can be converted into one. |
posthoc |
Which post-hoc tests to conduct. Valid values are any correction methods in p.adjust.methods (at the time of writing of this document, "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"), as well as "tukey" and "games-howell". |
means |
Whether to show the means for the y variable in each of the groups determined by the x variable. |
fullDescribe |
If TRUE, not only the means are shown, but all statistics acquired through the 'describe' function in the 'psych' package are shown. |
levene |
Whether to show Levene's test for equality of variances (using
|
plot |
Whether to show a plot of the means of the y variable in each of the groups determined by the x variable. |
digits |
The number of digits to show in the output. |
omegasq |
Whether to show the omega squared effect size. |
etasq |
Whether to show the eta squared effect size (this is biased and generally advised against; omega squared is less biased). |
corrections |
Whether to show the corrections for unequal variances (Welch and Brown-Forsythe). |
pvalueDigits |
The number of digits to show for p-values; smaller p-values will be shown as <.001 or <.0001 etc. |
t |
Whether to transpose the dataframes with the means (if requested) and the anova results. This can be useful for blind people. |
conf.level |
Confidence level to use when computing the confidence interval for eta^2. Note that the function we use doubles the 'unconfidence' level to maintain consistency with the NHST value (see http://yatani.jp/HCIstats/ANOVA#RCodeOneWay, http://daniellakens.blogspot.nl/2014/06/calculating-confidence-intervals-for.html or Steiger, J. H. (2004). Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological methods, 9(2), 164-82. doi:10.1037/1082-989X.9.2.164). |
posthocLetters |
Whether to also compute and show the letters
signifying differences between groups when conducting post hoc tests. This
requires package |
posthocLetterAlpha |
The alpha to use when determining whether groups
have different means when using |
overrideVarNames |
Can be used to override the variable names (most useful in functions). |
silent |
Whether to show warnings and other diagnostic information or remain silent. |
na.print |
How to print missing values. |
... |
Any additional arguments are passed to the |
headerStyle |
The header pre- and suffix to use when pandering the result (useful when working with Markdown). |
A list of three elements:
input |
List with input arguments |
intermediate |
List of intermediate objects, such as the aov and Anova (from the car package) objects. |
output |
List with etasq, the effect size, and dat, a dataframe with the Oneway Anova results. |
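To make the two effect sizes controlled by etasq and omegasq concrete, the sketch below computes both from a base R ANOVA table using the standard textbook formulae. It is an illustration only and does not reproduce how oneway stores or prints its results; all variable names are chosen for this example.

```r
### Fit the oneway model and extract the ANOVA table
tab <- anova(lm(weight ~ Diet, data = ChickWeight));
ssBetween <- tab[["Sum Sq"]][1];
ssWithin <- tab[["Sum Sq"]][2];
dfBetween <- tab[["Df"]][1];
msWithin <- tab[["Mean Sq"]][2];

### Eta squared: proportion of the total sum of squares between groups
etaSq <- ssBetween / (ssBetween + ssWithin);

### Omega squared: a less biased estimate of the same proportion
omegaSq <- (ssBetween - dfBetween * msWithin) /
  (ssBetween + ssWithin + msWithin);

print(round(c(etasq = etaSq, omegasq = omegaSq), 3));
```

Omega squared is always somewhat smaller than eta squared, which reflects the upward bias of eta squared mentioned in the description of the etasq argument.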
To my knowledge the Brown-Forsythe correction was not yet available in R. I took this from the original paper (directed there by Field, 2014). Note that this is the corrected F value, not the Brown-Forsythe test for equality of variances!
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters [email protected]
Brown, M., & Forsythe, A. (1974). The small sample behavior of some statistics which test the equality of several means. Technometrics, 16(1), 129-132. https://doi.org/10.2307/1267501
Field, A. (2014) Discovering statistics using SPSS (4th ed.). London: Sage.
Steiger, J. H. (2004). Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological methods, 9(2), 164-82. doi:10.1037/1082-989X.9.2.164
### Do a oneway Anova
oneway(y=ChickWeight$weight, x=ChickWeight$Diet);

### Also show means and transpose the results
oneway(y=ChickWeight$weight, x=ChickWeight$Diet, means=TRUE, t=TRUE);
The rosetta::opts object contains three functions to set, get, and reset options used by the rosetta package. Use rosetta::opts$set to set options, rosetta::opts$get to get options, or rosetta::opts$reset to reset specific or all options to their default values.
opts
An object of class list of length 4.
It is normally not necessary to get or set rosetta options.
The following arguments can be passed:
For rosetta::opts$set, the dots can be used to specify the options to set, in the format option = value, for example, varViewCols = c("values", "level"). For rosetta::opts$reset, a list of options to be reset can be passed.
For rosetta::opts$set, the name of the option to set.
For rosetta::opts$get, the default value to return if the option has not been manually specified.
The following options can be set:
The order and names of the columns to include in the variable view.
Whether to show a warning if labeller labels are encountered.
### Get the default columns in the variable view
rosetta::opts$get(varViewCols);

### Set it to a custom version
rosetta::opts$set(varViewCols = c("values", "level"));

### Check that it worked
rosetta::opts$get(varViewCols);

### Reset this option to its default value
rosetta::opts$reset(varViewCols);

### Check that the reset worked, too
rosetta::opts$get(varViewCols);
This is a subset of the Party Panel 2015 dataset. Party Panel is an annual semi-panel determinant study among Dutch nightlife patrons, where every year, the determinants of another nightlife-related risk behavior are mapped. In 2015, determinants were measured of behaviors related to using highly dosed ecstasy pills.
data(pp15)
A data.frame with 128 columns and 829 rows.
Note that many rows contain missing values; the columns and rows were taken directly from the original Party Panel dataset, and represent all participants that made it past a given behavior.
The full dataset is publicly available through the Open Science Framework (https://osf.io/s4fmu/). Also see the GitLab repository (https://gitlab.com/partypanel) and the website at https://partypanel.eu.
data('pp15', package='rosetta');
rosetta::freq(pp15$gender);
Makes plot of Index of Moderated Mediation of gemm object
plotIMM(x, ...)
x |
object moderatedMediationSem |
... |
optional |
simple slope plots for each mediator and simple slopes parameter estimates
Makes 3D plots of Index of Moderated Mediation of gemm object
plotIMM3d(x, ...)
x |
results of gemm function |
... |
optional |
empty, directly plots all indices of mediation
Makes simple slope plots of gemm object
plotSS(x, ...)
x |
object moderatedMediationSem |
... |
optional |
simple slope plots for each mediator and simple slopes parameter estimates
This function is used by the 'oneway' function for oneway analysis of variance in case a user requests post-hoc tests using the Tukey or Games-Howell methods.
posthocTGH(
  y,
  x,
  method = c("games-howell", "tukey"),
  conf.level = 0.95,
  digits = 2,
  p.adjust = "none",
  formatPvalue = TRUE
)

## S3 method for class 'posthocTGH'
print(x, digits = x$input$digits, ...)
y |
y has to be a numeric vector. |
x |
x has to be a vector that either is a factor or can be converted into one. |
method |
Which post-hoc tests to conduct. Valid values are "tukey" and "games-howell". |
conf.level |
Confidence level of the confidence intervals. |
digits |
The number of digits to show in the output. |
p.adjust |
Any valid |
formatPvalue |
Whether to format the p values according to APA standards (i.e. replace all values lower than .001 with '<.001'). This only applies to the printing of the object, not to the way the p values are stored in the object. |
... |
Any additional arguments are passed on to the |
A list of three elements:
input |
List with input arguments |
intermediate |
List of intermediate objects. |
output |
List with two objects 'tukey' and 'games.howell', containing the outcomes for the respective post-hoc tests. |
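For readers who want to see what the Games-Howell procedure computes, here is a minimal base R sketch under the usual textbook definition: a Welch-type t statistic and degrees of freedom for each pair of groups, with p-values taken from the studentized range distribution. The helper gamesHowellSketch is hypothetical; it is not the code used by posthocTGH and omits the confidence intervals and optional p.adjust step.

```r
### Hypothetical helper (not part of rosetta): Games-Howell comparisons
gamesHowellSketch <- function(y, x) {
  x <- factor(x);
  groups <- split(y, x);
  n <- sapply(groups, length);
  m <- sapply(groups, mean);
  v <- sapply(groups, var);
  k <- nlevels(x);
  ### All pairs of group labels, one pair per column
  pairs <- combn(levels(x), 2);
  res <- apply(pairs, 2, function(p) {
    a <- p[1];
    b <- p[2];
    ### Squared standard error without pooling the variances
    se2 <- v[[a]] / n[[a]] + v[[b]] / n[[b]];
    tval <- abs(m[[b]] - m[[a]]) / sqrt(se2);
    ### Welch-Satterthwaite degrees of freedom
    df <- se2^2 / ((v[[a]] / n[[a]])^2 / (n[[a]] - 1) +
                   (v[[b]] / n[[b]])^2 / (n[[b]] - 1));
    ### p-value from the studentized range distribution
    pval <- ptukey(tval * sqrt(2), nmeans = k, df = df, lower.tail = FALSE);
    return(c(diff = m[[b]] - m[[a]], t = tval, df = df, p = pval));
  });
  return(data.frame(comparison = apply(pairs, 2, paste, collapse = "-"),
                    t(res)));
}

gh <- gamesHowellSketch(ChickWeight$weight, ChickWeight$Diet);
print(gh);
```

For comparison, the Tukey variant with pooled variances is available in base R through TukeyHSD(aov(weight ~ Diet, data = ChickWeight)).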
This function is based on a file that was once hosted at http://www.psych.yorku.ca/cribbie/6130/games_howell.R, but has been removed since. It was then adjusted for implementation in the userfriendlyscience package. Jeffrey Baggett needed the confidence intervals, and so emailed them, after which his updated function was used. In the meantime, it appears Aaron Schlegel (https://rpubs.com/aaronsc32) independently developed a version with confidence intervals and posted it on RPubs at https://rpubs.com/aaronsc32/games-howell-test.
Also, for some reason, p.adjust can be used to specify additional correction of p values. I'm not sure why I implemented this, but I'm not entirely sure it was a mistake either. Therefore, in userfriendlyscience version 0.6-2, the default of this setting changed from "holm" to "none" (also see https://stats.stackexchange.com/questions/83941/games-howell-post-hoc-test-in-r).
Gjalt-Jorn Peters (Open University of the Netherlands) & Jeff Baggett (University of Wisconsin - La Crosse)
Maintainer: Gjalt-Jorn Peters [email protected]
### Compute post-hoc statistics using the tukey method
posthocTGH(y=ChickWeight$weight, x=ChickWeight$Diet, method="tukey");

### Compute post-hoc statistics using the games-howell method
posthocTGH(y=ChickWeight$weight, x=ChickWeight$Diet);
Computes Index of moderated mediation of gemm object
prepIMM3d(M1, M2, parEst = parEst, i = 1)
M1 |
moderator of x-m path |
M2 |
moderator of m-y path |
parEst |
parameter estimates from lavaan results |
i |
index of vector of mediators names |
vector of index of moderated mediation with CI limits for a given mediator
Makes Index of Moderated Mediation plots
prepPlotIMM(
  data,
  xvar,
  yvar,
  mod,
  mvars,
  parEst,
  vdichotomous,
  modLevels,
  path = NULL
)
data |
data frame containing the variables of the model |
xvar |
predictor variable name |
yvar |
dependent variable name |
mod |
moderator name |
mvars |
vector of mediators names |
parEst |
parameter estimates from lavaan results |
vdichotomous |
indicates whether moderator is dichotomous (TRUE) |
modLevels |
levels of dichotomous moderator |
path |
which path is used |
empty, directly plots all simple slopes and all indices of mediation
Makes simple slope plots
prepPlotSS(
  data,
  xvar,
  yvar,
  mod,
  mvars,
  parEst,
  vdichotomous,
  modLevels,
  predLevels = NULL,
  xquant,
  yquant,
  path = NULL
)
data |
data frame containing the variables of the model |
xvar |
predictor variable name |
yvar |
dependent variable name |
mod |
moderator name |
mvars |
vector of mediators names |
parEst |
parameter estimates from lavaan results |
vdichotomous |
indicates whether moderator is dichotomous (TRUE) |
modLevels |
levels of dichotomous moderator |
predLevels |
levels of dichotomous predictor |
xquant |
quantiles of x |
yquant |
quantiles of y |
path |
which path is used |
empty, directly plots all simple slopes and all indices of mediation
print method of object of class gemm
## S3 method for class 'gemm'
print(x, ..., digits = 2, silence = FALSE)
x |
object of class gemm |
... |
additional pars |
digits |
number of digits |
silence |
boolean; if TRUE, output is not printed |
idSlug is a convenience function with swapped argument order.
randomSlug(x = 10, id = NULL, chars = c(letters, LETTERS, 0:9))

idSlug(id = NULL, x = 10, chars = c(letters, LETTERS, 0:9))
x |
Length of slug |
id |
If not NULL, prepended to slug (separated with a dash) as id; in that case, it is also wrapped in braces and a hash is added. |
chars |
Characters to sample from |
A character value.
randomSlug();
idSlug("identifier");
(car version)
This function is from the car package. Please see that help page for details: car::recode().
recode(
  var,
  recodes,
  as.factor,
  as.numeric = TRUE,
  levels,
  to.value = "=",
  interval = ":",
  separator = ";"
)
var |
numeric vector, character vector, or factor. |
recodes |
character string of recode specifications: see below. |
as.factor |
return a factor; default is |
as.numeric |
if |
levels |
an optional argument specifying the order of the levels in the returned factor; the default is to use the sort order of the level names. |
to.value |
The operator to separate old from new values, "=" by default; some other possibilities: "->", "~", "~>". Cannot include the interval operator (by default :) or the separator string (by default, ;), so, e.g., by default ":=>" is not allowed. The discussion in Details assumes the default "=". Use a non-default to.value if factor levels contain =. |
interval |
the operator used to denote numeric intervals, by default ":". The discussion in Details assumes the default ":". Use a non-default interval if factor levels contain :. |
separator |
the character string used to separate recode specifications, by default ";". The discussion in Details assumes the default ";". Use a non-default separator if factor levels contain ;. |
John Fox [email protected]
Fox, J. and Weisberg, S. (2019) An R Companion to Applied Regression, Third Edition, Sage.
x <- rep(1:3, 3)
x
rosetta::recode(x, "c(1,2)='A'; else='B'");
rosetta::recode(x, "1:2='A'; 3='B'");
The regr function wraps a number of linear regression functions into one convenient interface that provides similar output to the regression function in SPSS. It automatically provides confidence intervals and standardized coefficients. Note that this function is meant for teaching purposes, and therefore it's only for very basic regression analyses; for more functionality, use the base R function lm or e.g. the lme4 package.
regr(
  formula,
  data = NULL,
  conf.level = 0.95,
  digits = 2,
  pvalueDigits = 3,
  coefficients = c("raw", "scaled"),
  plot = FALSE,
  pointAlpha = 0.5,
  collinearity = FALSE,
  influential = FALSE,
  ci.method = c("widest", "r.con", "olkinfinn"),
  ci.method.note = FALSE,
  headingLevel = 3,
  env = parent.frame()
)

rosettaRegr_partial(
  x,
  digits = x$input$digits,
  pvalueDigits = x$input$pvalueDigits,
  headingLevel = x$input$headingLevel,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

## S3 method for class 'rosettaRegr'
knit_print(
  x,
  digits = x$input$digits,
  headingLevel = x$input$headingLevel,
  pvalueDigits = x$input$pvalueDigits,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

## S3 method for class 'rosettaRegr'
print(
  x,
  digits = x$input$digits,
  pvalueDigits = x$input$pvalueDigits,
  headingLevel = x$input$headingLevel,
  forceKnitrOutput = FALSE,
  ...
)

## S3 method for class 'rosettaRegr'
pander(x, digits = x$input$digits, pvalueDigits = x$input$pvalueDigits, ...)
formula |
The formula of the regression analysis, of the form |
data |
If the terms in the formula aren't vectors but variable names, this should be the dataframe where those variables are stored. |
conf.level |
The confidence of the confidence interval around the regression coefficients. |
digits |
Number of digits to round the output to. |
pvalueDigits |
The number of digits to show for p-values; smaller p-values will be shown as <.001 or <.0001 etc. |
coefficients |
Which coefficients to show; can be "raw" to only show the raw (unstandardized) coefficients; "scaled" to only show the scaled (standardized) coefficients, or c("raw", "scaled") to show both. |
plot |
For regression analyses with only one predictor (also sometimes confusingly referred to as 'univariate' regression analyses), scatterplots with regression lines and their standard errors can be produced. |
pointAlpha |
The alpha channel (transparency, or rather: 'opaqueness') of the points drawn in the plot. |
collinearity |
Whether to compute and show collinearity diagnostics (specifically, the tolerance (1 - R^2, where R^2 is the one obtained when regressing each predictor on all the other predictors) and the Variance Inflation Factor (VIF), which is the reciprocal of the tolerance, i.e. VIF = 1 / tolerance). |
influential |
Whether to compute diagnostics for influential cases.
These are stored in the returned object in the |
ci.method , ci.method.note
|
Which method to use for the confidence interval around R squared, and whether to display a note about this choice. |
headingLevel |
The number of hashes to print in front of the headings when printing while knitting |
env |
The environment where to evaluate the formula. |
x |
The object to print (i.e. as produced by |
echoPartial |
Whether to show the executed code in the R Markdown
partial ( |
partialFile |
This can be used to specify a custom partial file. The
file will have object |
quiet |
Passed on to |
... |
Any additional arguments are passed to the default print method
by the print method, and to |
forceKnitrOutput |
Force knitr output. |
A list of three elements:
input |
List with input arguments |
intermediate |
List of intermediate objects, such as the lm and confint objects. |
output |
List with two dataframes, one with the raw coefficients, and one with the scaled coefficients. |
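The collinearity diagnostics described for the collinearity argument (tolerance and VIF) follow directly from their definitions, as the base R sketch below shows. The model and variable names are chosen for illustration only, and this is not how regr computes or stores these values internally.

```r
### Predictors of the illustrative model mpg ~ wt + hp + qsec
predictors <- c("wt", "hp", "qsec");

### Tolerance: 1 - R^2 from regressing each predictor on all the others
tolerance <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p);
  fit <- lm(reformulate(others, response = p), data = mtcars);
  return(1 - summary(fit)$r.squared);
});

### Variance Inflation Factor: the reciprocal of the tolerance
vif <- 1 / tolerance;

print(round(cbind(tolerance, vif), 2));
```

A tolerance close to 0 (and therefore a large VIF) means that a predictor is almost completely explained by the other predictors in the model.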
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters [email protected]
### Do a simple regression analysis
rosetta::regr(age ~ circumference, data=Orange);

### Show more digits for the p-value
rosetta::regr(Orange$age ~ Orange$circumference, pvalueDigits=18);

## Not run:
### An example with an interaction term, showing in the
### viewer
rosetta::rosettaRegr_partial(
  rosetta::regr(
    mpg ~ wt + hp + wt:hp,
    data=mtcars,
    coefficients = "raw",
    plot=TRUE,
    collinearity=TRUE
  )
);
## End(Not run)
The reliability() analysis is the only one most users will need. It tries to apply best practices by, as much as possible, complementing point estimates with confidence intervals.
reliability(
  data,
  items = NULL,
  scaleStructure = TRUE,
  descriptives = FALSE,
  itemLevel = FALSE,
  scatterMatrix = FALSE,
  scatterMatrixArgs = list(progress = FALSE),
  digits = 2,
  conf.level = 0.95,
  itemLabels = NULL,
  itemOmittedCorsWithRest = FALSE,
  itemOmittedCorsWithTotal = FALSE,
  alphaOmittedCIs = FALSE,
  omegaFromMBESS = FALSE,
  omegaFromPsych = TRUE,
  ordinal = FALSE,
  headingLevel = 3,
  ...
)

rosettaReliability_partial(
  x,
  digits = x$digits,
  headingLevel = x$headingLevel,
  printPlots = TRUE,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

## S3 method for class 'rosettaReliability'
knit_print(
  x,
  digits = x$digits,
  headingLevel = x$headingLevel,
  printPlots = TRUE,
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

## S3 method for class 'rosettaReliability'
print(
  x,
  digits = x$digits,
  headingLevel = x$headingLevel,
  forceKnitrOutput = FALSE,
  printPlots = TRUE,
  ...
)
data |
The data frame |
items |
The items (if omitted, all columns are used) |
scaleStructure |
Whether to include scale-level estimates using
|
descriptives |
Whether to include mean and standard deviation estimates and their confidence intervals |
itemLevel |
Whether to include item-level internal consistency estimates |
scatterMatrix , scatterMatrixArgs
|
Whether to produce a scatter matrix,
and the arguments to pass to the |
digits |
The number of digits to round the result to |
conf.level |
The confidence level of confidence intervals |
itemLabels |
Optionally, labels to use for the items (optionally, named,
with the names corresponding to the |
itemOmittedCorsWithRest , itemOmittedCorsWithTotal
|
Whether to include each item's correlations with, respectively, the scale with that item omitted, or the full scale. |
alphaOmittedCIs |
Whether to include the confidence intervals for the Coefficient Alpha estimates with the item omitted. |
omegaFromMBESS , omegaFromPsych
|
Whether to include omega from |
ordinal |
Whether to set |
headingLevel |
The number of hashes to print in front of the headings when printing while knitting |
... |
Any additional arguments are passed to |
x |
The object to print |
printPlots |
Whether to print plots (can be used to suppress plots, which can be useful sometimes) |
echoPartial |
Whether to show the executed code in the R Markdown
partial ( |
partialFile |
This can be used to specify a custom partial file. The
file will have object |
quiet |
Passed on to |
forceKnitrOutput |
Force knitr output |
The rosettaReliability object that is returned has its own print() method, that, when using knitr, will use the rmdpartials package to insert an RMarkdown partial. That partial is created using rosettaReliability_partial(), which is also called by a specific knit_print() method.
An object with all results
### These examples aren't run during tests
### because they can take quite long
## Not run:
### Simple example with only main reliability results
data(pp15, package="rosetta");
rosetta::reliability(
  pp15,
  c(
    "highDose_AttGeneral_good",
    "highDose_AttGeneral_prettig",
    "highDose_AttGeneral_slim",
    "highDose_AttGeneral_gezond",
    "highDose_AttGeneral_spannend"
  )
);

### More extensive example with an RMarkdown partial that
### displays in the viewer
rosetta::rosettaReliability_partial(
  rosetta::reliability(
    attitude,
    descriptives = TRUE,
    itemLevel = TRUE,
    scatterMatrix = TRUE
  )
);
## End(Not run)
Repeat a string a number of times
repeatStr(n = 1, str = " ")
n , str
|
Normally, respectively the frequency with which to repeat the string and the string to repeat; but the order of the inputs can be switched as well. |
A character vector of length 1.
### 10 spaces:
repStr(10);
### Three euro symbols:
repStr("\u20ac", 3);
rMatrix provides a correlation matrix with confidence intervals and a p-value adjusted for multiple testing.
rMatrix(
  dat,
  x,
  y = NULL,
  conf.level = 0.95,
  correction = "fdr",
  digits = 2,
  pValueDigits = 3,
  colspace = 2,
  rowspace = 0,
  colNames = "numbers"
)

## S3 method for class 'rMatrix'
print(
  x,
  digits = x$digits,
  pValueDigits = x$pValueDigits,
  colNames = x$colNames,
  ...
)
dat |
A dataframe containing the relevant variables. |
x |
Vector of 1+ variable names. |
y |
Vector of 1+ variable names; if this is left empty, a symmetric matrix is created; if this is filled, the matrix will have the x variables defining the rows and the y variables defining the columns. |
conf.level |
The confidence level of the confidence intervals. |
correction |
Correction for multiple testing: one element of the vector c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"). NOTE: the p-values are corrected for multiple testing; the confidence intervals are not (yet :-)). |
digits |
The number of digits to use when printing the results. |
pValueDigits |
Determines the number of digits to use when displaying p values. P-values that are too small will be shown as p<.001 or p<.00001 etc. |
colspace |
Number of spaces between columns |
rowspace |
Number of empty lines between table rows (note: one table row takes up 2 printed lines). |
colNames |
colNames can be "numbers" or "names". "names" causes the variable names to be printed in the heading; "numbers" causes the rows to be numbered and the numbers to be printed in the heading. |
... |
Additional arguments are ignored. |
rMatrix provides a symmetric or asymmetric matrix of correlations, their confidence intervals, and p-values. The p-values can be corrected for multiple testing.
An rMatrix object that, when printed, shows the correlation matrix.
An object with the input and several output variables. Most notably a number of matrices:
r |
Pearson r values. |
parameter |
Degrees of freedom. |
ci.lo |
Lower bound of Pearson r confidence interval. |
ci.hi |
Upper bound of Pearson r confidence interval. |
p.raw |
Original p-values. |
p.adj |
p-values adjusted for multiple testing. |
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters
rMatrix(mtcars, x=c('disp', 'hp', 'drat'))
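To make the returned quantities more concrete, the following base-R sketch computes the same kinds of values (Pearson r, degrees of freedom, confidence bounds, raw and adjusted p-values) for the three variables from the example. It only uses `cor.test` and `p.adjust` from the stats package; it assumes the adjustment is applied across the unique variable pairs, which may differ from what rMatrix does internally.

```r
vars <- c("disp", "hp", "drat")
pairsIdx <- combn(vars, 2)   # every unique pair of variables, one per column

results <- do.call(rbind, lapply(seq_len(ncol(pairsIdx)), function(i) {
  a <- pairsIdx[1, i]
  b <- pairsIdx[2, i]
  # Pearson correlation test with a 95% confidence interval
  test <- cor.test(mtcars[[a]], mtcars[[b]], conf.level = 0.95)
  data.frame(
    x = a,
    y = b,
    r = unname(test$estimate),
    parameter = unname(test$parameter),   # degrees of freedom
    ci.lo = test$conf.int[1],
    ci.hi = test$conf.int[2],
    p.raw = test$p.value
  )
}))

# Adjust the raw p-values for multiple testing (assumption: across unique pairs)
results$p.adj <- p.adjust(results$p.raw, method = "fdr")

print(results, digits = 2)
```

As in the note on the correction argument, only the p-values are adjusted here; the confidence intervals are left as computed by `cor.test`.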
scatterMatrix produces a matrix with jittered scatterplots, histograms, and correlation coefficients.
scatterMatrix( dat, items = NULL, itemLabels = NULL, plotSize = 180, sizeMultiplier = 1, pointSize = 1, axisLabels = "none", normalHist = TRUE, progress = NULL, theme = ggplot2::theme_minimal(), hideGrid = TRUE, conf.level = 0.95, ... )
dat |
A dataframe containing the items in the scale. All variables in this dataframe will be used if items is NULL. |
items |
If not NULL, this should be a character vector with the names of the variables in the dataframe that represent items in the scale. |
itemLabels |
Optionally, labels to use for the items (optionally named, with the names corresponding to the items). |
plotSize |
Size of the final plot in millimeters. |
sizeMultiplier |
Allows more flexible control over the size of the plot elements |
pointSize |
Size of the points in the scatterplots |
axisLabels |
Passed to ggpairs function to set axisLabels. |
normalHist |
Whether to use the default ggpairs histogram on the
diagonal of the scattermatrix, or whether to use the |
progress |
Whether to show a progress bar; set to |
theme |
The ggplot2 theme to use. |
hideGrid |
Whether to hide the gridlines in the plot. |
conf.level |
The confidence level of confidence intervals |
... |
Additional arguments for |
An object with the input and several output variables. Most notably:
output$scatterMatrix |
A scattermatrix with histograms on the diagonal and correlation coefficients in the upper right half. |
### Note: the 'not run' is simply because running takes a lot of time,
### but these examples are all safe to run!
## Not run:
### Generate a datafile to use
exampleData <- data.frame(item1=rnorm(100));
exampleData$item2 <- exampleData$item1+rnorm(100);
exampleData$item3 <- exampleData$item1+rnorm(100);
exampleData$item4 <- exampleData$item2+rnorm(100);
exampleData$item5 <- exampleData$item2+rnorm(100);
### Use all items
scatterMatrix(dat=exampleData);
## End(Not run)
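For readers without ggplot2 or GGally at hand, a rough base-graphics approximation of such a matrix (jittered scatterplots, histograms on the diagonal, correlations in the upper right half) can be built with `pairs()`. This is a sketch modelled on the panel functions in the `pairs` help page, not the scatterMatrix implementation; `panelHist` and `panelCor` are hypothetical helper names.

```r
set.seed(1)
exampleData <- data.frame(item1 = rnorm(100))
exampleData$item2 <- exampleData$item1 + rnorm(100)
exampleData$item3 <- exampleData$item1 + rnorm(100)

# Hypothetical panel helper: histogram on the diagonal
panelHist <- function(x, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  heights <- h$counts / max(h$counts)
  nBreaks <- length(h$breaks)
  rect(h$breaks[-nBreaks], 0, h$breaks[-1], heights, col = "grey80")
}

# Hypothetical panel helper: Pearson correlation in the upper half
panelCor <- function(x, y, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.5, formatC(cor(x, y), digits = 2, format = "f"))
}

pairs(
  exampleData,
  lower.panel = function(x, y, ...) points(jitter(x), jitter(y), ...),
  upper.panel = panelCor,
  diag.panel = panelHist
)
```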
This function is intended to provide a very easy interface to generating pretty (and pretty versatile) ggplot2::ggplot() scatter plots.
scatterPlot( x, y, pointsize = 3, theme = theme_bw(), regrLine = FALSE, regrCI = FALSE, regrLineCol = "blue", regrCIcol = regrLineCol, regrCIalpha = 0.25, width = 0, height = 0, position = "identity", xVarName = NULL, yVarName = NULL, ... )
x |
The variable to plot on the X axis. |
y |
The variable to plot on the Y axis. |
pointsize |
The size of the points in the scatterplot. |
theme |
The theme to use. |
regrLine |
Whether to show the regression line. |
regrCI |
Whether to display the confidence interval around the regression line. |
regrLineCol |
The color of the regression line. |
regrCIcol |
The color of the confidence interval around the regression line. |
regrCIalpha |
The alpha value (transparency) of the confidence interval around the regression line. |
width |
If |
height |
If |
position |
Whether to 'jitter' the points (adding some random noise to change their location slightly, used to prevent overplotting). Set to 'jitter' to jitter the points. |
xVarName, yVarName |
Can be used to manually specify the names of the variables on the x and y axes. |
... |
Any additional arguments are passed to |
Note that if position is set to 'jitter', unless width and/or height is set to a non-zero value, there will still not be any jittering.
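Why a zero width or height produces no jittering can be illustrated in base R. The sketch below assumes that jittering adds uniform noise of at most the given amount in either direction; `jitterSketch` is a hypothetical helper, not a function from rosetta or ggplot2.

```r
# Hypothetical helper: add uniform noise within +/- amount to each value
jitterSketch <- function(values, amount = 0) {
  values + runif(length(values), min = -amount, max = amount)
}

set.seed(42)
x <- c(1, 2, 3, 4)

# With an amount of zero the noise is zero, so the points do not move
identical(jitterSketch(x, amount = 0), x)   # TRUE

# With a non-zero amount each point shifts by at most that amount
shifted <- jitterSketch(x, amount = 0.2)
max(abs(shifted - x)) <= 0.2                # TRUE
```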
A ggplot2::ggplot() plot is returned.
### A simple scatter plot
rosetta::scatterPlot(
  mtcars$mpg, mtcars$hp
);
### The same scatter plot, now with a regression line
### and its confidence interval added.
rosetta::scatterPlot(
  mtcars$mpg, mtcars$hp,
  regrLine=TRUE,
  regrCI=TRUE
);
This function provides an overview of the variables in a dataframe, allowing efficient inspection of the factor levels, ranges for numeric variables, and numbers of missing values.
varView(
  data,
  columns = names(data),
  varViewCols = rosetta::opts$get(varViewCols),
  varViewRownames = TRUE,
  maxLevels = 10,
  truncLevelsAt = 50,
  showLabellerWarning = rosetta::opts$get(showLabellerWarning),
  output = rosetta::opts$get("tableOutput")
)

## S3 method for class 'rosettaVarView'
print(x, output = attr(x, "output"), ...)
data |
The dataframe containing the variables to view. |
columns |
The columns to include. |
varViewCols |
The columns of the variable view. |
varViewRownames |
Whether to set the variable names as row names of the variable view dataframe that is returned. |
maxLevels |
For factors, the maximum number of levels to show. |
truncLevelsAt |
For factor levels, the number of characters at which to truncate. |
showLabellerWarning |
Whether to show a warning if labeller labels are encountered. |
output |
A character vector containing one or more of
" |
x |
The varView data frame to print. |
... |
Any additional arguments are passed along to
the |
A dataframe with the variable view.
Gjalt-Jorn Peters & Melissa Gordon Wolf
### The default variable view
rosetta::varView(iris);
### Only for a few variables in the dataset
rosetta::varView(iris, columns=c("Sepal.Length", "Species"));
### Set some variable and value labels using the `labelled`
### standard, which is also used by `haven`
dat <- iris;
attr(dat$Sepal.Length, "label") <- "Sepal length";
attr(dat$Sepal.Length, "labels") <- c('one' = 1, 'two' = 2, 'three' = 3);
### varView automatically recognizes and shows these, adding
### a 'label' column
rosetta::varView(dat);
### You can also specify that you only want to see some columns
### in the variable view
rosetta::varView(dat, varViewCols = c('label', 'values', 'level'));
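The kind of overview described for varView (factor levels, ranges of numeric variables, numbers of missing values) can be approximated in base R. The sketch below is a simplified illustration under that description only; `varOverviewSketch` and its column names are hypothetical and do not reproduce the actual varView output.

```r
# Hypothetical helper for illustration; not the rosetta implementation
varOverviewSketch <- function(data, maxLevels = 10) {
  rows <- lapply(names(data), function(varName) {
    column <- data[[varName]]
    if (is.factor(column)) {
      # Show at most maxLevels factor levels
      values <- paste(head(levels(column), maxLevels), collapse = ", ")
    } else if (is.numeric(column)) {
      # Show the observed range, ignoring missing values
      values <- paste(range(column, na.rm = TRUE), collapse = " - ")
    } else {
      values <- ""
    }
    data.frame(
      variable = varName,
      class = class(column)[1],
      values = values,
      missing = sum(is.na(column)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

varOverviewSketch(iris)
```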
vecTxtQ, vecTxtB, and vecTxtM are convenience functions with default quotes that can be useful when working in R Markdown documents.
vecTxt(
  vector,
  delimiter = ", ",
  useQuote = "",
  firstDelimiter = NULL,
  lastDelimiter = " & ",
  firstElements = 0,
  lastElements = 1,
  lastHasPrecedence = TRUE
)

vecTxtQ(vector, useQuote = "'", ...)

vecTxtB(vector, useQuote = "`", ...)

vecTxtM(vector, useQuote = "$", ...)
vector |
The vector to process. |
delimiter, firstDelimiter, lastDelimiter |
The delimiters to use for, respectively, the middle elements, the first firstElements elements, and the last lastElements elements. |
useQuote |
This character string is pre- and appended to all elements;
so use this to quote all elements ( |
firstElements, lastElements |
The number of elements for which to use the first and the last delimiter, respectively. |
lastHasPrecedence |
If the vector is very short, it's possible that the
sum of firstElements and lastElements is larger than the vector length. In
that case, downwardly adjust the number of elements to separate with the
first delimiter ( |
... |
Any additional arguments to vecTxtQ, vecTxtB, or vecTxtM are passed on to vecTxt. |
A character vector of length 1.
vecTxtQ(names(mtcars));
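The default behaviour (a regular delimiter between elements and a different delimiter before the last one) can be sketched in base R as follows. This is a simplified illustration that only covers the defaults firstElements = 0 and lastElements = 1; `vecTxtSketch` is a hypothetical name, not the package's implementation.

```r
# Hypothetical helper for illustration; handles only the default case of
# one "last" delimiter and no special "first" delimiter
vecTxtSketch <- function(vector, delimiter = ", ", useQuote = "",
                         lastDelimiter = " & ") {
  # Prepend and append the quote string to every element
  quoted <- paste0(useQuote, vector, useQuote)
  n <- length(quoted)
  if (n < 2) {
    return(paste(quoted, collapse = ""))
  }
  # Join all but the last element, then attach the last one
  paste0(paste(quoted[-n], collapse = delimiter), lastDelimiter, quoted[n])
}

vecTxtSketch(c("a", "b", "c"))                  # "a, b & c"
vecTxtSketch(c("mpg", "cyl"), useQuote = "'")   # "'mpg' & 'cyl'"
```

Passing a backtick or a dollar sign as useQuote gives the same kind of result as the vecTxtB and vecTxtM variants described above.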