Title: | Data Menu for Radiant: Business Analytics using R and Shiny |
---|---|
Description: | The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application. |
Authors: | Vincent Nijs [aut, cre], Niklas von Hertzen [aut] (html2canvas library) |
Maintainer: | Vincent Nijs <[email protected]> |
License: | AGPL-3 | file LICENSE |
Version: | 1.6.7 |
Built: | 2024-10-24 04:29:55 UTC |
Source: | CRAN |
Convenience function to add a class
add_class(x, cl)
x |
Object |
cl |
Vector of class labels to add |
foo <- "some text" %>% add_class("text")
foo <- "some text" %>% add_class(c("text", "another class"))
Convenience function to add a markdown description to a data.frame
add_description(df, md = "", path = "")
df |
A data.frame or tibble |
md |
Data description in markdown format |
path |
Path to a text file with the data description in markdown format |
See also register
if (interactive()) {
  mt <- mtcars |>
    add_description(md = "# MTCARS\n\nThis data.frame contains information on ...")
  describe(mt)
}
Arrange data with user-specified expression
arrange_data(dataset, expr = NULL)
dataset |
Data frame to arrange |
expr |
Expression used to arrange rows from the specified dataset |
Arrange data, likely in combination with slicing
Arranged data frame
Wrapper for as.character
as_character(x)
x |
Input vector |
Distance in kilometers or miles between two locations based on lat-long. Function based on http://www.movable-type.co.uk/scripts/latlong.html. Uses the haversine formula
as_distance( lat1, long1, lat2, long2, unit = "km", R = c(km = 6371, miles = 3959)[[unit]] )
lat1 |
Latitude of location 1 |
long1 |
Longitude of location 1 |
lat2 |
Latitude of location 2 |
long2 |
Longitude of location 2 |
unit |
Measure kilometers ("km", default) or miles ("miles") |
R |
Radius of the earth |
Distance between two points
as_distance(32.8245525, -117.0951632, 40.7033127, -73.979681, unit = "km")
as_distance(32.8245525, -117.0951632, 40.7033127, -73.979681, unit = "miles")
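The haversine formula named above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name haversine_sketch is hypothetical, and the radius values are the ones shown in the usage line.

```r
# Hypothetical helper, not radiant.data's implementation: great-circle
# distance via the haversine formula, using only base R.
haversine_sketch <- function(lat1, long1, lat2, long2, R = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * R * asin(pmin(1, sqrt(a)))
}

# Same coordinates as the examples above; R = 3959 switches the result to miles
km <- haversine_sketch(32.8245525, -117.0951632, 40.7033127, -73.979681)
miles <- haversine_sketch(32.8245525, -117.0951632, 40.7033127, -73.979681, R = 3959)
```

Because the radius is the only unit-dependent quantity, the two results differ by exactly the ratio of the two radii.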
Convert input in day-month-year format to date
as_dmy(x)
x |
Input variable |
Date variable of class Date
as_dmy("1-2-2014")
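The order of the fields matters for an ambiguous input such as "1-2-2014". The sketch below uses base R's as.Date with explicit format strings, rather than whatever parser as_dmy relies on, to show how the day-month-year and month-day-year readings of the same string differ.

```r
# Base R only: the format string decides which field is the day
x <- "1-2-2014"
day_first <- as.Date(x, format = "%d-%m-%Y")   # read as 1 February 2014
month_first <- as.Date(x, format = "%m-%d-%Y") # read as 2 January 2014

format(day_first, "%Y-%m-%d")
format(month_first, "%Y-%m-%d")
```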
Convert input in day-month-year-hour-minute format to date-time
as_dmy_hm(x)
x |
Input variable |
Date-time variable of class POSIXct
as_dmy_hm("1-1-2014 12:15")
Convert input in day-month-year-hour-minute-second format to date-time
as_dmy_hms(x)
x |
Input variable |
Date-time variable of class POSIXct
as_dmy_hms("1-1-2014 12:15:01")
Wrapper for lubridate's as.duration function. Result converted to numeric
as_duration(x)
x |
Time difference |
Wrapper for factor with ordered = FALSE
as_factor(x, ordered = FALSE)
x |
Input vector |
ordered |
Order factor levels (TRUE, FALSE) |
Convert input in hour-minute format to time
as_hm(x)
x |
Input variable |
Time variable of class Period
as_hm("12:45")
## Not run:
as_hm("12:45") %>% minute()
## End(Not run)
Convert input in hour-minute-second format to time
as_hms(x)
x |
Input variable |
Time variable of class Period
as_hms("12:45:00")
## Not run:
as_hms("12:45:00") %>% hour()
as_hms("12:45:00") %>% second()
## End(Not run)
Convert variable to integer avoiding potential issues with factors
as_integer(x)
x |
Input variable |
Integer
as_integer(rnorm(10))
as_integer(letters)
as_integer(as.factor(5:10))
as.integer(as.factor(5:10))
as_integer(c("a", "b"))
as_integer(c("0", "1"))
as_integer(as.factor(c("0", "1")))
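The "potential issues with factors" mentioned above come from how base R stores factors: as.integer applied to a factor returns the internal level codes, not the values shown as labels. A minimal base R illustration (it does not call as_integer itself):

```r
f <- as.factor(5:10)

# as.integer() on a factor returns the level codes 1, 2, ..., 6
codes <- as.integer(f)

# Converting to character first recovers the original values 5, 6, ..., 10
values <- as.integer(as.character(f))
```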
Convert input in month-day-year format to date
as_mdy(x)
x |
Input variable |
Use as.character if x is a factor
Date variable of class Date
as_mdy("2-1-2014")
## Not run:
as_mdy("2-1-2014") %>% month(label = TRUE)
as_mdy("2-1-2014") %>% week()
as_mdy("2-1-2014") %>% wday(label = TRUE)
## End(Not run)
Convert input in month-day-year-hour-minute format to date-time
as_mdy_hm(x)
x |
Input variable |
Date-time variable of class POSIXct
as_mdy_hm("1-1-2014 12:15")
Convert input in month-day-year-hour-minute-second format to date-time
as_mdy_hms(x)
x |
Input variable |
Date-time variable of class POSIXct
as_mdy_hms("1-1-2014 12:15:01")
Convert variable to numeric avoiding potential issues with factors
as_numeric(x)
x |
Input variable |
Numeric
as_numeric(rnorm(10))
as_numeric(letters)
as_numeric(as.factor(5:10))
as.numeric(as.factor(5:10))
as_numeric(c("a", "b"))
as_numeric(c("3", "4"))
as_numeric(as.factor(c("3", "4")))
Convert input in year-month-day format to date
as_ymd(x)
x |
Input variable |
Date variable of class Date
as_ymd("2013-1-1")
Convert input in year-month-day-hour-minute format to date-time
as_ymd_hm(x)
x |
Input variable |
Date-time variable of class POSIXct
as_ymd_hm("2014-1-1 12:15")
Convert input in year-month-day-hour-minute-second format to date-time
as_ymd_hms(x)
x |
Input variable |
Date-time variable of class POSIXct
as_ymd_hms("2014-1-1 12:15:01")
## Not run:
as_ymd_hms("2014-1-1 12:15:01") %>% as.Date()
as_ymd_hms("2014-1-1 12:15:01") %>% month()
as_ymd_hms("2014-1-1 12:15:01") %>% hour()
## End(Not run)
Avengers
data(avengers)
A data frame with 7 rows and 4 variables
List of avengers. The dataset is used to illustrate data merging / joining. Description provided in attr(avengers,"description")
Center
center(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
If x is a numeric variable return x - mean(x)
Choose a directory interactively
choose_dir(...)
... |
Arguments passed to utils::choose.dir on Windows |
Open a file dialog to select a directory. Uses JavaScript on Mac, utils::choose.dir on Windows, and dirname(file.choose()) on Linux
Path to the directory selected by the user
## Not run:
choose_dir()
## End(Not run)
Choose files interactively
choose_files(...)
... |
Strings used to indicate which file types should be available for selection (e.g., "csv" or "pdf") |
Open a file dialog. Uses JavaScript on Mac, utils::choose.files on Windows, and file.choose() on Linux
Vector of paths to files selected by the user
## Not run:
choose_files("pdf", "csv")
## End(Not run)
Labels for confidence intervals
ci_label(alt = "two.sided", cl = 0.95, dec = 3)
alt |
Type of hypothesis ("two.sided","less","greater") |
cl |
Confidence level |
dec |
Number of decimals to show |
A character vector with labels for a confidence interval
ci_label("less", .95)
ci_label("two.sided", .95)
ci_label("greater", .9)
Values at confidence levels
ci_perc(dat, alt = "two.sided", cl = 0.95)
dat |
Data |
alt |
Type of hypothesis ("two.sided","less","greater") |
cl |
Confidence level |
A vector with values at a confidence level
ci_perc(0:100, "less", .95)
ci_perc(0:100, "greater", .95)
ci_perc(0:100, "two.sided", .80)
Combine datasets using dplyr's bind and join functions
combine_data( x, y, by = "", add = "", type = "inner_join", data_filter = "", arr = "", rows = NULL, envir = parent.frame(), ... )
x |
Dataset |
y |
Dataset to combine with x |
by |
Variables used to combine 'x' and 'y' |
add |
Variables to add from 'y' |
type |
The main bind and join types from the dplyr package are provided. inner_join returns all rows from x with matching values in y, and all columns from x and y. If there are multiple matches between x and y, all match combinations are returned. left_join returns all rows from x, and all columns from x and y. If there are multiple matches between x and y, all match combinations are returned. right_join is equivalent to a left join for datasets y and x. full_join combines two datasets, keeping rows and columns that appear in either. semi_join returns all rows from x with matching values in y, keeping just columns from x. A semi join differs from an inner join because an inner join will return one row of x for each matching row of y, whereas a semi join will never duplicate rows of x. anti_join returns all rows from x without matching values in y, keeping only columns from x. bind_rows and bind_cols are also included, as are intersect, union, and setdiff. See https://radiant-rstats.github.io/docs/data/combine.html for further details |
data_filter |
Expression used to filter the dataset. This should be a string (e.g., "price > 10000") |
arr |
Expression to arrange (sort) the data on (e.g., "color, desc(price)") |
rows |
Rows to select from the specified dataset |
envir |
Environment to extract data from |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/combine.html for an example in Radiant
Combined dataset
avengers %>% combine_data(superheroes, type = "bind_cols")
combine_data(avengers, superheroes, type = "bind_cols")
avengers %>% combine_data(superheroes, type = "bind_rows")
avengers %>% combine_data(superheroes, add = "publisher", type = "bind_rows")
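The differences between the join types described above are easiest to see on a tiny example. The sketch below uses base R's merge and row subsetting in place of the dplyr functions that combine_data wraps; the data frames x and y are hypothetical toy data, not the avengers or superheroes datasets.

```r
# Toy data: id 3 appears twice in y, id 1 only in x, id 4 only in y
x <- data.frame(id = c(1, 2, 3), a = c("x1", "x2", "x3"))
y <- data.frame(id = c(2, 3, 3, 4), b = c("y1", "y2", "y3", "y4"))

inner <- merge(x, y, by = "id")                # matched rows, all match combinations
left  <- merge(x, y, by = "id", all.x = TRUE)  # every row of x, NA where y has no match
full  <- merge(x, y, by = "id", all = TRUE)    # rows that appear in either dataset
semi  <- x[x$id %in% y$id, ]                   # rows of x with a match, never duplicated
anti  <- x[!x$id %in% y$id, ]                  # rows of x without a match

# Row counts per join type
sapply(list(inner = inner, left = left, full = full, semi = semi, anti = anti), nrow)
```

The inner join returns id 3 twice because it matches two rows of y, while the semi join keeps that row of x only once and adds no columns from y.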
Source all package functions
copy_all(.from)
.from |
The package to pull the function from |
Equivalent of source with local=TRUE for all package functions. Adapted from functions by smbache, author of the import package. See https://github.com/rticulate/import/issues/4/ for a discussion. This function will be deprecated when (if) it is included in https://github.com/rticulate/import/
copy_all(radiant.data)
Copy attributes from one object to another
copy_attr(to, from, attr)
to |
Object to copy attributes to |
from |
Object to copy attributes from |
attr |
Vector of attributes. If missing all attributes will be copied |
Source for package functions
copy_from(.from, ...)
.from |
The package to pull the function from |
... |
Functions to pull |
Equivalent of source with local=TRUE for package functions. Written by smbache, author of the import package. See https://github.com/rticulate/import/issues/4/ for a discussion. This function will be deprecated when (if) it is included in https://github.com/rticulate/import/
copy_from(radiant.data, get_data)
Coefficient of variation
cv(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Coefficient of variation
cv(runif(100))
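The coefficient of variation is conventionally the standard deviation divided by the mean. A minimal base R sketch under that assumption follows; cv_sketch is a hypothetical name, and the package function may handle edge cases such as a zero mean differently.

```r
# Hypothetical helper: standard deviation relative to the mean
cv_sketch <- function(x, na.rm = TRUE) {
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# The missing value is dropped before both statistics are computed
x <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)
cv_sketch(x)
```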
Deregister a data.frame or list in Radiant
deregister( dataset, shiny = shiny::getDefaultReactiveDomain(), envir = r_data, info = r_info )
dataset |
String containing the name of the data.frame to deregister |
shiny |
Check if function is called from a shiny application |
envir |
Environment to remove data from |
info |
Reactive list with information about available data in radiant |
Show dataset description
describe(dataset, envir = parent.frame())
dataset |
Dataset with "description" attribute |
envir |
Environment to extract data from |
Show dataset description, if available, in html form in Rstudio viewer or the default browser. The description should be in markdown format, attached to a data.frame as an attribute with the name "description"
Diamond prices
data(diamonds)
A data frame with 3000 rows and 10 variables
A sample of 3,000 from the diamonds dataset bundled with ggplot2. Description provided in attr(diamonds,"description")
Does a vector have non-zero variability?
does_vary(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Logical. TRUE if there is variability
summarise_all(diamonds, does_vary) %>% as.logical()
Method to create datatables
dtab(object, ...)
object |
Object of relevant class to render |
... |
Additional arguments |
See dtab.data.frame to create an interactive table from a data.frame
See dtab.explore to create an interactive table from an explore object
See dtab.pivotr to create an interactive table from a pivotr object
Create an interactive table to view, search, sort, and filter data
## S3 method for class 'data.frame'
dtab( object, vars = "", filt = "", arr = "", rows = NULL, nr = NULL, na.rm = FALSE, dec = 3, perc = "", filter = "top", pageLength = 10, dom = "", style = "bootstrap4", rownames = FALSE, caption = NULL, envir = parent.frame(), ... )
object |
Data.frame to display |
vars |
Variables to show (default is all) |
filt |
Filter to apply to the specified dataset. For example "price > 10000" if dataset is "diamonds" (default is "") |
arr |
Expression to arrange (sort) the data on (e.g., "color, desc(price)") |
rows |
Select rows in the specified dataset. For example "1:10" for the first 10 rows or "n()-10:n()" for the last 10 rows (default is NULL) |
nr |
Number of rows of data to include in the table. This function will be mainly used in reports so it is best to keep this number small |
na.rm |
Remove rows with missing values (default is FALSE) |
dec |
Number of decimal places to show (default is 3; NULL for no rounding) |
perc |
Vector of column names to be displayed as a percentage |
filter |
Show column filters in DT table. Options are "none", "top", "bottom" |
pageLength |
Number of rows to show in table |
dom |
Table control elements to show on the page. See https://datatables.net/reference/option/dom |
style |
Table formatting style ("bootstrap" or "default") |
rownames |
Show data.frame rownames. Default is FALSE |
caption |
Table caption |
envir |
Environment to extract data from |
... |
Additional arguments |
View, search, sort, and filter a data.frame. For styling options see https://rstudio.github.io/DT/functions.html
## Not run:
dtab(mtcars)
## End(Not run)
Make an interactive table of summary statistics
## S3 method for class 'explore'
dtab( object, dec = 3, searchCols = NULL, order = NULL, pageLength = NULL, caption = NULL, ... )
object |
Return value from explore |
dec |
Number of decimals to show |
searchCols |
Column search and filter |
order |
Column sorting |
pageLength |
Page length |
caption |
Table caption |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/explore.html for an example in Radiant
See pivotr to create a pivot table
See summary.pivotr to show summaries
## Not run:
tab <- explore(diamonds, "price:x") %>% dtab()
tab <- explore(diamonds, "price", byvar = "cut", fun = c("n_obs", "skew"), top = "byvar") %>% dtab()
## End(Not run)
Make an interactive pivot table
## S3 method for class 'pivotr'
dtab( object, format = "none", perc = FALSE, dec = 3, searchCols = NULL, order = NULL, pageLength = NULL, caption = NULL, ... )
object |
Return value from pivotr |
format |
Show Color bar ("color_bar"), Heat map ("heat"), or None ("none") |
perc |
Display numbers as percentages (TRUE or FALSE) |
dec |
Number of decimals to show |
searchCols |
Column search and filter |
order |
Column sorting |
pageLength |
Page length |
caption |
Table caption |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/pivotr.html for an example in Radiant
See pivotr to create the pivot table
See summary.pivotr to print the table
## Not run:
pivotr(diamonds, cvars = "cut") %>% dtab()
pivotr(diamonds, cvars = c("cut", "clarity")) %>% dtab(format = "color_bar")
pivotr(diamonds, cvars = c("cut", "clarity"), normalize = "total") %>% dtab(format = "color_bar", perc = TRUE)
## End(Not run)
Convert categorical variables to factors and deal with empty/missing values
empty_level(x)
x |
Categorical variable used in table |
Variable with updated levels
Explore and summarize data
explore( dataset, vars = "", byvar = "", fun = c("mean", "sd"), top = "fun", tabfilt = "", tabsort = "", tabslice = "", nr = Inf, data_filter = "", arr = "", rows = NULL, envir = parent.frame() )
dataset |
Dataset to explore |
vars |
(Numeric) variables to summarize |
byvar |
Variable(s) to group data by |
fun |
Functions to use for summarizing |
top |
Use functions ("fun"), variables ("vars"), or group-by variables ("byvar") as column headers |
tabfilt |
Expression used to filter the table (e.g., "Total > 10000") |
tabsort |
Expression used to sort the table (e.g., "desc(Total)") |
tabslice |
Expression used to slice the table, i.e., to select rows by position (e.g., "1:5") |
nr |
Number of rows to display |
data_filter |
Expression used to filter the dataset before creating the table (e.g., "price > 10000") |
arr |
Expression to arrange (sort) the data on (e.g., "color, desc(price)") |
rows |
Rows to select from the specified dataset |
envir |
Environment to extract data from |
See https://radiant-rstats.github.io/docs/data/explore.html for an example in Radiant
A list of all variables defined in the function as an object of class explore
See summary.explore to show summaries
explore(diamonds, c("price", "carat")) %>% str()
explore(diamonds, "price:x")$tab
explore(diamonds, c("price", "carat"), byvar = "cut", fun = c("n_missing", "skew"))$tab
Filter data with user-specified expression
filter_data(dataset, filt = "", drop = TRUE)
dataset |
Data frame to filter |
filt |
Filter expression to apply to the specified dataset |
drop |
Drop unused factor levels after filtering (default is TRUE) |
Filters can be used to view a sample from a selected dataset. For example, runif(nrow(.)) > .9 could be used to sample approximately 10% of the rows in the dataset
Filtered data frame
select(diamonds, 1:3) %>% filter_data(filt = "price > max(.$price) - 100")
select(diamonds, 1:3) %>% filter_data(filt = "runif(nrow(.)) > .995")
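A filter such as runif(nrow(.)) > .9 keeps each row independently with probability 0.1, so the size of the sample varies around one tenth of the data. The base R sketch below shows the same idea without filter_data; the data frame dat is a hypothetical stand-in and the seed is set only to make the draw repeatable.

```r
set.seed(1234)

# Hypothetical dataset with 10,000 rows
dat <- data.frame(id = 1:10000)

# One uniform draw per row; rows with a draw above 0.9 are kept
keep <- runif(nrow(dat)) > .9
sampled <- dat[keep, , drop = FALSE]

# Share of rows retained, close to but usually not exactly 0.1
nrow(sampled) / nrow(dat)
```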
Find Dropbox folder
find_dropbox(account = 1)
account |
Integer. If multiple accounts exist, specify which one to use. By default, the first account listed is used |
Find the path for Dropbox if available
Path to Dropbox account
Find Google Drive folder
find_gdrive()
Find the path for Google Drive if available
Path to Google Drive folder
Find user directory
find_home()
Returns /Users/x and not /Users/x/Documents
Find the Rstudio project folder
find_project(mess = TRUE)
mess |
Show or hide messages (default mess = TRUE) |
Find the path for the Rstudio project folder if available. The returned path is normalized (see normalizePath)
Path to Rstudio project folder if available or else an empty string. The returned path is normalized
Ensure column names are valid
fix_names(x, lower = FALSE)
x |
Data.frame or vector of (column) names |
lower |
Set letters to lower case (TRUE or FALSE) |
Remove symbols, trailing and leading spaces, and convert to valid R column names. Opinionated version of make.names
fix_names(c(" var-name ", "$amount spent", "100"))
Replace smart quotes etc.
fix_smart(text, all = FALSE)
text |
Text to be parsed |
all |
Should all non-ascii characters be removed? Default is FALSE |
Flip the DT table to put Function, Variable, or Group by on top
flip(expl, top = "fun")
expl |
Return value from explore |
top |
The variable (type) to display at the top of the table ("fun" for Function, "var" for Variable, and "byvar" for Group by). "fun" is the default |
See https://radiant-rstats.github.io/docs/data/explore.html for an example in Radiant
See explore to calculate summaries
See summary.explore to show summaries
See dtab.explore to create the DT table
explore(diamonds, "price:x", top = "var") %>% summary()
explore(diamonds, "price", byvar = "cut", fun = c("n_obs", "skew"), top = "byvar") %>% summary()
Format a data.frame with a specified number of decimal places
format_df(tbl, dec = NULL, perc = FALSE, mark = "", na.rm = FALSE, ...)
tbl |
Data.frame |
dec |
Number of decimals to show |
perc |
Display numbers as percentages (TRUE or FALSE) |
mark |
Thousand separator |
na.rm |
Remove missing values |
... |
Additional arguments for format_nr |
Data.frame for printing
data.frame(x = c("a", "b"), y = c(1L, 2L), z = c(-0.0005, 3)) %>% format_df(dec = 4)
data.frame(x = c(1L, 2L), y = c(0.06, 0.8)) %>% format_df(dec = 2, perc = TRUE)
data.frame(x = c(1L, 2L, NA), y = c(NA, 1.008, 2.8)) %>% format_df(dec = 2)
Format a number with a specified number of decimal places, thousand sep, and a symbol
format_nr(x, sym = "", dec = 2, perc = FALSE, mark = ",", na.rm = TRUE, ...)
x |
Number or vector |
sym |
Symbol to use |
dec |
Number of decimals to show |
perc |
Display number as a percentage |
mark |
Thousand separator |
na.rm |
Remove missing values |
... |
Additional arguments passed to |
Character (vector) in the desired format
format_nr(2000, "$")
format_nr(2000, dec = 4)
format_nr(.05, perc = TRUE)
format_nr(c(.1, .99), perc = TRUE)
format_nr(data.frame(a = c(.1, .99)), perc = TRUE)
format_nr(data.frame(a = 1:10), sym = "$", dec = 0)
format_nr(c(1, 1.9, 1.008, 1.00))
format_nr(c(1, 1.9, 1.008, 1.00), drop0trailing = TRUE)
format_nr(NA)
format_nr(NULL)
Get variable class
get_class(dat)
dat |
Dataset to evaluate |
Get variable class information for each column in a data.frame
Vector with class information for each variable
get_class(mtcars)
Select variables and filter data
get_data( dataset, vars = "", filt = "", arr = "", rows = NULL, data_view_rows = NULL, na.rm = TRUE, rev = FALSE, envir = c() )
dataset |
Dataset or name of the data.frame |
vars |
Variables to extract from the data.frame |
filt |
Filter to apply to the specified dataset |
arr |
Expression to use to arrange (sort) the specified dataset |
rows |
Select rows in the specified dataset |
data_view_rows |
Vector of rows to select. Only used by Data > View in Radiant. Users should use "rows" instead |
na.rm |
Remove rows with missing values (default is TRUE) |
rev |
Reverse filter and row selection (i.e., get the remainder) |
envir |
Environment to extract data from |
Function is used in radiant to select variables and filter data based on user input in string form
Data.frame with specified columns and rows
get_data(mtcars, vars = "cyl:vs", filt = "mpg > 25")
get_data(mtcars, vars = c("mpg", "cyl"), rows = 1:10)
get_data(mtcars, vars = c("mpg", "cyl"), arr = "desc(mpg)", rows = "1:5")
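For readers more familiar with base R, the first example above corresponds roughly to a call to subset. This is a sketch of the equivalent selection only; it does not reproduce get_data's other behavior, such as the default removal of rows with missing values (mtcars has none).

```r
# Rough base R counterpart of
# get_data(mtcars, vars = "cyl:vs", filt = "mpg > 25")
res <- subset(mtcars, mpg > 25, select = cyl:vs)

# Number of rows and columns that remain
dim(res)
```

The string arguments in get_data exist because Radiant builds these calls from user input in the browser; in an interactive session the unquoted subset form is often more convenient.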
Create data.frame summary
get_summary(dataset, dc = get_class(dataset), dec = 3)
dataset |
Data.frame |
dc |
Class for each variable |
dec |
Number of decimals to show |
Used in Radiant's Data > Transform tab
Workaround to avoid (harmless) messages from ggplotly
ggplotly(...)
... |
Arguments to pass to the ggplotly function in the plotly package |
See the ggplotly function in the plotly package for details (?plotly::ggplotly)
Find index corrected for missing values and filters
indexr(dataset, vars = "", filt = "", arr = "", rows = NULL, cmd = "")
dataset |
Dataset |
vars |
Variables to select |
filt |
Data filter |
arr |
Expression to arrange (sort) the data on (e.g., "color, desc(price)") |
rows |
Selected rows |
cmd |
A command used to customize the data |
Install webshot and phantomjs
install_webshot()
Calculate inverse of a variable
inverse(x)
x |
Input variable |
1/x
Is input a double (and not a date type)?
is_double(x)
x |
Input |
TRUE if double and not a type of date, else FALSE
Convenience function for is.null or is.na
is_not(x)
x |
Input |
is_not(NA)
is_not(NULL)
is_not(c())
is_not(list())
is_not(data.frame())
Is input a string?
is_string(x)
x |
Input |
TRUE if string, else FALSE
is_string(" ")
is_string("data")
is_string(c("data", ""))
is_string(NULL)
is_string(NA)
Is a variable empty
is.empty(x, empty = "\\s*")
x |
Character value to evaluate |
empty |
Regular expression that indicates what 'empty' means. The default ("\\s*") matches an empty string or a string that contains only whitespace |
Is a variable empty
TRUE if empty, else FALSE
is.empty("")
is.empty(NULL)
is.empty(NA)
is.empty(c())
is.empty("none", empty = "none")
is.empty("")
is.empty(" ")
is.empty(" something ")
is.empty(c("", "something"))
is.empty(c(NA, 1:100))
is.empty(mtcars)
Create a vector of interaction terms for linear and logistic regression
iterms(vars, nway = 2, sep = ":")
vars |
Labels to use |
nway |
2-way (2) or 3-way (3) interaction labels to create |
sep |
Separator to use between variable names (e.g., :) |
Character vector of interaction term labels
paste0("var", 1:3) %>% iterms(2)
paste0("var", 1:3) %>% iterms(3)
paste0("var", 1:3) %>% iterms(2, sep = ".")
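Interaction labels of a given order are the combinations of the variable names taken that many at a time. The base R sketch below builds them with combn; it only generates terms of exactly the requested order, which may differ from what iterms returns for nway = 3.

```r
vars <- paste0("var", 1:3)

# All 2-way combinations, joined with ":"
two_way <- combn(vars, 2, FUN = paste, collapse = ":")

# The single 3-way combination of three variables
three_way <- combn(vars, 3, FUN = paste, collapse = ":")

# A different separator only changes how the names are joined
two_way_dot <- combn(vars, 2, FUN = paste, collapse = ".")
```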
Launch radiant apps
launch(package = "radiant.data", run = "viewer", state, ...)
package |
Radiant package to start. One of "radiant.data", "radiant.design", "radiant.basics", "radiant.model", "radiant.multivariate", or "radiant" |
run |
Run a radiant app in an external browser ("browser"), an Rstudio window ("window"), or in the Rstudio viewer ("viewer") |
state |
Path to statefile to load |
... |
additional arguments to pass to shiny::runApp (e.g, port = 8080) |
See https://radiant-rstats.github.io/docs/ for radiant documentation and tutorials
## Not run:
launch()
launch(run = "viewer")
launch(run = "window")
launch(run = "browser")
## End(Not run)
Generate list of levels and unique values
level_list(dataset, ...)
dataset |
A data.frame |
... |
Unquoted variable names to evaluate |
data.frame(a = c(rep("a", 5), rep("b", 5)), b = c(rep(1, 5), 6:10)) %>% level_list()
level_list(mtcars, mpg, cyl)
Natural log
ln(x, na.rm = TRUE)
x |
Input variable |
na.rm |
Remove missing values (default is TRUE) |
Natural log of vector
ln(runif(10, 1, 2))
Load data through clipboard on Windows or macOS
load_clip(delim = "\t", text, suppress = TRUE)
delim |
Delimiter to use (tab is the default) |
text |
Text input to convert to table |
suppress |
Suppress warnings |
Extract data from the clipboard into a data.frame on Windows or macOS
See also save_clip
Generate arrange commands from user input
make_arrange_cmd(expr, dataset = "")
expr |
Expression used to arrange rows from the specified dataset |
dataset |
String with dataset name |
Form arrange command from user input
Arrange command
Generate a variable used to select a training sample
make_train(n = 0.7, nr = NULL, blocks = NULL, seed = 1234)
n |
Number (or fraction) of observations to label as training |
nr |
Number of rows in the dataset |
blocks |
A vector to use for blocking or a data.frame from which to construct a blocking vector |
seed |
Random seed |
0/1 variables for filtering
make_train(.5, 10)
make_train(.5, 10) %>% table()
make_train(100, 1000) %>% table()
make_train(.15, blocks = mtcars$vs) %>% table() / nrow(mtcars)
make_train(.10, blocks = iris$Species) %>% table() / nrow(iris)
make_train(.5, blocks = iris[, c("Petal.Width", "Species")]) %>% table()
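The idea of a 0/1 training variable can be sketched in base R as follows. The helper `train_flag` is hypothetical and ignores blocking; it only assumes, as the argument description suggests, that a value below 1 is read as a fraction of rows and anything else as a count.

```r
# Hypothetical sketch of a 0/1 training indicator (no blocking support)
train_flag <- function(n = 0.7, nr = 100, seed = 1234) {
  set.seed(seed)
  # Read n < 1 as a fraction of rows, otherwise as a number of rows
  n_train <- if (n < 1) round(n * nr) else n
  flag <- rep(0L, nr)
  flag[sample(seq_len(nr), n_train)] <- 1L
  flag
}

flag <- train_flag(0.5, 10)
table(flag)
```

The resulting vector can be used as a filter, for example to keep rows with `flag == 1` for estimation and the remaining rows for validation.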
Convert a string of numbers into a vector
make_vec(x)
x |
A string of numbers that may include fractions |
make_vec("1 2 4")
make_vec("1/2 2/3 4/5")
make_vec(0.1)
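As an illustration of what converting such a string involves, the sketch below splits on whitespace and evaluates simple fractions. `parse_numbers` is a hypothetical helper written for this example; the package may parse its input differently.

```r
# Hypothetical parser: whitespace-separated numbers, optionally written a/b
parse_numbers <- function(x) {
  parts <- strsplit(trimws(as.character(x)), "\\s+")[[1]]
  vapply(parts, function(p) {
    frac <- as.numeric(strsplit(p, "/", fixed = TRUE)[[1]])
    # Two pieces means numerator/denominator, one piece is a plain number
    if (length(frac) == 2) frac[1] / frac[2] else frac[1]
  }, numeric(1), USE.NAMES = FALSE)
}

whole <- parse_numbers("1 2 4")
fractions <- parse_numbers("1/2 2/3 4/5")
single <- parse_numbers(0.1)
```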
Margin of error
me(x, conf_lev = 0.95, na.rm = TRUE)
x |
Input variable |
conf_lev |
Confidence level. The default is 0.95 |
na.rm |
If TRUE missing values are removed before calculation |
Margin of error
me(rnorm(100))
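For readers who want to see the arithmetic, a margin of error for a mean is commonly computed as the standard error times a critical value. The sketch below assumes a t-distribution critical value with n - 1 degrees of freedom; `margin_of_error` is a hypothetical name and the package's own calculation may differ.

```r
# Hypothetical sketch: standard error times a t critical value
margin_of_error <- function(x, conf_lev = 0.95, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  std_err <- sd(x) / sqrt(n)
  # Two-sided interval: put (1 - conf_lev) / 2 in each tail
  std_err * qt(conf_lev / 2 + 0.5, df = n - 1)
}

moe <- margin_of_error(c(1, 2, 3, 4, 5, NA))
```

The interval `mean(x) - moe` to `mean(x) + moe` is then the usual confidence interval for the mean under these assumptions.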
Margin of error for proportion
meprop(x, conf_lev = 0.95, na.rm = TRUE)
x |
Input variable |
conf_lev |
Confidence level. The default is 0.95 |
na.rm |
If TRUE missing values are removed before calculation |
Margin of error
meprop(c(rep(1L, 10), rep(0L, 10)))
Calculate the mode (modal value) and return a label
modal(x, na.rm = TRUE)
x |
A vector |
na.rm |
If TRUE missing values are removed before calculation |
From https://www.tutorialspoint.com/r/r_mean_median_mode.htm
modal(c("a", "b", "b"))
modal(c(1:10, 5))
modal(as.factor(c(letters, "b")))
modal(runif(100) > 0.5)
Add ordered argument to lubridate::month
month(x, label = FALSE, abbr = TRUE, ordered = FALSE)
x |
Input date vector |
label |
Month as label (TRUE, FALSE) |
abbr |
Abbreviate label (TRUE, FALSE) |
ordered |
Order factor (TRUE, FALSE) |
See the month function in the lubridate package for additional details
Add transformed variables to a data frame with the option to include a custom variable name extension
mutate_ext(.tbl, .funs, ..., .ext = "", .vars = c())
.tbl |
Data frame to add transformed variables to |
.funs |
Function(s) to apply (e.g., log) |
... |
Variables to transform |
.ext |
Extension to add for each variable |
.vars |
A list of columns generated by dplyr::vars(), or a character vector of column names, or a numeric vector of column positions. |
Wrapper for dplyr::mutate_at that allows custom variable name extensions
mutate_ext(mtcars, .funs = log, mpg, cyl, .ext = "_ln")
mutate_ext(mtcars, .funs = log, .ext = "_ln")
mutate_ext(mtcars, .funs = log)
mutate_ext(mtcars, .funs = log, .ext = "_ln", .vars = vars(mpg, cyl))
Number of missing values
n_missing(x, ...)
x |
Input variable |
... |
Additional arguments |
Number of missing values
n_missing(c("a", "b", NA))
Number of observations
n_obs(x, ...)
x |
Input variable |
... |
Additional arguments |
Number of observations
n_obs(c("a", "b", NA))
Normalize a variable x by a variable y
normalize(x, y)
x |
Input variable |
y |
Normalizing variable |
x/y
Calculate percentiles
p01(x, na.rm = TRUE)
p025(x, na.rm = TRUE)
p05(x, na.rm = TRUE)
p10(x, na.rm = TRUE)
p25(x, na.rm = TRUE)
p75(x, na.rm = TRUE)
p90(x, na.rm = TRUE)
p95(x, na.rm = TRUE)
p975(x, na.rm = TRUE)
p99(x, na.rm = TRUE)
x |
Numeric vector |
na.rm |
If TRUE missing values are removed before calculation |
p01(0:100)
Parse file path into useful components
parse_path(path, chr = "", pdir = getwd(), mess = TRUE)
path |
Path to be parsed |
chr |
Character to wrap around path for display |
pdir |
Project directory if available |
mess |
Print messages if Dropbox or Google Drive not found |
Parse file path into useful components (i.e., file name, file extension, relative path, etc.)
list.files(".", full.names = TRUE)[1] %>% parse_path()
Summarize a set of numeric vectors per row
pfun(..., fun, na.rm = TRUE)
psum(..., na.rm = TRUE)
pmean(..., na.rm = TRUE)
pmedian(..., na.rm = TRUE)
psd(..., na.rm = TRUE)
pvar(..., na.rm = TRUE)
pcv(..., na.rm = TRUE)
pp01(..., na.rm = TRUE)
pp025(..., na.rm = TRUE)
pp05(..., na.rm = TRUE)
pp10(..., na.rm = TRUE)
pp25(..., na.rm = TRUE)
pp75(..., na.rm = TRUE)
pp95(..., na.rm = TRUE)
pp975(..., na.rm = TRUE)
pp99(..., na.rm = TRUE)
... |
Numeric vectors of the same length |
fun |
Function to apply |
na.rm |
a logical indicating whether missing values should be removed. |
Calculate summary statistics of the input vectors per row (or 'parallel')
A vector of 'parallel' summaries of the argument vectors.
pfun(1:10, fun = mean)
psum(1:10, 10:1)
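A 'parallel' summary treats the input vectors as columns and summarizes across them, one row at a time. The base-R sketch below shows the idea with a hypothetical helper, `row_summary`; it is not the package's implementation.

```r
# Hypothetical sketch: bind the vectors as columns, then summarize each row
row_summary <- function(..., fun, na.rm = TRUE) {
  mat <- cbind(...)
  apply(mat, 1, fun, na.rm = na.rm)
}

row_sums <- row_summary(1:10, 10:1, fun = sum)
row_means <- row_summary(c(1, NA), c(3, 5), fun = mean)
```

With `na.rm = TRUE` a missing value in one vector does not make the whole row summary missing, which is why the second row of `row_means` is computed from the single available value.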
Create a pivot table
pivotr( dataset, cvars = "", nvar = "None", fun = "mean", normalize = "None", tabfilt = "", tabsort = "", tabslice = "", nr = Inf, data_filter = "", arr = "", rows = NULL, envir = parent.frame() )
dataset |
Dataset to tabulate |
cvars |
Categorical variables |
nvar |
Numerical variable |
fun |
Function to apply to numerical variable |
normalize |
Normalize the table by row total, column totals, or overall total |
tabfilt |
Expression used to filter the table (e.g., "Total > 10000") |
tabsort |
Expression used to sort the table (e.g., "desc(Total)") |
tabslice |
Expression used to filter table (e.g., "1:5") |
nr |
Number of rows to display |
data_filter |
Expression used to filter the dataset before creating the table (e.g., "price > 10000") |
arr |
Expression to arrange (sort) the data on (e.g., "color, desc(price)") |
rows |
Rows to select from the specified dataset |
envir |
Environment to extract data from |
Create a pivot-table. See https://radiant-rstats.github.io/docs/data/pivotr.html for an example in Radiant
pivotr(diamonds, cvars = "cut") %>% str()
pivotr(diamonds, cvars = "cut")$tab
pivotr(diamonds, cvars = c("cut", "clarity", "color"))$tab
pivotr(diamonds, cvars = "cut:clarity", nvar = "price")$tab
pivotr(diamonds, cvars = "cut", nvar = "price")$tab
pivotr(diamonds, cvars = "cut", normalize = "total")$tab
Plot method for the pivotr function
## S3 method for class 'pivotr'
plot(x, type = "dodge", perc = FALSE, flip = FALSE, fillcol = "blue", opacity = 0.5, ...)
x |
Return value from |
type |
Plot type to use ("fill" or "dodge" (default)) |
perc |
Use percentage on the y-axis |
flip |
Flip the axes in a plot (FALSE or TRUE) |
fillcol |
Fill color for bar-plot when only one categorical variable has been selected (default is "blue") |
opacity |
Opacity for plot elements (0 to 1) |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/pivotr for an example in Radiant
pivotr to generate summaries
summary.pivotr to show summaries
pivotr(diamonds, cvars = "cut") %>% plot()
pivotr(diamonds, cvars = c("cut", "clarity")) %>% plot()
pivotr(diamonds, cvars = c("cut", "clarity", "color")) %>% plot()
Calculate proportion
prop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Proportion of first level for a factor and of the maximum value for numeric
prop(c(rep(1L, 10), rep(0L, 10)))
prop(c(rep(4, 10), rep(2, 10)))
prop(rep(0, 10))
prop(factor(c(rep("a", 20), rep("b", 10))))
Comic publishers
data(publishers)
A data frame with 3 rows and 2 variables
List of comic publishers from https://stat545.com/join-cheatsheet.html. The dataset is used to illustrate data merging / joining. Description provided in attr(publishers,"description")
Create a qscatter plot similar to Stata
qscatter(dataset, xvar, yvar, lev = "", fun = "mean", bins = 20)
dataset |
Data to plot (data.frame or tibble) |
xvar |
Character indicating the variable to display along the X-axis of the plot |
yvar |
Character indicating the variable to display along the Y-axis of the plot |
lev |
Level in yvar to use if yvar is of type character or factor. If lev is empty then the first level is used |
fun |
Summary measure to apply to both the x and y variable |
bins |
Number of bins to use |
qscatter(diamonds, "price", "carat")
qscatter(titanic, "age", "survived")
Create a vector of quadratic and cubed terms for use in linear and logistic regression
qterms(vars, nway = 2)
vars |
Variable labels to use |
nway |
quadratic (2) or cubic (3) term labels to create |
Character vector of (regression) term labels
qterms(c("a", "b"), 3)
qterms(c("a", "b"), 2)
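Regression formulas in R express powers with `I()`, so quadratic and cubic term labels can be generated with a little string pasting. The sketch below uses a hypothetical helper, `power_terms`, and assumes labels of the form `I(a^2)`; the exact labels and ordering returned by the package should be checked against its output.

```r
# Hypothetical sketch: build I(var^p) labels for p = 2..nway
power_terms <- function(vars, nway = 2) {
  as.vector(sapply(2:nway, function(p) paste0("I(", vars, "^", p, ")")))
}

cubic <- power_terms(c("a", "b"), 3)
quadratic <- power_terms(c("a", "b"), 2)

# Labels of this form can be pasted straight into a model formula
form <- as.formula(paste("y ~ a + b +", paste(quadratic, collapse = " + ")))
```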
Launch the radiant.data app in the default web browser
radiant.data(state, ...)
state |
Path to statefile to load |
... |
additional arguments to pass to shiny::runApp (e.g., port = 8080) |
## Not run:
radiant.data()
radiant.data("https://github.com/radiant-rstats/docs/raw/gh-pages/examples/demo-dvd-rnd.state.rda")
radiant.data("viewer")
## End(Not run)
Start radiant.data app but do not open a browser
radiant.data_url(state, ...)
state |
Path to statefile to load |
... |
additional arguments to pass to shiny::runApp (e.g., port = 8080) |
## Not run:
radiant.data_url()
## End(Not run)
Launch the radiant.data app in the RStudio viewer
radiant.data_viewer(state, ...)
state |
Path to statefile to load |
... |
additional arguments to pass to shiny::runApp (e.g., port = 8080) |
## Not run:
radiant.data_viewer()
## End(Not run)
Launch the radiant.data app in an RStudio window
radiant.data_window(state, ...)
state |
Path to statefile to load |
... |
additional arguments to pass to shiny::runApp (e.g., port = 8080) |
## Not run:
radiant.data_window()
## End(Not run)
These functions are provided for compatibility with previous versions of radiant but will be removed
mean_rm(...)
... |
Parameters to be passed to the updated functions |
Replace mean_rm by mean
Replace median_rm by median
Replace min_rm by min
Replace max_rm by max
Replace sd_rm by sd
Replace var_rm by var
Replace sum_rm by sum
Replace getdata by get_data
Replace filterdata by filter_data
Replace combinedata by combine_data
Replace viewdata by view_data
Replace toFct by to_fct
Replace fixMS by fix_smart
Replace rounddf by round_df
Replace formatdf by format_df
Replace formatnr by format_nr
Replace getclass by get_class
Replace is_numeric by is_double
Replace is_empty by is.empty
Generate code to read a file
read_files( path, pdir = "", type = "rmd", to = "", clipboard = TRUE, radiant = FALSE )
path |
Path to file. If empty, a file browser will be opened |
pdir |
Project dir |
type |
Generate code for _Report > Rmd_ ("rmd") or _Report > R_ ("r") |
to |
Name to use for object. If empty, will use file name to derive an object name |
clipboard |
Return code to clipboard (not available on Linux) |
radiant |
Should returned code be formatted for use with other code generated by Radiant? |
Return code to read a file at the specified path. Will open a file browser if no path is provided
if (interactive()) { read_files(clipboard = FALSE) }
Remove/reorder levels
refactor(x, levs = levels(x), repl = NA)
x |
Character or Factor |
levs |
Set of levels to use |
repl |
String (or NA) used to replace missing levels |
Keep only a specific set of levels in a factor. By removing levels the base for comparison in, e.g., regression analysis, becomes the first level. To relabel the base use, for example, repl = 'other'
refactor(diamonds$cut, c("Premium", "Ideal")) %>% head()
refactor(diamonds$cut, c("Premium", "Ideal"), "Other") %>% head()
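The effect of keeping a subset of levels can be sketched in base R. The helper `keep_levels` is hypothetical and is not the package's code; it assumes that the replacement label, when given, is placed first so that it becomes the base level, as the description above suggests.

```r
# Hypothetical sketch: keep only `levs`, replace everything else by `repl`
keep_levels <- function(x, levs, repl = NA) {
  x <- as.character(x)
  x[!x %in% levs] <- repl
  # Assumption: a non-missing replacement label becomes the first (base) level
  if (!is.na(repl)) levs <- unique(c(repl, levs))
  factor(x, levels = levs)
}

cut_quality <- c("Fair", "Ideal", "Premium")
dropped <- keep_levels(cut_quality, c("Premium", "Ideal"))
relabeled <- keep_levels(cut_quality, c("Premium", "Ideal"), "Other")
```

In `dropped` the value "Fair" becomes missing, whereas in `relabeled` it is kept under the label "Other".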
Register a data.frame or list in Radiant
register( new, org = "", descr = "", shiny = shiny::getDefaultReactiveDomain(), envir = r_data )
new |
String containing the name of the data.frame to register |
org |
Name of the original data.frame if a (working) copy is being made |
descr |
Data description in markdown format |
shiny |
Check if function is called from a shiny application |
envir |
Environment to assign data to |
See also add_description to add a description in markdown format to a data.frame
Base method used to render htmlwidgets
render(object, ...)
object |
Object of relevant class to render |
... |
Additional arguments |
Method to render DT tables
## S3 method for class 'datatables'
render(object, shiny = shiny::getDefaultReactiveDomain(), ...)
object |
DT table |
shiny |
Check if function is called from a shiny application |
... |
Additional arguments |
Method to render plotly plots
## S3 method for class 'plotly'
render(object, shiny = shiny::getDefaultReactiveDomain(), ...)
object |
plotly object |
shiny |
Check if function is called from a shiny application |
... |
Additional arguments |
Round doubles in a data.frame to a specified number of decimal places
round_df(tbl, dec = 3)
tbl |
Data frame |
dec |
Number of decimals to show |
Data frame with rounded doubles
data.frame(x = as.factor(c("a", "b")), y = c(1L, 2L), z = c(-0.0005, 3.1)) %>% round_df(dec = 2)
Save data to clipboard on Windows or macOS
save_clip(dataset)
dataset |
Dataset to save to clipboard |
Save a data.frame or tibble to the clipboard on Windows or macOS
See also load_clip
Standard deviation for the population
sdpop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Standard deviation for the population
sdpop(rnorm(100))
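The population versions of the variance and standard deviation divide by n rather than n - 1, which is what base R's `var` and `sd` use. A minimal sketch of that textbook definition, with hypothetical helper names:

```r
# Hypothetical sketch: population variance divides by n, not n - 1
var_pop <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  mean((x - mean(x))^2)
}
sd_pop <- function(x, na.rm = TRUE) sqrt(var_pop(x, na.rm))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
pop_var <- var_pop(x)
pop_sd <- sd_pop(x)
sample_sd <- sd(x) # larger, because it divides by n - 1
```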
Standard deviation for proportion
sdprop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Standard deviation for proportion
sdprop(c(rep(1L, 10), rep(0L, 10)))
Standard error
se(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Standard error
se(rnorm(100))
Search for a pattern in all columns of a data.frame
search_data(dataset, pattern, ignore.case = TRUE, fixed = FALSE)
dataset |
Data.frame to search |
pattern |
String to match |
ignore.case |
Ignore case when matching (default is TRUE, so the search is not case sensitive) |
fixed |
If TRUE, match the pattern as a literal string instead of a regular expression (default is FALSE) |
See grepl for a detailed description of the function arguments
publishers %>% filter(search_data(., "^m"))
Standard error for proportion
seprop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Standard error for proportion
seprop(c(rep(1L, 10), rep(0L, 10)))
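The variance, standard deviation, and standard error of a proportion all follow from p(1 - p). The sketch below uses the textbook formulas with hypothetical helper names, and assumes a numeric 0/1 input where the proportion is the share of values equal to the maximum; the package's own functions may handle other input types differently.

```r
# Hypothetical sketch for numeric input: p is the share of the maximum value
prop_of_max <- function(x) mean(x == max(x))

var_prop <- function(x) {
  p <- prop_of_max(x)
  p * (1 - p)
}
sd_prop <- function(x) sqrt(var_prop(x))
se_prop <- function(x) sqrt(var_prop(x) / length(x))

x <- c(rep(1L, 10), rep(0L, 10))
p_var <- var_prop(x)
p_sd <- sd_prop(x)
p_se <- se_prop(x)
```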
Alias used to add an attribute
set_attr(x, which, value)
x |
Object |
which |
Attribute name |
value |
Value to set |
foo <- data.frame(price = 1:5) %>% set_attr("description", "price set in experiment ...")
Show all rows with duplicated values (not just the first or last)
show_duplicated(.tbl, ...)
.tbl |
Data frame to check for duplicated rows |
... |
Variables used to evaluate row uniqueness |
If an entire row is duplicated use "duplicated" to show only one of the duplicated rows. When using a subset of variables to establish uniqueness it may be of interest to show all rows that have (some) duplicate elements
bind_rows(mtcars, mtcars[c(1, 5, 7), ]) %>% show_duplicated(mpg, cyl)
bind_rows(mtcars, mtcars[c(1, 5, 7), ]) %>% show_duplicated()
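The difference from base R's `duplicated`, which flags only the second and later occurrences, can be shown with a short sketch. `all_duplicates` is a hypothetical helper that scans from both ends so that every member of a duplicated group is returned; it is not the package's implementation.

```r
# Hypothetical sketch: flag every row whose key occurs more than once
all_duplicates <- function(tbl, vars = names(tbl)) {
  key <- tbl[, vars, drop = FALSE]
  # Scan forwards and backwards so the first occurrence is flagged too
  dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  tbl[dup, , drop = FALSE]
}

df <- data.frame(id = 1:5, grp = c("a", "b", "a", "c", "b"))
dups <- all_duplicates(df, "grp")
```

Here rows 1, 2, 3, and 5 are returned, because both "a" and "b" occur twice in `grp`.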
Add stars based on p.values
sig_stars(pval)
pval |
Vector of p-values |
A vector of stars
sig_stars(c(.0009, .049, .009, .4, .09))
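Mapping p-values to stars is a matter of binning. The sketch below assumes the conventional cutoffs used in R regression output (0.001, 0.01, 0.05, and 0.1); both the cutoffs and the helper name `stars_sketch` are assumptions rather than something stated in this manual.

```r
# Hypothetical sketch with assumed cutoffs at 0.001, 0.01, 0.05, and 0.1
stars_sketch <- function(pval) {
  as.character(cut(
    pval,
    breaks = c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
    labels = c("***", "**", "*", ".", ""),
    right = FALSE # intervals are closed on the left: [0.01, 0.05) and so on
  ))
}

stars <- stars_sketch(c(.0009, .049, .009, .4, .09))
```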
Slice data with user-specified expression
slice_data(dataset, expr = NULL, drop = TRUE)
dataset |
Data frame to slice |
expr |
Expression used to select rows from the specified dataset |
drop |
Drop unused factor levels after filtering (default is TRUE) |
Select only a slice of the data to work with
Sliced data frame
Calculate square of a variable
square(x)
x |
Input variable |
x^2
Hide warnings and messages and return invisible
sshh(...)
... |
Inputs to keep quiet |
Hide warnings and messages and return invisible
sshh(library(dplyr))
Hide warnings and messages and return result
sshhr(...)
... |
Inputs to keep quiet |
Hide warnings and messages and return result
sshhr(library(dplyr))
Standardize
standardize(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
If x is a numeric variable return (x - mean(x)) / sd(x)
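The formula above translates directly into base R. The following sketch uses a hypothetical name, `standardize_sketch`, and covers only the numeric case described in the text.

```r
# Sketch of (x - mean(x)) / sd(x) for a numeric vector
standardize_sketch <- function(x, na.rm = TRUE) {
  (x - mean(x, na.rm = na.rm)) / sd(x, na.rm = na.rm)
}

z <- standardize_sketch(c(2, 4, 6, NA))
```

Missing values are ignored when computing the mean and standard deviation but remain missing in the result, so the output has the same length as the input.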
Method to store variables in a dataset in Radiant
store(dataset, object = "deprecated", ...)
dataset |
Dataset |
object |
Object of relevant class that has information to be stored |
... |
Additional arguments |
Deprecated: Store method for the explore function
## S3 method for class 'explore'
store(dataset, object, name, ...)
dataset |
Dataset |
object |
Return value from |
name |
Name to assign to the dataset |
... |
further arguments passed to or from other methods |
Return the summarized data. See https://radiant-rstats.github.io/docs/data/explore.html for an example in Radiant
explore to generate summaries
Deprecated: Store method for the pivotr function
## S3 method for class 'pivotr'
store(dataset, object, name, ...)
dataset |
Dataset |
object |
Return value from |
name |
Name to assign to the dataset |
... |
further arguments passed to or from other methods |
Return the summarized data. See https://radiant-rstats.github.io/docs/data/pivotr.html for an example in Radiant
pivotr to generate summaries
Work around to avoid (harmless) messages from subplot
subplot(..., margin = 0.04)
... |
Arguments to pass to the |
margin |
Default margin to use between plots |
See the subplot function in the plotly package for details (?plotly::subplot)
Summary method for the explore function
## S3 method for class 'explore'
summary(object, dec = 3, ...)
object |
Return value from |
dec |
Number of decimals to show |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/explore.html for an example in Radiant
explore to generate summaries
result <- explore(diamonds, "price:x")
summary(result)
result <- explore(diamonds, "price", byvar = "cut", fun = c("n_obs", "skew"))
summary(result)
explore(diamonds, "price:x", byvar = "color") %>% summary()
Summary method for pivotr
## S3 method for class 'pivotr'
summary(object, perc = FALSE, dec = 3, chi2 = FALSE, shiny = FALSE, ...)
object |
Return value from |
perc |
Display numbers as percentages (TRUE or FALSE) |
dec |
Number of decimals to show |
chi2 |
If TRUE calculate the chi-square statistic for the (pivot) table |
shiny |
Did the function call originate inside a shiny app |
... |
further arguments passed to or from other methods |
See https://radiant-rstats.github.io/docs/data/pivotr.html for an example in Radiant
pivotr to create the pivot-table using dplyr
pivotr(diamonds, cvars = "cut") %>% summary(chi2 = TRUE)
pivotr(diamonds, cvars = "cut", tabsort = "desc(n_obs)") %>% summary()
pivotr(diamonds, cvars = "cut", tabfilt = "n_obs > 700") %>% summary()
pivotr(diamonds, cvars = "cut:clarity", nvar = "price") %>% summary()
Super heroes
data(superheroes)
A data frame with 7 rows and 4 variables
List of super heroes from https://stat545.com/join-cheatsheet.html. The dataset is used to illustrate data merging / joining. Description provided in attr(superheroes,"description")
Create data.frame from a table
table2data(dataset, freq = tail(colnames(dataset), 1))
dataset |
Data.frame |
freq |
Column name with frequency information |
data.frame(price = c("$200", "$300"), sale = c(10, 2)) %>% table2data()
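Turning a frequency table back into one row per observation amounts to repeating each row as many times as its frequency. A base-R sketch with a hypothetical helper, `expand_freq`, which assumes that the frequency column is dropped from the result:

```r
# Hypothetical sketch: repeat each row `freq` times and drop the count column
expand_freq <- function(dataset, freq = tail(colnames(dataset), 1)) {
  idx <- rep(seq_len(nrow(dataset)), times = dataset[[freq]])
  out <- dataset[idx, setdiff(colnames(dataset), freq), drop = FALSE]
  rownames(out) <- NULL
  out
}

tab <- data.frame(price = c("$200", "$300"), sale = c(10, 2))
long <- expand_freq(tab)
```

The result has 12 rows, 10 with a price of $200 and 2 with a price of $300.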
Survival data for the Titanic
data(titanic)
A data frame with 1043 rows and 10 variables
Survival data for the Titanic. Description provided in attr(titanic,"description")
Convert characters to factors
to_fct(dataset, safx = 30, nuniq = 100, n = 100)
dataset |
Data frame |
safx |
Ratio of number of rows to number of unique values |
nuniq |
Cutoff for number of unique values |
n |
Cutoff for small dataset |
Convert columns of type character to factors based on a set of rules. By default columns will be converted for small datasets (<= 100 rows) with more rows than unique values. For larger datasets, columns are converted only when the number of unique values is <= 100 and there are 30 or more rows in the data for every unique value
tibble(a = c("a", "b"), b = c("a", "a"), c = 1:2) %>% to_fct()
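The conversion rules described above can be written out as a short base-R sketch. `chr_to_fct` is a hypothetical helper that follows the rules as stated in the description; the package's actual implementation may differ in edge cases.

```r
# Hypothetical sketch of the rules in the description
chr_to_fct <- function(dataset, safx = 30, nuniq = 100, n = 100) {
  nr <- nrow(dataset)
  convert <- function(x) {
    if (!is.character(x)) return(x)
    nu <- length(unique(x))
    # Small dataset: convert when there are more rows than unique values
    small_ok <- nr <= n && nr > nu
    # Larger dataset: few unique values and enough rows per unique value
    large_ok <- nr > n && nu <= nuniq && nr / nu >= safx
    if (small_ok || large_ok) factor(x) else x
  }
  dataset[] <- lapply(dataset, convert)
  dataset
}

df <- data.frame(a = c("a", "b"), b = c("a", "a"), c = 1:2, stringsAsFactors = FALSE)
converted <- chr_to_fct(df)
```

Column `a` stays character because it has as many unique values as rows, column `b` becomes a factor, and the integer column `c` is left untouched.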
Variance for the population
varpop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Variance for the population
varpop(rnorm(100))
Variance for proportion
varprop(x, na.rm = TRUE)
x |
Input variable |
na.rm |
If TRUE missing values are removed before calculation |
Variance for proportion
varprop(c(rep(1L, 10), rep(0L, 10)))
View data in a shiny-app
view_data( dataset, vars = "", filt = "", arr = "", rows = NULL, na.rm = FALSE, dec = 3, envir = parent.frame() )
dataset |
Data.frame or name of the dataframe to view |
vars |
Variables to show (default is all) |
filt |
Filter to apply to the specified dataset |
arr |
Expression to arrange (sort) data |
rows |
Select rows in the specified dataset |
na.rm |
Remove rows with missing values (default is FALSE) |
dec |
Number of decimals to show |
envir |
Environment to extract data from |
View, search, sort, etc. your data
See get_data and filter_data
## Not run:
view_data(mtcars)
## End(Not run)
Visualize data using ggplot2 https://ggplot2.tidyverse.org/
visualize( dataset, xvar, yvar = "", comby = FALSE, combx = FALSE, type = ifelse(is.empty(yvar), "dist", "scatter"), nrobs = -1, facet_row = ".", facet_col = ".", color = "none", fill = "none", size = "none", fillcol = "blue", linecol = "black", pointcol = "black", bins = 10, smooth = 1, fun = "mean", check = "", axes = "", alpha = 0.5, theme = "theme_gray", base_size = 11, base_family = "", labs = list(), xlim = NULL, ylim = NULL, data_filter = "", arr = "", rows = NULL, shiny = FALSE, custom = FALSE, envir = parent.frame() )
dataset |
Data to plot (data.frame or tibble) |
xvar |
One or more variables to display along the X-axis of the plot |
yvar |
Variable to display along the Y-axis of the plot (default = "") |
comby |
Combine yvars in plot (TRUE or FALSE, FALSE is the default) |
combx |
Combine xvars in plot (TRUE or FALSE, FALSE is the default) |
type |
Type of plot to create. One of Distribution ('dist'), Density ('density'), Scatter ('scatter'), Surface ('surface'), Line ('line'), Bar ('bar'), or Box-plot ('box') |
nrobs |
Number of data points to show in scatter plots (-1 for all) |
facet_row |
Create vertically arranged subplots for each level of the selected factor variable |
facet_col |
Create horizontally arranged subplots for each level of the selected factor variable |
color |
Adds color to a scatter plot to generate a 'heat map'. For a line plot one line is created for each group and each is assigned a different color |
fill |
Display bar, distribution, and density plots by group, each with a different color. Also applied to surface plots to generate a 'heat map' |
size |
Numeric variable used to scale the size of scatter-plot points |
fillcol |
Color used for bars, boxes, etc. when no color or fill variable is specified |
linecol |
Color for lines when no color variable is specified |
pointcol |
Color for points when no color variable is specified |
bins |
Number of bins used for a histogram (1 - 50) |
smooth |
Adjust the flexibility of the loess line for scatter plots |
fun |
Set the summary measure for line and bar plots when the X-variable is a factor (default is "mean"). Also used to plot an error bar in a scatter plot when the X-variable is a factor. Options are "mean" and/or "median" |
check |
Add a regression line ("line"), a loess line ("loess"), or jitter ("jitter") to a scatter plot |
axes |
Flip the axes in a plot ("flip") or apply a log transformation (base e) to the y-axis ("log_y") or the x-axis ("log_x") |
alpha |
Opacity for plot elements (0 to 1) |
theme |
ggplot theme to use (e.g., "theme_gray" or "theme_classic") |
base_size |
Base font size to use (default = 11) |
base_family |
Base font family to use (e.g., "Times" or "Helvetica") |
labs |
Labels to use for plots |
xlim |
Set limit for x-axis (e.g., c(0, 1)) |
ylim |
Set limit for y-axis (e.g., c(0, 1)) |
data_filter |
Expression used to filter the dataset. This should be a string (e.g., "price > 10000") |
arr |
Expression used to sort the data. Likely used in combination with 'rows' |
rows |
Rows to select from the specified dataset |
shiny |
Logical (TRUE, FALSE) to indicate if the function call originated inside a shiny app |
custom |
Logical (TRUE, FALSE) to indicate if ggplot object (or list of ggplot objects) should be returned. This option can be used to customize plots (e.g., add a title, change x and y labels, etc.). See examples and https://ggplot2.tidyverse.org for options. |
envir |
Environment to extract data from |
See https://radiant-rstats.github.io/docs/data/visualize.html for an example in Radiant
Generated plots
visualize(diamonds, "price:cut", type = "dist", fillcol = "red") visualize(diamonds, "carat:cut", yvar = "price", type = "scatter", pointcol = "blue", fun = c("mean", "median"), linecol = c("red", "green") ) visualize(diamonds, yvar = "price", xvar = c("cut", "clarity"), type = "bar", fun = "median" ) visualize(diamonds, yvar = "price", xvar = c("cut", "clarity"), type = "line", fun = "max" ) visualize(diamonds, yvar = "price", xvar = "carat", type = "scatter", size = "table", custom = TRUE ) + scale_size(range = c(1, 10), guide = "none") visualize(diamonds, yvar = "price", xvar = "carat", type = "scatter", custom = TRUE) + labs(title = "A scatterplot", x = "price in $") visualize(diamonds, xvar = "price:carat", custom = TRUE) %>% wrap_plots(ncol = 2) + plot_annotation(title = "Histograms") visualize(diamonds, xvar = "cut", yvar = "price", type = "bar", facet_row = "cut", fill = "cut" )
Add ordered argument to lubridate::wday
wday(x, label = FALSE, abbr = TRUE, ordered = FALSE)
x |
Input date vector |
label |
Weekday as label (TRUE, FALSE) |
abbr |
Abbreviate label (TRUE, FALSE) |
ordered |
Order factor (TRUE, FALSE) |
See the lubridate::wday() function in the lubridate package for additional details
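To illustrate what the label, abbr, and ordered arguments control, here is a minimal base R sketch. The helper name wday_sketch is hypothetical and this is not the package's implementation: it assumes weeks start on Sunday, uses fixed English labels, and ignores lubridate's locale and week-start settings.

```r
# Hypothetical illustration only; not the radiant.data or lubridate code.
wday_sketch <- function(x, label = FALSE, abbr = TRUE, ordered = FALSE) {
  # as.POSIXlt()$wday counts 0 (Sunday) to 6 (Saturday); shift to 1-7
  idx <- as.POSIXlt(x)$wday + 1L
  if (!label) {
    return(idx)
  }
  # Fixed English labels (assumption: no locale handling)
  labs <- c(
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday"
  )
  if (abbr) labs <- substr(labs, 1, 3)
  # 'ordered' decides whether the result is an ordered or a plain factor
  factor(labs[idx], levels = labs, ordered = ordered)
}

d <- as.Date("2024-10-24") # a Thursday
wday_sketch(d)
wday_sketch(d, label = TRUE)
wday_sketch(d, label = TRUE, abbr = FALSE, ordered = TRUE)
```

An unordered factor is often easier to use as a predictor in a model, since ordered factors get polynomial contrasts by default in R.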
Weighted standard deviation
weighted.sd(x, wt, na.rm = TRUE)
x |
Numeric vector |
wt |
Numeric vector of weights |
na.rm |
Remove missing values (default is TRUE) |
Calculate weighted standard deviation
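As a rough illustration of the calculation, the sketch below computes a weighted standard deviation in base R. The helper name weighted_sd_sketch is hypothetical, and the formula is an assumption: weights are normalized to sum to 1 and no degrees-of-freedom correction is applied, which may differ from what the package does.

```r
# Hypothetical illustration only; the package's formula may differ.
weighted_sd_sketch <- function(x, wt, na.rm = TRUE) {
  if (na.rm) {
    # Drop observations where either the value or the weight is missing
    keep <- !is.na(x) & !is.na(wt)
    x <- x[keep]
    wt <- wt[keep]
  }
  wt <- wt / sum(wt)          # normalize weights so they sum to 1
  wm <- sum(wt * x)           # weighted mean
  sqrt(sum(wt * (x - wm)^2))  # square root of the weighted variance
}

weighted_sd_sketch(c(1, 2, 3), c(1, 1, 1))
weighted_sd_sketch(c(0, 4), c(1, 3))
```

With equal weights this reduces to the population (divide-by-n) standard deviation rather than the sample value returned by sd().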
Index of the maximum per row
which.pmax(...)
... |
Numeric or character vectors of the same length |
Determine the index of the maximum of the input vectors per row. Extension of which.max
Vector of rankings
See also which.max and which.pmin
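The row-wise idea can be sketched in base R by binding the inputs into columns and applying which.max to each row. The helper name which_pmax_sketch is hypothetical and this is an assumption about the approach, not the package source; it handles numeric input only.

```r
# Hypothetical illustration only; numeric inputs, ties go to the first column.
which_pmax_sketch <- function(...) {
  # cbind() recycles shorter inputs, so a scalar is compared against every row
  m <- cbind(...)
  unname(apply(m, 1, which.max))
}

which_pmax_sketch(1:10, 10:1)
# 2 2 2 2 2 1 1 1 1 1
which_pmax_sketch(2, 10:1)
# 2 2 2 2 2 2 2 2 1 1
```

Replacing which.max with which.min in the sketch gives the per-row minimum counterpart.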
which.pmax(1:10, 10:1)
which.pmax(2, 10:1)
which.pmax(mtcars)
Index of the minimum per row
which.pmin(...)
... |
Numeric or character vectors of the same length |
Determine the index of the minimum of the input vectors per row. Extension of which.min
Vector of rankings
See also which.min and which.pmax
which.pmin(1:10, 10:1)
which.pmin(2, 10:1)
which.pmin(mtcars)
Workaround to store description file together with a parquet data file
write_parquet(x, file, description = attr(x, "description"))
x |
A data frame to write to disk |
file |
Path to store parquet file |
description |
Data description |
Split a numeric variable into a number of bins and return a vector of bin numbers
xtile(x, n = 5, rev = FALSE, type = 7)
x |
Numeric variable |
n |
Number of bins to create |
rev |
Reverse the order of the bin numbers |
type |
An integer between 1 and 9 to select one of the quantile algorithms described in the help for the stats::quantile function |
See quantile for a description of the different algorithm types
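One way to picture the binning is to compute n + 1 quantile breakpoints and assign each value to the interval it falls in. The sketch below does this in base R with stats::quantile and .bincode. The helper name xtile_sketch is hypothetical; this is an assumption about the approach rather than the package source, and it omits input validation.

```r
# Hypothetical illustration only; no input checks.
xtile_sketch <- function(x, n = 5, rev = FALSE, type = 7) {
  # n bins need n + 1 breakpoints, from the minimum (0%) to the maximum (100%)
  breaks <- stats::quantile(
    x,
    probs = seq(0, 1, length.out = n + 1),
    na.rm = TRUE, type = type
  )
  # Assign each value to a bin; include.lowest keeps the minimum in bin 1
  bins <- .bincode(x, breaks, include.lowest = TRUE)
  # Reversing maps bin 1 to n, bin 2 to n - 1, and so on
  if (rev) as.integer((n + 1L) - bins) else bins
}

xtile_sketch(1:10, 5)
# 1 1 2 2 3 3 4 4 5 5
xtile_sketch(1:10, 5, rev = TRUE)
# 5 5 4 4 3 3 2 2 1 1
```

The type argument is passed straight through to quantile, so different algorithm types can shift the breakpoints and therefore the bin assignments near the boundaries.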
xtile(1:10, 5)
xtile(1:10, 5, rev = TRUE)
xtile(c(rep(1, 6), 7:10), 5)