Title: | Data Analysis Tools Customized for TNTP |
---|---|
Description: | An assortment of functions and templates customized to meet the needs of data analysts at the non-profit organization TNTP. Includes functions for branded colors and plots, credentials management, repository set-up, and other common analytic tasks. |
Authors: | Dustin Pashouwer [aut, cre], Sam Firke [aut], Shane Orr [aut], Sam Talcott [aut] |
Maintainer: | Dustin Pashouwer <[email protected]> |
License: | CC BY 4.0 |
Version: | 1.2.0 |
Built: | 2024-11-26 20:49:58 UTC |
Source: | CRAN |
Title
.onAttach(libname, pkgname)
libname |
library name |
pkgname |
package name |
Takes a user supplied data frame and turns the designated column into an N bar chart (uses position dodge from ggplot2).
bar_chart_counts( df, var, group_var = NULL, labels = "n", var_color = "green", group_colors = NULL, title = NULL, var_label = NULL, digits = 1, font = "Halyard Display", font_size = 12 )
df |
the data.frame to be used in the bar chart |
var |
unquoted column name for variable to count |
group_var |
(optional) unquoted column name for group variable. If this is specified, you get a 2-variable clustered bar chart. If left blank, a single variable bar chart. |
labels |
should labels show the count ("n") or the percentage ("pct")? |
var_color |
color for non-grouped charts; set to TNTP green by default. For both this and |
group_colors |
character vector of group colors, if a specific palette is desired |
title |
main chart title |
var_label |
label for x-axis |
digits |
integer indicating the number of decimal places to be used in percentages. In truncating, ties are rounded up, like in MS Excel, i.e., 10.5 and 11.5 become 11 and 12. This is *not* base R's default behavior. |
font |
font for chart text; "Halyard Display" by default |
font_size |
size for chart text; set to 12 by default |
A ggplot object
# An N bar chart by default
# All examples use font = "sans" to avoid triggering font warnings
mtcars |>
  bar_chart_counts(var = cyl, var_color = "orange", title = "Number of mtcars by cylinder", font = "sans")

# Use a grouping variable with custom colors
mtcars |>
  bar_chart_counts(var = cyl, group_var = vs, group_colors = c("orange", "navy"), labels = "pct", title = "% of V vs. Straight engines by # of cylinders", font = "sans")
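The tie-breaking rule described for the digits argument differs from base R's round(), which rounds ties to the even digit. The following is a minimal base R sketch of that round-half-up behavior. The helper round_half_up is hypothetical and written only for illustration; it is not necessarily how the package implements its rounding.

```r
# Hypothetical helper: round ties away from zero (the "Excel" behavior)
# instead of base R's round-half-to-even.
round_half_up <- function(x, digits = 0) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5) / scale
}

round(c(10.5, 11.5))          # base R rounds ties to even: 10 12
round_half_up(c(10.5, 11.5))  # ties rounded up: 11 12
```

Values that are not exactly representable in floating point can still land on either side of a tie, so this sketch is only reliable for inputs like the ones shown.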
This function is to be run on columns treated with check_all_recode().
Takes a data.frame and range of columns containing all answer choices to a check-all-that-apply question and tabulates the results. People who did not select any choices (i.e., they did not answer the question) are omitted from the denominator. For this to make sense, the question's choices should be MECE, or there should be an NA option.
This works with an "Other" open-response text field, which will be recoded to a binary variable with check_all_recode().
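To make the denominator rule concrete, here is a rough base R sketch that tabulates columns already recoded to 1/0/NA. The data, the variable names, and the output column names (response, n, percent) are assumptions for illustration and may not match what check_all_count() actually returns.

```r
# Illustrative data in the 1 / 0 / NA form described above.
# The 4th respondent skipped the question, so every column is NA for them.
recoded <- data.frame(
  q1_1     = c(1, 1, 1, NA, 0),
  q1_2     = c(1, 1, 0, NA, 0),
  q1_3     = c(0, 0, 1, NA, 0),
  q1_other = c(0, 1, 0, NA, 1)
)

# Number of respondents selecting each choice
n_selected <- colSums(recoded, na.rm = TRUE)

# Denominator: only respondents who answered the question at all
n_respondents <- sum(rowSums(!is.na(recoded)) > 0)

result <- data.frame(
  response = names(n_selected),
  n        = unname(n_selected),
  percent  = unname(n_selected) / n_respondents
)
result
```

With five rows but only four respondents who answered, each percentage is computed out of 4 rather than 5.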
check_all_count(dat, ...)
dat |
a data.frame with survey data |
... |
unquoted column names containing the range of the answer choices. Can be specified individually, as a range, i.e., |
a data.frame with the tabulated results (n and
x <- data.frame( # 4th person didn't respond at all
  unrelated = 1:5,
  q1_1 = c("a", "a", "a", NA, NA),
  q1_2 = c("b", "b", NA, NA, NA),
  q1_3 = c(NA, NA, "c", NA, NA),
  q1_other = c(NA, "something else", NA, NA, "not any of these")
)
x |>
  check_all_recode(q1_1:q1_other) |>
  check_all_count(q1_1:q1_other)

# You can use any of the dplyr::select() helpers to identify the columns:
x |>
  check_all_recode(contains("q1")) |>
  check_all_count(contains("q1"))
Some survey software returns check-all-that-apply response columns where missing values could indicate either that the respondent skipped the question entirely, or that they did not select that particular answer choice. To count the responses properly, the cases where a respondent did not check any of the choices (i.e., they skipped the question) should not be counted in the denominator, assuming that the choices were completely exhaustive or that there was an NA option.
This function takes a data.frame and range of columns containing all answer choices to a check-all-that-apply question and updates the columns in the data.frame to contain one of three values: 1 if the choice was selected; 0 if the respondent chose another option but not this one; or NA if the respondent skipped the question (i.e., they did not select any of the choices) and thus their response is truly missing.
It also takes the single text value in each column and adds it as a label attribute to that data.frame column.
This function accommodates an open-response column, to get the correct denominator when some respondents have skipped all check variables but written something in. This passing over of the offered choices is an implicit rejection of them, not a "missing." Such a text variable will throw a warning, which may be okay, and will then be recoded into a binary 1/0 variable indicating a response. Such a text variable will be assigned the label "Other". Consider preserving the original respondent text values prior to this point as a separate column if needed.
check_all_recode() prepares the data.frame for a call to its sister function check_all_count(). The label attribute is accessed by this function.
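The three-value recoding described above can be sketched in base R as follows. The helper recode_check_all_sketch is hypothetical. It covers only the 1/0/NA logic and leaves out the label attribute and the open-response "Other" handling, so treat it as an illustration of the rule rather than the package's code.

```r
x <- data.frame(
  q1_1 = c("a", "a", "a", NA, NA),
  q1_2 = c("b", "b", NA, NA, NA),
  q1_3 = c(NA, NA, "c", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1 = selected, 0 = answered the question but did not
# select this choice, NA = skipped the whole question.
recode_check_all_sketch <- function(dat, cols) {
  answered <- unname(rowSums(!is.na(dat[cols])) > 0)
  for (col in cols) {
    selected <- !is.na(dat[[col]])
    dat[[col]] <- ifelse(answered, as.integer(selected), NA_integer_)
  }
  dat
}

recoded <- recode_check_all_sketch(x, c("q1_1", "q1_2", "q1_3"))
recoded
```

Rows 4 and 5 have no selections in any of the three columns, so they stay NA throughout; rows 1 to 3 receive a 0 wherever a choice was left unchecked.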
check_all_recode(dat, ..., set_labels = TRUE)
dat |
a data.frame with survey data |
... |
unquoted variable names containing the answer choices. Can be specified as a range, i.e., |
set_labels |
should the label attribute of the columns be over-written with the column text? Allow this to be TRUE unless there are currently label attributes you don't wish to overwrite. |
the original data.frame with the specified column range updated, and with label attributes on the questions.
x <- data.frame( # 4th person didn't respond at all
  unrelated = 1:5,
  q1_1 = c("a", "a", "a", NA, NA),
  q1_2 = c("b", "b", NA, NA, NA),
  q1_3 = c(NA, NA, "c", NA, NA),
  q1_other = c(NA, "something else", NA, NA, "not any of these")
)
x |>
  check_all_recode(q1_1:q1_other)

# You can use any of the dplyr::select() helpers to identify the columns:
x |>
  check_all_recode(contains("q1"))
Get appropriate high-contrast text colors for a vector of background colors.
This function uses the W3C contrast ratio guidance (through the colorspace::contrast_ratio() function) to determine the contrast, and will raise an error if no high-enough contrast colors can be found.
choose_text_color(bg_color, text_colors = c("black", "white"), min_ratio = 4.5)
bg_color |
a vector of colors to be used as background colors |
text_colors |
a vector of options for text colors. Defaults to "black" and "white" |
min_ratio |
Minimum contrast ratio. By default this is set to 4.5, the WCAG recommendation for regular text. |
By default, this function uses black and white as the text color options; however, custom text color options can be set with the text_colors argument.
a vector of text colors the same length as bg_color.
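For readers unfamiliar with the contrast ratio, the sketch below computes it by hand in base R using the WCAG 2.x relative luminance formula. The package itself relies on colorspace::contrast_ratio(); the helpers relative_luminance, contrast_ratio_sketch, and pick_text_color are hypothetical names, and the selection rule (take the highest-contrast candidate, stop if it is below min_ratio) is an assumption about the behavior, not a copy of the implementation.

```r
# Relative luminance of a single color, following the WCAG 2.x definition.
relative_luminance <- function(color) {
  srgb <- grDevices::col2rgb(color)[, 1] / 255
  linear <- ifelse(srgb <= 0.03928, srgb / 12.92, ((srgb + 0.055) / 1.055)^2.4)
  sum(c(0.2126, 0.7152, 0.0722) * linear)
}

# Contrast ratio between two colors: (lighter + 0.05) / (darker + 0.05).
contrast_ratio_sketch <- function(a, b) {
  lum <- c(relative_luminance(a), relative_luminance(b))
  (max(lum) + 0.05) / (min(lum) + 0.05)
}

# Hypothetical stand-in for choose_text_color(): for each background, pick the
# candidate with the highest contrast and stop if it falls below min_ratio.
pick_text_color <- function(bg_color, text_colors = c("black", "white"), min_ratio = 4.5) {
  vapply(bg_color, function(bg) {
    ratios <- vapply(text_colors, contrast_ratio_sketch, numeric(1), b = bg)
    if (max(ratios) < min_ratio) stop("No text color reaches min_ratio for ", bg)
    text_colors[which.max(ratios)]
  }, character(1), USE.NAMES = FALSE)
}

pick_text_color(c("navy", "yellow"))  # "white" "black"
```

Black on white gives the maximum possible ratio of 21, which is why a dark background such as navy resolves to white text and a light one such as yellow resolves to black.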
library(ggplot2)

fills <- tntp_palette("top2_5")
diamonds |>
  dplyr::summarize(m = mean(price), .by = cut) |>
  ggplot(aes(cut, m, fill = cut)) +
  geom_col() +
  geom_text(aes(label = scales::dollar(m), color = cut), vjust = 1.5) +
  scale_fill_manual(values = fills, guide = "none") +
  scale_color_manual(values = choose_text_color(fills), guide = "none") +
  tntp_style(family = "sans")
This list of colors has been superseded by the new brand colors and the new function tntp_colors().
colors_tntp
An object of class character of length 34.
tntp_colors()
This likert palette has been superseded by the new brand colors and the new function tntp_palette().
colors_tntp_likert
An object of class character of length 7.
tntp_palette('likert_6')
This likert palette has been superseded by the new brand colors and the new functions tntp_colors() and tntp_palette().
colors_tntp_likert_orange_to_green
An object of class character of length 7.
tntp_palette('bg_6')
This list of colors has been superseded by the new brand colors and the new function tntp_colors().
colors_tntp_palette
An object of class character of length 16.
tntp_colors()
Checks to see if a date is past the user-specified cutoff point for delineating school years, then maps to the appropriate year.
date_to_sy(date_var, last_day_of_sy = NULL)
date_var |
the date to convert. Can be a |
last_day_of_sy |
the cutoff date, after which a date is considered part of the following school year. The year of this argument does not matter. Defaults (noisily) to July 1st. |
Returns a character vector in the format of "2013 - 2014"
A character vector the same length as date_var
date_to_sy(as.Date("2014-05-05"), as.Date("2000-07-01"))
date_to_sy(as.Date("2014-07-05"), as.Date("2000-07-01"))
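The mapping from a date to a school year can be approximated in base R as shown below. The helper to_school_year and its cutoff_month/cutoff_day parameters are hypothetical, and the sketch assumes that a date falling exactly on the cutoff belongs to the earlier school year (the documentation says only that dates after the cutoff roll forward).

```r
# Hypothetical helper: map dates to "YYYY - YYYY" school-year labels.
# Dates strictly after the cutoff (July 1 by default) belong to the school
# year that starts in the same calendar year; all others to the prior one.
to_school_year <- function(date, cutoff_month = 7, cutoff_day = 1) {
  date <- as.Date(date)
  yr <- as.integer(format(date, "%Y"))
  # Encode month and day as a single comparable number, e.g. July 5 -> 705
  month_day <- as.integer(format(date, "%m")) * 100L + as.integer(format(date, "%d"))
  after_cutoff <- month_day > cutoff_month * 100 + cutoff_day
  start_year <- ifelse(after_cutoff, yr, yr - 1L)
  paste(start_year, "-", start_year + 1L)
}

to_school_year(as.Date(c("2014-05-05", "2014-07-05")))
# "2013 - 2014" "2014 - 2015"
```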
This function examines each column in a data.frame; when it finds a column composed solely of the values provided to the lvls argument, it converts that column to a factor, with levels in the order provided.
This is an alternative to calling dplyr::mutate_at with factor() and identifying the specific variables you want to transform, if you have several repeated sets of responses.
factorize_df(dat, lvls, ignore.case = NULL)
dat |
data.frame with some factor variables stored as characters. |
lvls |
The factor levels in your variable(s), in order. If you have a question whose possible responses are a subset of another question's, don't use this function; manipulate the specific columns with |
ignore.case |
Logical. If TRUE, will match without checking case, using the capitalization from the |
a data.frame the same size as dat, with factorization completed in place.
teacher_survey |>
  factorize_df(lvls = c("Strongly Disagree", "Disagree", "Somewhat Disagree", "Somewhat Agree", "Agree", "Strongly Agree"))

# prints warning due to case mismatches:
teacher_survey |>
  factorize_df(lvls = c("Strongly disagree", "Disagree", "Somewhat disagree", "Somewhat agree", "Agree", "Strongly agree"))
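The column-matching idea behind factorize_df() can be illustrated without the package. In this sketch, factorize_sketch and the survey data are hypothetical; the case-insensitive matching and the warnings of the real function are left out.

```r
# Hypothetical helper: convert every character column whose non-missing
# values all appear in `lvls` into a factor with those levels, in order.
factorize_sketch <- function(dat, lvls) {
  for (col in names(dat)) {
    values <- dat[[col]]
    non_missing <- values[!is.na(values)]
    if (is.character(values) && length(non_missing) > 0 && all(non_missing %in% lvls)) {
      dat[[col]] <- factor(values, levels = lvls)
    }
  }
  dat
}

survey <- data.frame(
  id = 1:3,
  q1 = c("Agree", "Disagree", NA),        # only valid levels: converted
  q2 = c("Agree", "Maybe", "Disagree"),   # contains "Maybe": left alone
  stringsAsFactors = FALSE
)

out <- factorize_sketch(survey, lvls = c("Disagree", "Agree"))
str(out)
```

Because q2 contains a value outside lvls, it stays a character column, which mirrors the advice above to handle subset-style questions separately.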
The Fake County synthetic panel dataset contains approximately 40,000 records comprising four years of data with roughly 10,000 teachers per year. The dataset includes information about teacher demographics, teaching assignments, salary, credentials, experience, evaluation scores, and hiring and retention status. It also includes information about school types and average student characteristics for each school. There are no real teachers in the dataset, but it is based on real data. Fake County was developed as an offshoot of the Strategic Data Project's work on human capital diagnostics for school districts and state education departments, and can be used for teaching or collaboration. The data was synthesized using the R synthpop package.
fake_county
A data frame with 39,339 rows and 38 variables:
double: Teacher ID
double: Record Is Simulated
double: School Year
double: School Code
character: School Name
double: Teacher Is Male
double: Teacher Race/Ethnicity
double: Teacher Assignment Type
double: Monthly Salary
double: Teacher Has National Board Certification
double: Teacher Is Tenured
double: Years of Teaching Experience
double: Teacher's FTE Status
double: Teacher's Highest Degree
double: Teacher Is Licensed In STEM Field
double: Evaluation Summary Observation Score
double: Evaluation Summary Student Growth Score
double: Teacher in Same School in Following Year
double: Teacher in Different School in Following Year
double: Teacher Not Teaching in Fake County Schools in Following Year
double: Teacher Is Novice First-Year Teacher
double: Teacher Did Not Teach in Fake County in Prior Year
double: School Is Elementary School
double: School Is Middle School
double: School Is High School
double: School Is Alternative School
double: School Is Regular School
double: School Is Title 1 School
double: School Is Magnet School
double: School is Vocational School
double: School Region Code
double: School Calendar Type
double: School Special Education Student Share in 2012-15
double: School Minority Student Share in 2012-15
double: School Free and Reduced Price Lunch Student Share in 2012-15
double: School ELA Test Score Average in 2012-15 (in standard deviations)
double: School Math Test Score Average in 2012-15 (in standard deviations)
double: School Enrollment in 2015
https://github.com/OpenSDP/fake-county, posted under a Creative Commons license.
Create sequential figure numbers
figureN(x)
x |
character string description of the figure |
An atomic character vector prepended with a Figure number
figureN("Distribution of cars by cylinder count")

# Inline RMarkdown code:
# `r figureN("Distribution of cars by cylinder count")`
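One common way to produce sequential figure numbers in R is a closure that keeps a counter between calls. The sketch below shows that pattern; make_figure_counter is a hypothetical name, and the "Figure N. " prefix format is an assumption, since the exact text figureN() prepends is not shown here.

```r
# Hypothetical factory: returns a function that remembers how many
# figures have been labelled so far.
make_figure_counter <- function() {
  n <- 0
  function(x) {
    n <<- n + 1  # update the counter stored in the enclosing environment
    paste0("Figure ", n, ". ", x)
  }
}

figure_label <- make_figure_counter()
first <- figure_label("Distribution of cars by cylinder count")
second <- figure_label("Fuel economy by weight")
c(first, second)
```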
Pull extension from a path
get_ext(path)
path |
file path |
The extension as a character vector
Helper function. Checks if a given family value is available, and if not returns the default font family ("sans" or user provided)
get_usable_family(family, silent = FALSE, default_family = "sans")
family |
the font family to check as a character |
silent |
logical. If TRUE doesn't raise a warning if the font family is unavailable |
default_family |
defaults to "sans", but can be set to another fallback family. |
a character of a usable font family
Call this function from inside a .R file in RStudio to insert the standard TNTP header into your active script.
header_tntp()
nothing
header_tntp()
This function will check if Segoe UI is already accessible in R and, if not, it will attempt to import it using the extrafont package.
import_segoe_ui()
nothing
import_segoe_ui()
Validate color inputs
is_color(x)
x |
a color |
TRUE if x can be interpreted as a color
Test if a string is formatted like a Sharepoint drive id
is_drive_id(x)
x |
string to test |
TRUE or FALSE
Test if a string is formatted like a Sharepoint site id
is_site_id(x)
x |
string to test |
TRUE or FALSE
Test if a string is formatted like a Sharepoint site url
is_site_url(x)
x |
string to test |
TRUE or FALSE
labelled-class columns to factors.
Deprecated. Use the as_factor() function from the haven package instead for the same functionality.
Takes a data.frame, checks for columns that are class labelled from the haven package, and converts them to factor class.
labelled_to_factors(labels_df)
labels_df |
a data.frame containing some columns of class labelled |
Returns a data.frame, the same size as labels_df
tntpr::fake_county |> haven::as_factor()
This list of palette names has been superseded by the new brand colors and new functions tntp_colors() and tntp_palette(). To see all of the new brand palettes, use show_tntp_palette().
palette_names
An object of class character of length 7.
show_tntp_palette()
This function has been superseded by tntp_colors(), which has improved functionality and includes the most recent TNTP brand colors.
This function creates user defined color palette combinations for up to eleven colors. There are nine TNTP approved colors: dark_blue, medium_blue, light_blue, green, orange, gold, dark_grey (dark_gray), medium_grey (medium_gray), light_grey (light_gray). White and black are also available.
palette_tntp(...)
... |
supply quoted color names to include in color palette |
a character vector
library(ggplot2)

pal1_tntp <- tntp_colors("green", "gold", "orange")
pal2_tntp <- tntp_colors("navy", "cerulean", "sky")

p <- ggplot(mtcars, aes(wt, mpg))
p <- p + geom_point(aes(colour = factor(cyl)))
p

# Change colors to created palette
p <- p + scale_color_manual(values = pal1_tntp)
p

g <- ggplot(mtcars, aes(factor(cyl), mean(mpg)))
g <- g + geom_bar(aes(fill = factor(cyl)), stat = "identity")
g

# Change fill to created palette
g <- g + scale_fill_manual(values = pal2_tntp)
g
This function has been superseded by tntp_palette(), which includes the new brand colors.
palette_tntp_scales(palette = palette_names)
palette |
the palette |
a character vector
colors <- tntp_palette("likert_5")
Helper function for date_to_sy. Returns a date object as is, or noisily attempts to parse a string in the form YYYY-MM-DD or MM/DD/YYYY. If the date cannot be parsed, throws an error.
parse_date(date)
date |
a character or Date vector to parse |
a Date vector, the same length as 'date'
This is a modification of ggplot2::position_stack() for creating diverging bar charts. In order to use this function, you must set a fill aesthetic (and that aesthetic should probably be a factor). This function will automatically break your chart into negative and positive values and display them in the same order as your fill levels.
position_diverge(vjust = 1, break_after = NULL, fill = FALSE, reverse = FALSE)
vjust |
Vertical adjustment for geoms that have a position (like text or points), not a dimension (like bars or areas). Set to 0 to align with the bottom, 0.5 for the middle, and 1 (the default) for the top. |
break_after |
Either an integer index or character value that represents the last positive level. The default, |
fill |
If |
reverse |
If |
library(ggplot2)

# Example data
test_df <- tibble::tribble(
  ~q, ~response, ~prop,
  'a', 'Yes', 0.25,
  'a', 'Mostly', 0.25,
  'a', 'Somewhat', 0.25,
  'a', 'Not Yet', 0.25,
  'b', 'Yes', 0.4,
  'b', 'Mostly', 0.3,
  'b', 'Somewhat', 0.2,
  'b', 'Not Yet', 0.1
) |>
  dplyr::mutate(
    response = forcats::fct_inorder(response),
    q = forcats::fct_inorder(q)
  )

# Default diverging with text
# In interactive use, this can also be run with `position = "diverge"`
test_df |>
  ggplot(aes(prop, q, fill = response)) +
  geom_col(position = position_diverge()) +
  geom_text(aes(label = scales::percent(prop)), position = position_diverge(vjust = 0.5)) +
  geom_vline(xintercept = 0) +
  tntp_style(family = "sans") +
  # Reverse legend to match horizontal bar order
  guides(fill = guide_legend(reverse = TRUE)) +
  # Adjust axis labels to be positive on both sides
  scale_x_continuous(labels = ~scales::percent(abs(.)))

# Custom breaks with the break_after parameter
test_df |>
  ggplot(aes(q, prop, fill = response)) +
  geom_col(position = position_diverge(break_after = 'Yes')) +
  geom_hline(yintercept = 0) +
  tntp_style(family = "sans") +
  # Adjust axis labels to be positive on both sides
  scale_y_continuous(labels = ~scales::percent(abs(.)))
Determine function type by extension and provided type. Handles type validation.
process_type(ext, type)
ext |
File extension (from get_ext) |
type |
User-specified type |
a type ("dataframe", "rds", or "rdata")
Calculate the proportion of non-missing values in a character vector that match the values of interest. This is a helper function for factorize_df().
prop_matching(vec, valid_strings, ignore.case = FALSE)
vec |
character vector. |
valid_strings |
the values that the variable can possibly take on. |
ignore.case |
if TRUE, ignores case in matching |
a numeric proportion between 0 and 1.
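A rough base R equivalent of this calculation is shown below. The name prop_matching_sketch is hypothetical, and the handling of case is an assumption (both sides are lowercased when ignore.case is TRUE).

```r
# Hypothetical helper: share of non-missing values found in valid_strings.
prop_matching_sketch <- function(vec, valid_strings, ignore.case = FALSE) {
  vec <- vec[!is.na(vec)]  # missing values are excluded from the denominator
  if (ignore.case) {
    vec <- tolower(vec)
    valid_strings <- tolower(valid_strings)
  }
  mean(vec %in% valid_strings)
}

responses <- c("Agree", "agree", "Maybe", NA)
exact <- prop_matching_sketch(responses, c("Agree", "Disagree"))
loose <- prop_matching_sketch(responses, c("Agree", "Disagree"), ignore.case = TRUE)
c(exact = exact, loose = loose)
```

Here one of the three non-missing values matches exactly, and two of three match once case is ignored.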
Recodes a character variable into a binary result, a two-level factor. All values matching any of the supplied character strings in the to_match vector are coded into the first level of the factor; all other values are coded into the other level. NA remains NA. The default factor labels are "Selected" and "Not selected" but these can be overridden.
This recoding is not case-sensitive; if you specify "agree" as a top-2 value, "Agree" will be counted as Top-2, and vice versa.
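The recoding rule can be reproduced in a few lines of base R. The helper recode_to_binary_sketch below is a hypothetical illustration of the behavior described (case-insensitive matching, NA preserved, matched level first), not the package's own code.

```r
# Hypothetical helper mirroring the described behavior.
recode_to_binary_sketch <- function(x,
                                    to_match = c("strongly agree", "agree"),
                                    label_matched = "Selected",
                                    label_unmatched = "Not selected") {
  # Case-insensitive comparison against the target strings
  matched <- tolower(as.character(x)) %in% tolower(to_match)
  out <- ifelse(matched, label_matched, label_unmatched)
  out[is.na(x)] <- NA  # missing input stays missing
  factor(out, levels = c(label_matched, label_unmatched))
}

agreement <- c(
  "Strongly agree", "Agree", "Somewhat agree",
  "Somewhat disagree", "Strongly disagree", "Frogs", NA
)
binary <- recode_to_binary_sketch(agreement)
table(binary, useNA = "ifany")
```

Putting the matched label first in levels is what gives the "nicer ordering" in frequency tables: the level of interest is listed ahead of the catch-all level.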
recode_to_binary( x, to_match = c("strongly agree", "agree"), label_matched = "Selected", label_unmatched = "Not selected" )
x |
the character or factor vector to be recoded |
to_match |
a character vector with the strings that should be put in the first level of the factor. Defaults to "strongly agree" and "agree" but can be overwritten. |
label_matched |
what should be the factor label of values that match the strings specified in |
label_unmatched |
what should be the factor label of values that don't match the strings specified in |
a factor variable (for nicer ordering in calls to janitor::tabyl) with values mapped to the two levels.
agreement <- c(
  "Strongly agree", "Agree", "Somewhat agree",
  "Somewhat disagree", "Strongly disagree", "Frogs", NA
)

recode_to_binary(agreement) # default values of "strongly agree" and "agree" are used for recoding
recode_to_binary(agreement,
  label_matched = "Top-2 Agree",
  label_unmatched = "Not in Top-2"
) # custom labels of factor levels
recode_to_binary(agreement, "frogs")
recode_to_binary(
  agreement,
  "frogs",
  "FROGS!!!",
  "not frogs"
) # custom matching values & labels of factor levels

freq <- c("always", "often", "sometimes", "never")
recode_to_binary(freq, "always", "always", "less than always")
These functions are deprecated. Please use scale_color_manual(values = tntp_palette(palette_name)) or scale_fill_manual(values = tntp_palette(palette_name)) instead.
scale_colour_tntp(palette = palette_names, ...)
scale_color_tntp(palette = palette_names, ...)
scale_fill_tntp(palette = palette_names, ...)
palette |
character string describing the desired palette from |
... |
other arguments to pass through to ggplot2::discrete_scale() |
a ggplot Scale object
library(ggplot2)
library(dplyr)

x <- mtcars |>
  count(cyl, am) |>
  mutate(am = as.factor(am))

ggplot(x, aes(x = cyl, y = n, fill = am)) + # you need a fill aesthetic
  geom_col() +
  scale_fill_manual(values = tntp_palette())
Internal function that calls standard formatting options for the Data Memo RMarkdown template. Moved here to keep the actual memo template cleaner and easier to use.
set_data_memo_formatting()
nothing
set_data_memo_formatting()
Create a new repository on Bitbucket, then set your working directory to that folder and run this function. It will set up the main repo folder as well as a single subfolder in which you can work on your immediate project.
You must specify the subfolder name as well as the long name associated with that project and the analyst(s) working on it. These latter two values are used to create a README.md file.
setup_repo(project_path, subfolder, proj_name, analyst_name)
project_path |
the path to the main project directory. To use the current project, use 'project_path = here::here()'. |
subfolder |
a character vector containing the concise name of a project subfolder. E.g., if the repository is named for a city, "Anywhere City", a project subfolder might be "ela_access" or "aps_talent_landscape". |
proj_name |
the longer, full name of the subfolder project. This will appear in the subfolder's README.md file. E.g., "Access to Grade-Level ELA Content Pilot." |
analyst_name |
the name(s) of the analysts currently working on the subfolder project. This will appear in the subfolder's README.md file. |
nothing
# Setting up in a temporary directory
setup_repo(
  project_path = tempdir(),
  subfolder = "ela_access",
  proj_name = "Access to Grade-Level ELA Content",
  analyst_name = "Dustin Pashouwer and Sam Firke"
)
A repository might represent a region, like "Anywhere City", or a major client or contract, like "Midwestern Charter Network". Within that repo you would have a subfolder for each analysis project. This function creates such a subfolder and populates it with folders and a README.
To use: within an existing repository on Bitbucket, set your working directory to that folder and run this function to create a sub-folder.
Use setup_repo() in a blank new repository to add the first project subfolder and create the RProject and .gitignore files. Add subsequent analysis project folders with this function.
setup_subdirectory(project_path, subfolder, proj_name, analyst_name)
project_path |
the path to the main project directory. To use the current project, use 'project_path = here::here()'. |
subfolder |
a character vector containing the concise name of a project subfolder. E.g., if the repository is named for a city, "Anywhere City", a project subfolder might be "ela_access" or "aps_talent_landscape". |
proj_name |
the longer, full name of the subfolder project. This will appear in the subfolder's README.md file. |
analyst_name |
the name(s) of the analysts currently working on the subfolder project. This will appear in the subfolder's README.md file. |
nothing
# Setting up in a temporary directory
setup_subdirectory(
  tempdir(),
  subfolder = "ela_access",
  proj_name = "Equitable Access to Grade-Level ELA",
  analyst_name = "Dustin Pashouwer and Sam Firke"
)
Write a data frame to a temporary Excel file and open it.
show_in_excel(.data)
.data |
Dataframe |
nothing
# View a data set in excel
mtcars |> show_in_excel()
Check for existence of a Sharepoint folder and offer to create it if it doesn't exist
sp_check_folder(site, drive, folder_path)
site |
ms_site object |
drive |
ms_drive object |
folder_path |
path to an item |
nothing
Wrapper around the $create_folder() method from Microsoft365R::ms_drive
sp_create_folder(folder_path, site = NULL, drive = NULL)
folder_path |
Path to the new folder |
site |
Site identifier. Can be the site name, id, URL, or an ms_site object. If no site identifier is provided, uses the stored default site if it exists. |
drive |
Drive identifier. Can be the drive name, id, or an ms_drive object. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
returns folder_path invisibly
# Set site/drive defaults
sp_defaults("Data Analytics")

# Create a folder
sp_create_folder("new/folder")
Sets default site and drive for using the other sp_*() functions.
sp_defaults(site = NULL, drive = NULL)
site |
Site identifier. Can be the site_name, site_url, site_id, or an ms_site object. If not provided, uses the current stored default site if it exists. |
drive |
Name of the drive within the site. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
No return value
sp_list(), sp_read(), sp_write(), sp_site(), sp_drive()
# Set default site
sp_defaults(site = "Data Analytics")

# List drives from the default site
sp_list_drives()

# List files/folders in the default site and drive.
# Since no default drive was added, uses the first listed drive for the site.
sp_list()
For parsing sharepoint errors (right now just path errors)
sp_error(cnd, path_string)
cnd |
condition |
path_string |
path string (from sp_string()) |
nothing
Lists site/subsite/drive/folder contents. Can be used with default site/drive set by sp_defaults() or with a specified site/drive.
sp_list() lists the contents of a Sharepoint drive or a folder.
sp_list_drives() lists the drives contained in a Sharepoint site.
sp_list_subsites() lists any subsites of the specified Sharepoint site.
sp_list_sites() lists the sites you have access to. These are the sites you are following in Sharepoint.
sp_list( folder = "", site = NULL, drive = NULL, pattern = NULL, full_names = FALSE, recursive = FALSE, include_dirs = FALSE ) sp_list_drives(site = NULL, pattern = NULL) sp_list_sites(pattern = NULL) sp_list_subsites(site = NULL, pattern = NULL)
folder |
Path to the folder. By default, lists the top-level contents of the drive. |
site |
Site identifier. Can be the site name, id, URL, or an ms_site object. If no site identifier is provided, uses the stored default site if it exists. |
drive |
Drive identifier. Can be the drive name, id, or an ms_drive object. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
pattern |
Optional regular expression. Only names which match the regular expression will be returned. |
full_names |
logical. If TRUE, the directory path is prepended to the file names to give a relative file path. If FALSE, the file names (rather than paths) are returned. |
recursive |
logical. Should the listing recurse into directories? If TRUE, full_names is also set to TRUE. |
include_dirs |
logical. Should subdirectory names be included in recursive listings? (They always are in non-recursive ones) |
A tibble with name and additional information on the relevant sites/drives/files
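The pattern, recursive, and include_dirs arguments read much like those of base R's list.files(). As a rough local analogue only (sp_list() itself needs a live Sharepoint connection, and the folder and file names below are made up for illustration), the following sketch shows how those options change a listing:

```r
# Local-filesystem analogue; this does NOT call sp_list().
root <- file.path(tempdir(), "sp_list_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "reports"), recursive = TRUE)
invisible(file.create(file.path(root, "mtcars.csv")))
invisible(file.create(file.path(root, "reports", "summary.csv")))

# Non-recursive: top-level names only; subdirectories are always listed
top_level <- list.files(root)

# Recursive: paths relative to the starting folder; directories omitted by default
all_files <- list.files(root, recursive = TRUE)

# Recursive with subdirectory names included
with_dirs <- list.files(root, recursive = TRUE, include.dirs = TRUE)

# pattern is a regular expression matched against names
csv_only <- list.files(root, pattern = "\\.csv$", recursive = TRUE)
```

Note one difference stated in the argument table above: sp_list() sets full_names to TRUE whenever recursive is TRUE, whereas list.files() keeps full.names as a separate switch.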
# List drives from the default site
sp_list_drives()

# List drives from a specific site
sp_list_drives("Data Analytics")
Read or write data to/from a Sharepoint drive. Can be used with default site/drive set by sp_defaults() or with a specified site/drive.
Currently supported file types include: .csv, .csv2, .tsv, .xls, .xlsx, .rds
These functions will attempt to use the appropriate read/write function based on the file extension; however, this can be overridden by specifying type.
The ... parameter is passed on to the appropriate reading or writing function. See the details section for more information on these functions by type.
If the folder in path does not yet exist, the user will be prompted if they would like to create it.
sp_read(path, site = NULL, drive = NULL, type = NULL, ...) sp_write(x, path, site = NULL, drive = NULL, type = NULL, ...)
path |
The location in the Sharepoint drive |
site |
Site identifier. Can be the site name, id, URL, or an ms_site object. If no site identifier is provided, uses the stored default site if it exists. |
drive |
Drive identifier. Can be the drive name, id, or an ms_drive object. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
type |
Optional. One of "dataframe" (for delimited files), "xlsx", or "rds". Uses the file extension to determine type if not provided. |
... |
Additional arguments passed on to the reading/writing function. |
x |
The object to be written |
sp_read() returns an R object as specified by type. sp_write() returns x, invisibly.
For more information on methods (shown as $__() below) see documentation on Microsoft365R::ms_drive.
".csv", ".csv2", ".tsv" are read using the $load_dataframe() method, which uses readr::read_delim().
".rds" is read using the $load_rds() method, which accepts no additional arguments.
".xls" and ".xlsx" are read using readxl::read_excel() (if installed). The function will download the excel file temporarily, then import it and delete the temporary copy.
".csv", ".csv2", ".tsv" are written using the $save_dataframe() method, which uses readr::write_delim(). The delimiter will be assumed from the extension unless provided in a delim argument.
".rds" is written using the $save_rds() method, which accepts no additional arguments.
".xlsx" is written using writexl::write_xlsx() (if installed) and then uploaded using the $upload_file() method.
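The extension-based dispatch described above can be sketched in base R. This is an illustration only, not the package's implementation: guess_sp_type() is a hypothetical helper, and the extension-to-delimiter mapping (comma for .csv, semicolon for .csv2, tab for .tsv) is an assumption based on common convention.

```r
# Hypothetical helper: pick a read/write type (and delimiter) from a file path.
# An explicit `type` or `delim` always wins over what the extension suggests.
guess_sp_type <- function(path, type = NULL, delim = NULL) {
  ext <- tolower(tools::file_ext(path))
  if (is.null(type)) {
    type <- switch(ext,
      csv = , csv2 = , tsv = "dataframe",
      xls = , xlsx = "xlsx",
      rds = "rds",
      stop("Unsupported extension '", ext, "'; supply `type` explicitly")
    )
  }
  if (type == "dataframe" && is.null(delim)) {
    # Assumed mapping; falls back to a comma for other delimited files
    delim <- switch(ext, csv2 = ";", tsv = "\t", ",")
  }
  list(type = type, delim = delim)
}

guess_sp_type("mtcars.csv")
guess_sp_type("mtcars.txt", type = "dataframe", delim = "|")
```

The second call mirrors the sp_write() example for this page, where a .txt file is handled as a delimited file only because type and delim are given explicitly.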
sp_upload(), sp_download(); $upload_file(), $download_file(), $save_rdata(), $load_rdata() from Microsoft365R::ms_drive
# Set site defaults
sp_defaults(site = "Data Analytics")

# Write a file
sp_write(mtcars, "mtcars.csv")

# Write a file, specifying type and adding additional parameters
sp_write(mtcars, "mtcars.txt", type = "dataframe", delim = "|")

# Read a file
x <- sp_read("mtcars.csv")
y <- sp_read("mtcars.txt", type = "dataframe", delim = "|")

# Save / load an .rdata file using ms_drive methods
dr <- sp_drive() # Get stored default ms_drive object
dr$save_rdata(x, y, file = "data.rdata")
dr$load_rdata("data.rdata")
Internal function for reading excel files from Sharepoint
sp_read_xlsx(path, site, drive, ...)
path |
path |
site |
ms_site object |
drive |
ms_drive object |
... |
additional arguments from sp_read() |
data read by readxl::read_excel()
Pulls the site or drive (if given) or returns the stored default. Useful if you need to use methods that aren't currently wrapped by tntpr
sp_site(site = NULL) sp_drive(drive = NULL, site = NULL)
site |
Site identifier. Can be the site name, id, URL, or an ms_site object. If no site identifier is provided, uses the stored default site if it exists. |
drive |
Drive identifier. Can be the drive name, id, or an ms_drive object. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
A Microsoft365R site object or drive object
Microsoft365R::ms_site, Microsoft365R::ms_drive
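The fallback order described for the site and drive arguments can be made concrete with a small sketch. It uses plain lists as stand-ins for ms_site / ms_drive objects; resolve_drive(), demo_defaults, and demo_drives are hypothetical names, and the case where only a drive is given is assumed to fall back to the stored default site.

```r
# Hypothetical stand-ins for a stored default and a site's drive listing
demo_defaults <- list(site = "Data Analytics", drive = "Archive")
demo_drives <- list("Data Analytics" = c("Documents", "Archive"))

resolve_drive <- function(site = NULL, drive = NULL,
                          defaults = demo_defaults, drives = demo_drives) {
  if (!is.null(drive)) {
    # Drive given: pair it with the given site, else the default site (assumed)
    if (is.null(site)) site <- defaults$site
    return(list(site = site, drive = drive))
  }
  if (!is.null(site)) {
    # Site given but no drive: use the first drive of that site
    return(list(site = site, drive = drives[[site]][1]))
  }
  # Neither given: use the stored defaults if they exist
  if (is.null(defaults$drive)) stop("No drive supplied and no default drive stored")
  list(site = defaults$site, drive = defaults$drive)
}

resolve_drive()                        # stored defaults
resolve_drive(site = "Data Analytics") # first drive of the site
resolve_drive(drive = "Documents")     # named drive on the default site
```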
# Get current default site or drive
x <- sp_site()
y <- sp_drive()

# Get specified site or drive
x <- sp_site("Data Analytics")
y <- sp_drive("Documents") # Uses stored default site
y <- sp_drive("Documents", site = "Data Analytics") # Use provided site

# Use additional methods in the site/drive objects
x$get_lists()
y$get_item_properties("Analysis Tools.docx")
Can use either a site_name or site object, a drive_name or drive_object
sp_string( site = NULL, site_name = NULL, drive = NULL, drive_name = NULL, path = NULL )
site |
site object (from sp_site()) |
site_name |
site name |
drive |
drive object (from sp_drive()) |
drive_name |
drive name |
path |
path |
a cli-formatted string
sp_upload() and sp_download() wrap the $upload_file() and $download_file() methods from the Microsoft365R::ms_drive object. In addition, sp_upload() checks for the existence of the destination folder and will prompt the user to create it if it doesn't exist.
sp_upload(src, dest = basename(src), site = NULL, drive = NULL) sp_download( src, dest = basename(src), site = NULL, drive = NULL, overwrite = FALSE )
src |
Location of source file. Either a local path (for |
dest |
Location of destination file. If not provided, uses the same file name as |
site |
Site identifier. Can be the site name, id, URL, or an ms_site object. If no site identifier is provided, uses the stored default site if it exists. |
drive |
Drive identifier. Can be the drive name, id, or an ms_drive object. If site is provided but drive is not, uses the first drive of the provided site. If neither is provided, uses the stored default drive if it exists. |
overwrite |
Should the destination file be overwritten if it exists? |
Returns dest invisibly
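The check-then-prompt behavior that sp_upload() applies to the destination folder can be illustrated on the local filesystem. This is a sketch under stated assumptions, not the package's code: ensure_folder() and its confirm argument are hypothetical, and the prompt is replaced by a function so the example runs non-interactively.

```r
# Hypothetical local analogue of "create the folder if the user agrees".
# `confirm` stands in for an interactive yes/no prompt.
ensure_folder <- function(path, confirm = function(msg) FALSE) {
  if (dir.exists(path)) return(invisible(TRUE))
  if (!confirm(sprintf("Folder '%s' does not exist. Create it?", path))) {
    return(invisible(FALSE))
  }
  dir.create(path, recursive = TRUE)
  invisible(dir.exists(path))
}

demo_root <- file.path(tempdir(), "sp_upload_demo")
unlink(demo_root, recursive = TRUE)
dest <- file.path(demo_root, "reports", "AT.docx")

declined <- ensure_folder(dirname(dest))                               # user answers no
created <- ensure_folder(dirname(dest), confirm = function(msg) TRUE)  # user answers yes
```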
sp_read(), sp_write(); $upload_file() and $download_file() from Microsoft365R::ms_drive
# Set site defaults
sp_defaults("Data Analytics")

# List files
sp_list()

# Download a document locally
sp_download("Analysis Tools.docx", "AT.docx")

# Upload a document
sp_upload("AT.docx", "Analysis Tools.docx")
Function for uploading an xls / xlsx file
sp_write_xlsx(x, path, site, drive, ...)
x |
R object |
path |
path on the Sharepoint drive |
site |
ms_site object |
drive |
ms_drive object |
... |
additional arguments passed on from sp_write() |
nothing
Helper function for factorize_df(). Returns a vector of the same length as vec, with any values that match values in new_case (ignoring case) updated to the case used in new_case.
standardize_case(vec, new_case)
vec |
The character vector you want to update |
new_case |
A character vector of correctly cased strings |
a character vector the same length as vec
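A minimal base R sketch of this kind of case standardization is shown below. It is an independent illustration, not the package source: standardize_case_sketch() is a hypothetical name, and matching is assumed to be a simple case-insensitive exact match.

```r
# Hypothetical re-implementation for illustration only
standardize_case_sketch <- function(vec, new_case) {
  # Case-insensitive lookup of each value in the reference strings
  idx <- match(tolower(vec), tolower(new_case))
  # Matched values take the reference casing; everything else is left untouched
  ifelse(is.na(idx), vec, new_case[idx])
}

responses <- c("yes", "NO", "Maybe", NA, "Yes")
fixed <- standardize_case_sketch(responses, c("Yes", "No"))
fixed
```

Values with no counterpart in new_case, including NA, pass through unchanged, so the result always has the same length as the input.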
Create sequential table numbers
tableN(x)
x |
character string description of the table |
An atomic character vector prepended with a Table number
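Sequential numbering of this kind is commonly built on a counter kept in a closure. The sketch below shows that general technique; make_table_counter() is a hypothetical name, and the "Table N: " label format is an assumption rather than the exact text tableN() produces.

```r
# Hypothetical counter factory: each call to the returned function
# increments a private counter and prepends the table number.
make_table_counter <- function() {
  n <- 0
  function(x) {
    n <<- n + 1
    paste0("Table ", n, ": ", x)
  }
}

table_n <- make_table_counter()
first <- table_n("Distribution of cars by cylinder count")
second <- table_n("Average mpg by gear")
```

Because the counter lives in the enclosing environment, captions stay in order across an RMarkdown document as long as the same function object is reused.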
tableN("Distribution of cars by cylinder count")

# Inline RMarkdown code: `r tableN("Distribution of cars by cylinder count")`
Simulated teacher survey data. Data only includes the four TNTP high expectations questions.
teacher_survey
A data frame with 5 columns and 20 rows. The five columns are a 'timing' column followed by one column for each of the four high expectations questions. Responses are on the 6-point 'strongly agree' to 'strongly disagree' scale.
simulated in 'data-raw/teacher_survey.R'
This theme is superseded by tntp_style(). A ggplot2 theme customized for TNTP aesthetics.
theme_tntp( show_legend_title = TRUE, base_size = 12, base_family = "Segoe UI", grid_color = "grey93", title_align = "center", title_color = "black", title_size = 12, subtitle_align = "center", subtitle_color = "black", subtitle_size = 12, caption_align = "right", caption_color = "black", caption_size = 12 )
show_legend_title |
logical. Should the legend title be shown? Leave as |
base_size |
base font size |
base_family |
base font family |
grid_color |
color for major gridlines |
title_align |
alignment of main title, defaults to "center"; also accepts "left" or "right" |
title_color |
color of title text |
title_size |
size of title text |
subtitle_align |
alignment of sub-title, defaults to "center"; also accepts "left" or "right" |
subtitle_color |
color of subtitle text |
subtitle_size |
size of subtitle text |
caption_align |
alignment of caption, defaults to "right"; also accepts "left" or "center" |
caption_color |
color of caption text |
caption_size |
size of caption text |
a ggplot theme object.
A ggplot2 theme with opinionated defaults and an emphasis on typography. This theme is superseded by tntp_style().
theme_tntp_2018( base_family = "Segoe UI", base_size = 11.5, plot_title_family = base_family, plot_title_size = 18, plot_title_face = "bold", plot_title_margin = 10, subtitle_family = base_family, subtitle_size = 12, subtitle_face = "plain", subtitle_margin = 15, strip_text_family = base_family, strip_text_size = 12, strip_text_face = "plain", caption_family = base_family, caption_size = 9, caption_face = "italic", caption_margin = 10, axis_text = TRUE, axis_text_size = base_size, axis_title_family = subtitle_family, axis_title_size = 9, axis_title_face = "plain", axis_title_just = "rt", plot_margin = ggplot2::margin(30, 30, 30, 30), grid_col = "grey93", grid = TRUE, axis_col = "#cccccc", axis = FALSE, ticks = FALSE )
base_family , base_size
|
base font family and size |
plot_title_family , plot_title_face , plot_title_size , plot_title_margin
|
plot title family, face, size and margin |
subtitle_family , subtitle_face , subtitle_size
|
plot subtitle family, face and size |
subtitle_margin |
plot subtitle margin bottom (single numeric value) |
strip_text_family , strip_text_face , strip_text_size
|
facet label font family, face and size |
caption_family , caption_face , caption_size , caption_margin
|
plot caption family, face, size and margin |
axis_text |
add x or y axes text? |
axis_text_size |
font size of axis text |
axis_title_family , axis_title_face , axis_title_size
|
axis title font family, face and size |
axis_title_just |
axis title font justification, one of |
plot_margin |
plot margin (specify with |
grid_col , axis_col
|
grid & axis colors; both default to |
grid |
panel grid ( |
axis |
add x or y axes? |
ticks |
ticks if |
a ggplot theme object.
theme_tntp
The function is setup in such a way that you can customize your own one by just wrapping the call and changing the parameters. See source for examples.
There are distinctions between font names and various devices. Names that work for display graphics devices and bitmap ones such as png may not work well for PostScript or PDF ones. You may need two versions of a font-based theme function for them to work in a particular situation. This situation usually only arises when using a newer font with many weights but somewhat irregular internal font name patterns.
There is an option hrbrthemes.loadfonts which, if set to TRUE, will call extrafont::loadfonts() to register non-core fonts with R PDF & PostScript devices. If you are running under Windows, the package calls the same function to register non-core fonts with the Windows graphics device.
library(ggplot2)
library(dplyr)

# seminal scatterplot
ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  labs(
    x = "Fuel efficiency (mpg)", y = "Weight (tons)",
    title = "Seminal ggplot2 scatterplot example",
    subtitle = "A plot that is only useful for demonstration purposes",
    caption = "Brought to you by the letter 'g'"
  ) +
  tntp_style(family = 'sans')

# seminal bar chart
count(mpg, class) |>
  ggplot(aes(class, n)) +
  geom_col() +
  geom_text(aes(label = n), nudge_y = 3) +
  labs(
    x = "Fuel efficiency (mpg)", y = "Weight (tons)",
    title = "Seminal ggplot2 bar chart example",
    subtitle = "A plot that is only useful for demonstration purposes",
    caption = "Brought to you by the letter 'g'"
  ) +
  tntp_style(family = 'sans') +
  theme(axis.text.y = element_blank())
Translate human friendly TNTP brand color names like "medium_blue" into accurate hex values for use in plotting. This function can also be used to show a named vector of all available TNTP brand colors and values. Use show_tntp_colors() to quickly visualize selected colors in the plot window.
For often used palettes of TNTP colors, see tntp_palette().
tntp_colors(...) show_tntp_colors( ..., pattern = NULL, labels = TRUE, borders = NULL, cex_label = 1, ncol = NULL )
... |
Supply quoted TNTP color names to return. If no colors are specified, returns all available colors. |
pattern |
Optional regular expression. If provided, will return only brand colors that match the regular expression |
labels |
Logical. Label colors with names and hex values? |
borders |
Border color for each tile. Default uses |
cex_label |
Size of printed labels, as multiplier of default size. |
ncol |
Number of columns. If not supplied, tries to be as square as possible. |
tntp_colors() returns a character vector of color codes
show_tntp_colors() returns nothing
library(ggplot2)

# Use tntp_colors() to retrieve a single color...
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = tntp_colors('green'))

# ... multiple colors ...
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  scale_color_manual(values = tntp_colors('green', 'navy', 'red'))

# ... or a list of all possible TNTP brand colors
tntp_colors()

# Use show_tntp_colors() to quickly see brand colors in the plotting window
show_tntp_colors('mint', 'moss', 'green')

# You can also use a pattern to return similar colors
show_tntp_colors(pattern = 'green')

# You can see all colors (and names) by running it with no arguments
show_tntp_colors()
A wrapper around the keyring package for secure credential management.
tntp_cred() will attempt to get a credential, and if no credential is found it will prompt you to add it (and then return it).
tntp_cred_set() will set a credential. By default it will prompt before overwriting any current credentials.
tntp_cred_list() will list all current credentials, sorted by service and username.
tntp_cred(service, username = NULL, keyring = NULL, prompt = NULL) tntp_cred_set( service = NULL, username = NULL, keyring = NULL, prompt = NULL, overwrite = NULL ) tntp_cred_list(service = NULL, keyring = NULL)
service |
The identifier for the credential you are pulling or setting |
username |
OPTIONAL. Can be used to specify different usernames for the same service |
keyring |
OPTIONAL. Can be used to specify a specific keyring |
prompt |
OPTIONAL. What text should be displayed above the input box for the key while setting? |
overwrite |
OPTIONAL. By default, |
tntp_cred() returns a stored (or newly created) credential
tntp_cred_set() returns nothing
tntp_cred_list() returns a 2-column data frame of services and usernames
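The "get it, or ask for it and then store it" flow can be sketched without keyring. The example below is a toy under clearly stated assumptions: it keeps values in a plain in-memory environment, which is NOT secure and only stands in for a real keyring; get_cred(), set_cred(), and cred_store are hypothetical names; and where tntp_cred_set() prompts before overwriting, this sketch simply raises an error unless overwrite = TRUE.

```r
# Toy in-memory store standing in for a keyring (not secure; illustration only)
cred_store <- new.env()

# Return a stored credential, or ask for it, store it, and return it.
# `ask` stands in for an interactive password prompt.
get_cred <- function(service, ask = function(prompt) stop("No credential stored")) {
  if (!exists(service, envir = cred_store, inherits = FALSE)) {
    value <- ask(paste0("Enter credential for ", service, ":"))
    assign(service, value, envir = cred_store)
  }
  get(service, envir = cred_store, inherits = FALSE)
}

# Store a credential, refusing to replace an existing one unless asked to
set_cred <- function(service, value, overwrite = FALSE) {
  if (exists(service, envir = cred_store, inherits = FALSE) && !overwrite) {
    stop("Credential already stored; set overwrite = TRUE to replace it")
  }
  assign(service, value, envir = cred_store)
  invisible(NULL)
}

token <- get_cred("DEMO_TOKEN", ask = function(prompt) "abc123") # prompts once
again <- get_cred("DEMO_TOKEN")                                   # found, no prompt
set_cred("DEMO_TOKEN", "xyz789", overwrite = TRUE)
```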
# Using tntp_cred() with qualtRics
library(qualtRics)

# If no credential is set, this command will prompt for it first
qualtrics_token <- tntp_cred("QUALTRICS_TOKEN")
qualtrics_api_credentials(api_key = qualtrics_token, base_url = 'tntp.co1.qualtrics.com')

# To overwrite your Qualtrics credential
tntp_cred_set("QUALTRICS_TOKEN")
Use or see TNTP color palettes.
tntp_palette(palette = "likert_6", reverse = FALSE) show_tntp_palette(..., reverse = FALSE, pattern = NULL)
palette |
Name of the TNTP palette you want to use. To see all available palettes, use |
reverse |
Logical. If set to |
... |
Supply quoted TNTP palette names to visualize. If no names are specified, shows all available palettes. |
pattern |
Optional regular expression. If provided, will return only palettes that match the regular expression |
tntp_palette() returns a character vector of color codes
show_tntp_palette() returns nothing
library(ggplot2)

# Use to add a common palette to a ggplot visualization
ggplot(diamonds, aes(y = color, fill = cut)) +
  geom_bar(position = "fill") +
  scale_fill_manual(values = tntp_palette('blues', reverse = TRUE))

# Use show_tntp_palette() to visualize a single or multiple palettes
show_tntp_palette('likert_7')
show_tntp_palette('bg_5', 'likert_5')

# You can use a pattern to show similar palettes
show_tntp_palette(pattern = 'top2')
show_tntp_palette(pattern = '_6')

# Or run it with no specified palettes to see all available palettes
show_tntp_palette()

# For creating a continuous color palette, use scale_color_gradient()
# along with tntp_colors():
ggplot(mtcars, aes(hp, disp, color = mpg)) +
  geom_point(size = 3) +
  scale_color_gradient(low = tntp_colors('red'), high = tntp_colors('green'))
A custom theme including TNTP fonts and other defaults for styling ggplot2 charts.
tntp_style( family = "Halyard Display", header_family = family, base_size = 28, text_color = "#222222", caption_color = "#7D7E81", show_legend_title = FALSE, show_axis_titles = FALSE, grid = FALSE, grid_color = "#CBCBCB", title_align = "left", legend_align = "left", caption_align = "right" )
family |
Base font family. Defaults to "Halyard Display". |
header_family |
Font family for title and subtitle. Defaults to the base font family. |
base_size |
Base font size. Recommended minimum value of 15. |
text_color |
Text color for titles, axes, legends, and facets. |
caption_color |
Text color for caption. |
show_legend_title |
Logical. Should the legend title be shown? Leave as |
show_axis_titles |
Which axis titles should be shown? Use |
grid |
Which grid lines should be shown? Use |
grid_color |
Grid line color. |
title_align , legend_align , caption_align
|
Alignment of title, legend, and caption. Accepts |
a ggplot theme object.
library(dplyr)
library(ggplot2)

fake_county |>
  filter(t_salary > 0) |>
  ggplot(aes(t_experience, t_salary)) +
  geom_point() +
  scale_y_continuous(labels = scales::dollar) +
  labs(
    title = "Salary Increases with Experience",
    subtitle = "With significant variation at all levels",
    x = "Years of Experience",
    caption = "Data from the Fake County Data Set"
  ) +
  tntp_style(family = 'sans', show_axis_titles = "x")

frpl_experience <- fake_county |>
  mutate(frpl_bucket = cut(sch_frpl_pct,
    breaks = c(0, 20, 40, 60, 80, 100),
    labels = c("0-20%", "20-40%", "40-60%", "60-80%", "80-100%")
  )) |>
  group_by(frpl_bucket) |>
  summarize(avg_experience = mean(t_experience, na.rm = TRUE)) |>
  mutate(
    label = as.character(round(avg_experience, digits = 1)),
    label = if_else(frpl_bucket == "0-20%",
      paste0(label, "\nYears of\nExperience"),
      label
    )
  )

frpl_experience |>
  ggplot(aes(frpl_bucket, avg_experience)) +
  geom_col(fill = if_else(frpl_experience$frpl_bucket == "60-80%",
    tntp_colors("tangerine"),
    tntp_colors("medium_gray")
  )) +
  geom_text(aes(label = label),
    nudge_y = -0.25, vjust = 1,
    color = "white", size = 5, lineheight = 1
  ) +
  labs(
    title = "High Poverty Schools have Less Experienced Teachers",
    x = "% of Student Body Receiving Free/Reduced Lunch"
  ) +
  scale_y_continuous(breaks = seq(0, 20, 4)) +
  tntp_style(
    family = "sans",
    base_size = 20,
    show_axis_titles = "x"
  )
An in-house TNTP R package. Includes tools for data manipulation, analysis, and reporting, including making TNTP-themed charts and documents. By and for TNTP data-using staff, though available to the broader public.
Maintainer: Dustin Pashouwer
Authors:
Sam Firke
Shane Orr
Sam Talcott
Updates ggplot2::geom_label and ggplot2::geom_text font defaults
update_geom_font_defaults( family = "Segoe UI", face = "plain", size = 3.5, color = "#2b2b2b" )
family , face , size , color
|
font family name, face, size and color |
nothing
# Update text geoms to use Arial font
update_geom_font_defaults(family = 'Arial')
Re-install the tntpr package from GitHub.
update_tntpr()
nothing
# Run without loading tntpr first
tntpr::update_tntpr()
A generated data set containing data on 1200 imaginary individual K-12 students in Wisconsin. They are nested within 6 schools in 3 districts. In adapting this from the source, Sam switched the school and district variables (there had been multiple districts per school) and made other minor changes, including dropping columns that I didn't understand or that didn't seem relevant (e.g., variables like "luck" that were used to calculate the reading and math scores).
wisc
A data frame with 2700 rows and 26 variables:
numeric: student's unique ID #
numeric: grade level
numeric: district code
numeric: school code
numeric: is the student white?
numeric: is the student black?
numeric: is the student Hispanic?
numeric: is the student Native-American Indian?
numeric: is the student Asian?
numeric: is the student economically-disadvantaged?
numeric: is the student female?
numeric: is the student an English Language Learner?
numeric: does the student have a learning disability?
numeric: school year
numeric: days attended
numeric: student's reading standardized test score
numeric: student's math standardized test score
factor: student's proficiency level
factor: student's single-category race
...
https://github.com/jknowles/r_tutorial_ed/, posted under a Creative Commons license. The script used to generate the data set is here, although not very well documented: https://github.com/jknowles/r_tutorial_ed/blob/master/data/simulate_data.R