Title: A Collection of Large Language Model Assistants
Description: Provides a collection of ergonomic large language model assistants designed to help you complete repetitive, hard-to-automate tasks quickly. After selecting some code, press the keyboard shortcut you've chosen to trigger the package app, select an assistant, and watch your chore be carried out. While the package ships with a number of chore helpers for R package development, users can create custom helpers just by writing some instructions in a markdown file.
Authors: Simon Couch [aut, cre]
Maintainer: Simon Couch <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2025-02-21 13:30:47 UTC
Source: CRAN
The chores addin allows users to interactively select a chore helper to
interface with the current selection. This function is not
intended to be interfaced with in regular usage of the package.
To launch the chores addin in RStudio, navigate to Addins > Chores
and/or register the addin with a shortcut via
Tools > Modify Keyboard Shortcuts > Search "Chores".
We suggest Ctrl+Alt+C (or Ctrl+Cmd+C on macOS).
.init_addin()
NULL, invisibly. Called for the side effect of launching the chores addin
and interfacing with selected text.
if (interactive()) { .init_addin() }
Users typically should not need to call this function.
Use prompt_new() to create new helpers; they will automatically be
registered with this function.
The chores addin will initialize needed helpers on-the-fly.
.init_helper(chore = NULL, .chores_chat = getOption(".chores_chat"))
chore |
The identifier for a helper prompt. By default one
of "cli", "testthat" or "roxygen",
though custom helpers can be added with |
.chores_chat |
An ellmer Chat, e.g.
|
A Helper object, which is a subclass of an ellmer chat.
# requires an API key and sets options
## Not run:
# to create a chat with claude:
.init_helper(.chores_chat = ellmer::chat_claude())

# or with OpenAI's 4o-mini:
.init_helper(.chores_chat = ellmer::chat_openai(model = "gpt-4o-mini"))

# to set OpenAI's 4o-mini as the default model powering chores, for example,
# set the following option (possibly in your .Rprofile, if you'd like
# them to persist across sessions):
options(
  .chores_chat = ellmer::chat_openai(model = "gpt-4o-mini")
)
## End(Not run)
A couple years ago, the tidyverse team began migrating to the cli R package
for raising errors, transitioning away from base R (e.g. stop()),
rlang (e.g. rlang::abort()), glue, and homegrown combinations of them.
cli's new syntax is easier to work with as a developer and more visually
pleasing as a user.
In some cases, transitioning is as simple as Finding + Replacing
rlang::abort() to cli::cli_abort(). In others, there's a mess of
ad-hoc pluralization, paste0()s, glue interpolations, and other
assorted nonsense to sort through. Total pain, especially with thousands
upon thousands of error messages thrown across the tidyverse, r-lib, and
tidymodels organizations.
The cli helper helps you convert your R package to use cli for error messages.
The system prompt for a cli helper includes something like 4,000 tokens. Add in (a generous) 100 tokens for the code that's actually highlighted and also sent off to the model and you're looking at 4,100 input tokens. The model returns roughly as many tokens as the highlighted code contains, so we'll call that 100 output tokens per refactor.
As of the time of writing (October 2024), the recommended chores model Claude Sonnet 3.5 costs $3 per million input tokens and $15 per million output tokens. So, using the recommended model, cli helpers cost around $15 for every 1,000 refactored pieces of code. GPT-4o Mini, by contrast, doesn't tend to get cli markup classes right but does return syntactically valid calls to cli functions, and it would cost around 75 cents per 1,000 refactored pieces of code.
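The estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch using only the token counts and Claude Sonnet 3.5 prices quoted in the text; the helper name cost_per_n() is hypothetical and not part of chores.

```r
# Prices in USD per million tokens, as quoted in the text (October 2024).
price_input <- 3
price_output <- 15

# Hypothetical helper: estimated cost in USD of `n` requests, given the
# number of input and output tokens per request.
cost_per_n <- function(input_tokens, output_tokens, n = 1000) {
  per_request <- (input_tokens * price_input + output_tokens * price_output) / 1e6
  per_request * n
}

# cli helper: ~4,000 system prompt tokens plus ~100 selected tokens in,
# ~100 tokens out.
cli_cost <- cost_per_n(input_tokens = 4100, output_tokens = 100)
cli_cost
```

With these inputs the estimate comes to about $13.80 per 1,000 refactors, in line with the rough "around $15" figure. Nearly all of that is the system prompt, so the cost barely moves with the size of the selection.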
This section includes a handful of examples "from the wild", all generated with the recommended model, Claude Sonnet 3.5.
At its simplest, a one-line message with a little bit of markup:
rlang::abort("`save_pred` can only be used if the initial results saved predictions.")
Returns:
cli::cli_abort("{.arg save_pred} can only be used if the initial results saved predictions.")
Some strange vector collapsing and funky line breaking:
extra_grid_params <- glue::single_quote(extra_grid_params)
extra_grid_params <- glue::glue_collapse(extra_grid_params, sep = ", ")

msg <- glue::glue(
  "The provided `grid` has the following parameter columns that have ",
  "not been marked for tuning by `tune()`: {extra_grid_params}."
)

rlang::abort(msg)
Returns:
cli::cli_abort(
  "The provided {.arg grid} has parameter columns that have not been marked for tuning by {.fn tune}: {.val {extra_grid_params}}."
)
A message that probably best lives as two separate elements:
rlang::abort(
  paste(
    "Some model parameters require finalization but there are recipe",
    "parameters that require tuning. Please use ",
    "`extract_parameter_set_dials()` to set parameter ranges ",
    "manually and supply the output to the `param_info` argument."
  )
)
Returns:
cli::cli_abort(
  c(
    "Some model parameters require finalization but there are recipe parameters that require tuning.",
    "i" = "Please use {.fn extract_parameter_set_dials} to set parameter ranges manually and supply the output to the {.arg param_info} argument."
  )
)
Gnarly ad-hoc pluralization:
msg <- "Creating pre-processing data to finalize unknown parameter"
unk_names <- pset$id[unk]
if (length(unk_names) == 1) {
  msg <- paste0(msg, ": ", unk_names)
} else {
  msg <- paste0(msg, "s: ", paste0("'", unk_names, "'", collapse = ", "))
}
rlang::inform(msg)
Returns:
cli::cli_inform(
  "Creating pre-processing data to finalize unknown parameter{?s}: {.val {unk_names}}"
)
Some paste0() wonk:
rlang::abort(paste0(
  "The workflow has arguments to be tuned that are missing some ",
  "parameter objects: ",
  paste0("'", pset$id[!params], "'", collapse = ", ")
))
Returns:
cli::cli_abort(
  "The workflow has arguments to be tuned that are missing some parameter objects: {.val {pset$id[!params]}}"
)
The model is instructed to only return a call to a cli function, so erroring code that's run conditionally can get borked:
cls <- paste(cls, collapse = " or ")
if (!fine) {
  msg <- glue::glue("Argument '{deparse(cl$x)}' should be a {cls} or NULL")
  if (!is.null(where)) {
    msg <- glue::glue(msg, " in `{where}`")
  }
  rlang::abort(msg)
}
Returns:
cli::cli_abort(
  "Argument {.code {deparse(cl$x)}} should be {?a/an} {.cls {cls}} or {.code NULL}{?in {where}}."
)
Note that {?in {where}} is not valid cli markup.
Sprintf-style statements aren't an issue:
abort(sprintf("No such '%s' function: `%s()`.", package, name))
Returns:
cli::cli_abort("No such {.pkg {package}} function: {.fn {name}}.")
Chore helpers are typically interfaced with via the chores addin. To call the cli helper directly, use:
helper_cli <- .init_helper("cli")
Then, to submit a query, run:
helper_cli$chat({x})
The chores package's prompt directory is a directory of markdown files that
is automatically registered with the chores package on package load.
directory_*() functions allow users to interface with the directory,
making new "chores" available:
directory_path() returns the path to the prompt directory.
directory_set() changes the path to the prompt directory (by setting the option .chores_dir).
directory_list() enumerates all of the different prompts that currently live in the directory (and provides clickable links to each).
directory_load() registers each of the prompts in the prompt directory with the chores package.
The prompt_*() functions allow users to conveniently create, edit,
and delete the prompts in chores' prompt directory.
directory_load(dir = directory_path())
directory_list()
directory_path()
directory_set(dir)
dir |
Path to a directory of markdown files–see |
directory_path() returns the path to the prompt directory (which is not created by default unless explicitly requested by the user).
directory_set() returns the path to the new prompt directory.
directory_list() returns the file paths of all of the prompts that currently live in the directory (and provides clickable links to each).
directory_load() returns NULL, invisibly.
Prompts are markdown files with the name chore-interface.md, where
interface is one of "replace", "prefix" or "suffix".
An example directory might look like:
/
|-- .config/
|   |-- chores/
|       |-- proofread-replace.md
|       |-- summarize-prefix.md
In that case, chores will register two custom helpers when you call library(chores).
One of them is for the "proofread" chore and will replace the selected text with
a proofread version (according to the instructions contained in the markdown
file itself). The other is for the "summarize" chore and will prefix the selected
text with a summarized version (again, according to the markdown file's
instructions). Note:
Files without a .md extension are ignored.
Files with a .md extension must contain only one hyphen in their filename, and the text following the hyphen must be one of replace, prefix, or suffix.
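The naming rules above can be sketched in base R. This is an illustration only, not chores' actual implementation: parse_prompt_file() and apply_interface() are hypothetical names, returning NULL for a malformed name is an assumption (the text does not say how such files are handled), and the "suffix" behaviour and newline separator are inferred from how "replace" and "prefix" are described.

```r
valid_interfaces <- c("replace", "prefix", "suffix")

# Hypothetical parser: split "chore-interface.md" into its two parts.
# Returns NULL for files that would not be registered under the rules above.
parse_prompt_file <- function(path) {
  file <- basename(path)
  if (!grepl("\\.md$", file)) {
    return(NULL) # files without a .md extension are ignored
  }
  stem <- sub("\\.md$", "", file)
  parts <- strsplit(stem, "-", fixed = TRUE)[[1]]
  if (length(parts) != 2) {
    return(NULL) # exactly one hyphen is required
  }
  chore_ok <- grepl("^[A-Za-z0-9]+$", parts[1]) # letters and numbers only
  interface_ok <- parts[2] %in% valid_interfaces
  if (!chore_ok || !interface_ok) {
    return(NULL)
  }
  list(chore = parts[1], interface = parts[2])
}

# Hypothetical illustration of what each interface does with a model response.
apply_interface <- function(selection, response, interface) {
  switch(
    interface,
    replace = response,
    prefix = paste(response, selection, sep = "\n"),
    suffix = paste(selection, response, sep = "\n"), # assumed by analogy
    stop("Unknown interface: ", interface)
  )
}

parse_prompt_file("proofread-replace.md")
parse_prompt_file("notes.txt")
parse_prompt_file("my-summarize-prefix.md")
```

The first call yields the chore "proofread" with the "replace" interface; the other two return NULL, one for the missing .md extension and one for the extra hyphen.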
To load custom prompts every time the package is loaded, place your
prompts in directory_path(). To change the prompt directory without
loading the package, just set the .chores_dir option with
options(.chores_dir = some_dir). To load a directory of files that's not
the prompt directory, provide a dir argument to directory_load().
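As a rough sketch of the option-based workflow, the following base R snippet writes a prompt file by hand and points the .chores_dir option at its directory without loading chores. The directory name and prompt text are made up for illustration; only the option name and the chore-interface.md naming convention come from the text.

```r
# A throwaway directory standing in for your own prompt directory.
some_dir <- file.path(tempdir(), "chores-prompts")
dir.create(some_dir, showWarnings = FALSE, recursive = TRUE)

# Write a prompt by hand, following the chore-interface.md convention.
writeLines(
  "You are a terse proofreader. Return only the corrected text.",
  file.path(some_dir, "proofread-replace.md")
)

# Point chores at the directory; this does not require loading the package.
old_opts <- options(.chores_dir = some_dir)

# Inspect what lives in the directory the option points to.
prompt_files <- list.files(getOption(".chores_dir"), pattern = "\\.md$")
prompt_files

# Restore the previous option value when done experimenting.
options(old_opts)
```

To make the setting persist across sessions, the options() call could go in your .Rprofile, as suggested elsewhere in this manual for .chores_chat.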
See the "Custom helpers" vignette, at
vignette("custom", package = "chores"), for more on adding your own
helper prompts, sharing them with others, and using prompts from others.
# choose a path for the prompt directory
tmp_dir <- withr::local_tempdir()
directory_set(tmp_dir)

# print out the current prompt directory
directory_path()

# list out prompts currently in the directory
directory_list()

# create a prompt in the prompt directory
prompt_new("boop", "replace")

# view updated list of prompts
directory_list()

# register the prompt with the package
# (this will also happen automatically on reload)
directory_load()
The chores package makes use of two notable user-facing options:
.chores_dir is the directory where helper prompts live. See the helper directory help-page for more information.
.chores_chat determines the underlying LLM powering each helper. See the "Choosing a model" section of vignette("chores", package = "chores") for more information.
The chores package provides a number of tools for working on system prompts. System prompts are what instruct helpers on how to behave and provide information to live in the models' "short-term memory."
prompt_*() functions allow users to conveniently create, edit, and remove
the prompts in chores' "prompt directory."
prompt_new() creates a new markdown file that will automatically create a helper with the specified chore, prompt, and interface on package load. Specify a contents argument to prefill with contents from a markdown file on your computer or the web.
prompt_edit() and prompt_remove() open and delete, respectively, the file that defines the given chore's system prompt.
Load the prompts you create with these functions using directory_load()
(which is automatically called when the package loads).
prompt_new(chore, interface, contents = NULL)
prompt_remove(chore)
prompt_edit(chore)
chore |
A single string giving a descriptor of the helper's functionality. Can only contain letters and numbers. |
interface |
One of |
contents |
Optional. Path to a markdown file with contents that will
"pre-fill" the file. Any file ending in |
Each prompt_*() function returns the path to the created, edited, or
removed file, invisibly.
See the directory help-page for more on working with prompts in
batch using directory_*() functions, and vignette("custom", package = "chores")
for more on sharing helper prompts and using prompts from others.
if (interactive()) {
  # create a new helper for chore `"boop"` that replaces the selected text:
  prompt_new("boop", "replace")

  # after writing a prompt, register it with the chores package with:
  directory_load()

  # after closing the file, reopen with:
  prompt_edit("boop")

  # remove the prompt (next time the package is loaded) with:
  prompt_remove("boop")

  # pull prompts from files on local drives or the web with
  # `prompt_new(contents)`. for example, here is a GitHub Gist:
  # paste0(
  #   "https://gist.githubusercontent.com/simonpcouch/",
  #   "daaa6c4155918d6f3efd6706d022e584/raw/ed1da68b3f38a25b58dd9fdc8b9c258d",
  #   "58c9b4da/summarize-prefix.md"
  # )
  #
  # press "Raw" and then supply that URL as `contents` (you don't actually
  # have to use the paste0() to write out the URL--we're just keeping
  # the characters per line under 80):
  prompt_new(
    chore = "summarize",
    interface = "prefix",
    contents = paste0(
      "https://gist.githubusercontent.com/simonpcouch/",
      "daaa6c4155918d6f3efd6706d022e584/raw/ed1da68b3f38a25b58dd9fdc8b9c258d",
      "58c9b4da/summarize-prefix.md"
    )
  )
}
The roxygen helper prefixes the selected function with a minimal roxygen2 documentation template. The helper is instructed to only generate a subset of a complete documentation entry, to be then completed by a developer:
Stub @param descriptions based on defaults and inferred types
Stub @returns entry that describes the return value as well as important errors and warnings users might encounter.
The system prompt from a roxygen helper includes something like 1,000 tokens. Add in 200 tokens for the code that's actually highlighted and also sent off to the model and you're looking at 1,200 input tokens. The model returns maybe 10 to 15 lines of relatively barebones roxygen documentation, so we'll call that 200 output tokens per entry.
As of the time of writing (October 2024), the recommended chores model Claude Sonnet 3.5 costs $3 per million input tokens and $15 per million output tokens. At 1,200 input and 200 output tokens per entry, that works out to about $0.0066 per entry, so roxygen helpers using the recommended model cost around $7 for every 1,000 generated roxygen documentation entries. GPT-4o Mini, by contrast, infers argument types correctly less often and often fails to line-break properly, but does usually return syntactically valid documentation entries, and it would cost around 20 cents per 1,000 generated roxygen documentation entries.
This section includes a handful of examples "from the wild", all generated with the recommended model, Claude Sonnet 3.5.
Documenting a function factory:
deferred_method_transform <- function(lambda_expr, transformer, eval_env) {
  transformer <- enexpr(transformer)
  force(eval_env)

  unique_id <- new_id()

  env_bind_lazy(
    generators,
    !!unique_id := inject((!!transformer)(!!lambda_expr)),
    eval.env = eval_env
  )

  inject(
    function(...) {
      (!!generators)[[!!unique_id]](self, private, ...)
    }
  )
}
Returns:
#'
#' Transform a deferred method
#'
#' @description
#' A short description...
#'
#' @param lambda_expr A lambda expression to transform.
#' @param transformer A transformer function or expression.
#' @param eval_env The environment in which to evaluate the transformer.
#'
#' @returns
#' A function that, when called, will evaluate the transformed lambda expression.
#' The returned function accepts `...` arguments which are passed to the generated function.
#'
#' @export
A function that may raise a condition:
set_default <- function(value, default, arg = caller_arg(value)) {
  if (is.null(value)) {
    if (!is_testing() || is_snapshot()) {
      cli::cli_inform("Using {.field {arg}} = {.val {default}}.")
    }
    default
  } else {
    value
  }
}
Returns:
#' Set default value
#'
#' @description
#' A short description...
#'
#' @param value A value to check.
#' @param default The default value to use if `value` is NULL.
#' @param arg Optional. The name of the argument being set.
#'
#' @returns
#' Returns `value` if it's not NULL, otherwise returns `default`.
#' Informs the user when using the default value.
#'
#' @export
A function with some tricky indexing:
find_index <- function(left, e_right) {
  if (!is.list(e_right) || !has_name(e_right, "index") || !is.numeric(e_right$index)) {
    return(NA)
  }

  matches_idx <- map_lgl(left, function(e_left) e_left$index == e_right$index)
  if (sum(matches_idx) != 1) {
    return(NA)
  }
  which(matches_idx)[[1]]
}
Returns:
#' Find matching index
#'
#' @description
#' A short description...
#'
#' @param left A list of elements, each expected to have an 'index' field.
#' @param e_right A list with an 'index' field to search for in `left`.
#'
#' @returns
#' The numeric index in `left` where `e_right$index` matches, or NA if not found
#' or if inputs are invalid. Returns NA if multiple matches are found.
#'
#' @export
Chore helpers are typically interfaced with via the chores addin. To call the roxygen helper directly, use:
helper_roxygen <- .init_helper("roxygen")
Then, to submit a query, run:
helper_roxygen$chat({x})
testthat 3.0.0 was released in 2020, bringing with it numerous changes that were both huge quality of life improvements for package developers and also highly breaking changes.
While some of the task of converting legacy unit testing code to testthat 3e is pretty straightforward, other components can be quite tedious. The testthat helper helps you transition your R package's unit tests to the third edition of testthat, namely via:
Converting to snapshot tests
Disentangling nested expectations
Transitioning from deprecated functions like expect_known_*()
The system prompt from a testthat helper includes something like 1,000 tokens. Add in (a generous) 100 tokens for the code that's actually highlighted and also sent off to the model and you're looking at 1,100 input tokens. The model returns roughly as many tokens as the highlighted code contains, so we'll call that 100 output tokens per refactor.
As of the time of writing (October 2024), the recommended chores model Claude Sonnet 3.5 costs $3 per million input tokens and $15 per million output tokens. So, using the recommended model, testthat helpers cost around $4 for every 1,000 refactored pieces of code. GPT-4o Mini, by contrast, doesn't tend to get many pieces of formatting right and often fails to line-break properly, but does usually return syntactically valid calls to testthat functions, and it would cost around 20 cents per 1,000 refactored pieces of code.
This section includes a handful of examples "from the wild", all generated with the recommended model, Claude Sonnet 3.5.
Testthat helpers convert expect_error() (and *_warning() and *_message()
and *_condition()) calls to use expect_snapshot() when there's a
regular expression present:
expect_warning(
  check_ellipses("exponentiate", "tidy", "boop", exponentiate = TRUE, quick = FALSE),
  "\\`exponentiate\\` argument is not supported in the \\`tidy\\(\\)\\` method for \\`boop\\` objects"
)
Returns:
expect_snapshot(
  .res <- check_ellipses(
    "exponentiate", "tidy", "boop", exponentiate = TRUE, quick = FALSE
  )
)
Note, as well, that intermediate results are assigned to an object so that they aren't snapshotted when their contents weren't previously tested.
Another example with multiple, redundant calls:
augment_error <- "augment is only supported for fixest models estimated with feols, feglm, or femlm"
expect_error(augment(res_fenegbin, df), augment_error)
expect_error(augment(res_feNmlm, df), augment_error)
expect_error(augment(res_fepois, df), augment_error)
Returns:
expect_snapshot(error = TRUE, augment(res_fenegbin, df))
expect_snapshot(error = TRUE, augment(res_feNmlm, df))
expect_snapshot(error = TRUE, augment(res_fepois, df))
They know about regexp = NA, which means "no error" (or warning, or message):
expect_error(
  p4_b <- check_parameters(w4, p4_a, data = mtcars),
  regex = NA
)
Returns:
expect_no_error(p4_b <- check_parameters(w4, p4_a, data = mtcars))
They also know not to adjust calls to those condition expectations when
there's a class argument present (which usually means that one is
testing a condition from another package, which should be able to change
the wording of the message without consequence):
expect_error(tidy(pca, matrix = "u"), class = "pca_error")
Returns:
expect_error(tidy(pca, matrix = "u"), class = "pca_error")
When converting non-erroring code, testthat helpers will assign intermediate results so as not to snapshot both the result and the warning:
expect_warning(
  tidy(fit, robust = TRUE),
  '"robust" argument has been deprecated'
)
Returns:
expect_snapshot(
  .res <- tidy(fit, robust = TRUE)
)
Nested expectations can generally be disentangled without issue:
expect_equal(
  fit_resamples(decision_tree(cost_complexity = 1), bootstraps(mtcars)),
  expect_warning(tune_grid(decision_tree(cost_complexity = 1), bootstraps(mtcars)))
)
Returns:
expect_snapshot({
  fit_resamples_result <- fit_resamples(decision_tree(cost_complexity = 1), bootstraps(mtcars))
  tune_grid_result <- tune_grid(decision_tree(cost_complexity = 1), bootstraps(mtcars))
})
expect_equal(fit_resamples_result, tune_grid_result)
There are also a few edits the helper knows to make to third-edition code.
For example, it transitions expect_snapshot_error() and friends to
use expect_snapshot(error = TRUE) so that the error context is snapshotted
in addition to the message itself:
expect_snapshot_error(
  fit_best(knn_pca_res, parameters = tibble(neighbors = 2))
)
Returns:
expect_snapshot(
  error = TRUE,
  fit_best(knn_pca_res, parameters = tibble(neighbors = 2))
)
Chore helpers are typically interfaced with via the chores addin. To call the testthat helper directly, use:
helper_testthat <- .init_helper("testthat")
Then, to submit a query, run:
helper_testthat$chat({x})