Title: | Tidy Integration of Large Language Models |
---|---|
Description: | A tidy interface for integrating large language model (LLM) APIs such as 'Claude', 'OpenAI', 'Groq', 'Mistral' and local models via 'Ollama' into R workflows. The package supports text and media-based interactions, interactive message history, batch request APIs, and a tidy, pipeline-oriented interface for streamlined integration into data workflows. Web services are available at <https://www.anthropic.com>, <https://openai.com>, <https://groq.com>, <https://mistral.ai/> and <https://ollama.com>. |
Authors: | Eduard Brüll [aut, cre] |
Maintainer: | Eduard Brüll <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.0 |
Built: | 2024-11-08 10:31:20 UTC |
Source: | CRAN |
This function sends a message history to the Azure OpenAI Chat Completions API and returns the assistant's reply. This function is a work in progress and not fully tested.
azure_openai( .llm, .endpoint_url = Sys.getenv("AZURE_ENDPOINT_URL"), .deployment = "gpt-4o-mini", .api_version = "2024-08-01-preview", .max_completion_tokens = NULL, .frequency_penalty = NULL, .logit_bias = NULL, .logprobs = FALSE, .top_logprobs = NULL, .presence_penalty = NULL, .seed = NULL, .stop = NULL, .stream = FALSE, .temperature = NULL, .top_p = NULL, .timeout = 60, .verbose = FALSE, .json = FALSE, .json_schema = NULL, .dry_run = FALSE, .max_tries = 3 )
.llm |
An LLMMessage object containing the message history. |
.endpoint_url |
Base URL for the API (default: Sys.getenv("AZURE_ENDPOINT_URL")). |
.deployment |
The identifier of the model that is deployed (default: "gpt-4o-mini"). |
.api_version |
Which version of the API is deployed (default: "2024-08-01-preview") |
.max_completion_tokens |
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. |
.frequency_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far. |
.logit_bias |
A named list modifying the likelihood of specified tokens appearing in the completion. |
.logprobs |
Whether to return log probabilities of the output tokens (default: FALSE). |
.top_logprobs |
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. |
.presence_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far. |
.seed |
If specified, the system will make a best effort to sample deterministically. |
.stop |
Up to 4 sequences where the API will stop generating further tokens. |
.stream |
If set to TRUE, the answer will be streamed to console as it comes (default: FALSE). |
.temperature |
What sampling temperature to use, between 0 and 2. Higher values make the output more random. |
.top_p |
An alternative to sampling with temperature, called nucleus sampling. |
.timeout |
Request timeout in seconds (default: 60). |
.verbose |
Should additional information be shown after the API call (default: FALSE). |
.json |
Should output be in JSON mode (default: FALSE). |
.json_schema |
A JSON schema object as R list to enforce the output structure (If defined has precedence over JSON mode). |
.dry_run |
If TRUE, perform a dry run and return the request object (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
A new LLMMessage
object containing the original messages plus the assistant's response.
## Not run: 
# Basic usage
msg <- llm_message("What is R programming?")
result <- azure_openai(msg)

# With custom parameters
result2 <- azure_openai(msg,
                        .deployment = "gpt-4o-mini",
                        .temperature = 0.7,
                        .max_completion_tokens = 1000)
## End(Not run)
Provides a wrapper for the openai()
function to facilitate
migration from the deprecated chatgpt()
function. This ensures backward
compatibility while allowing users to transition to the updated features.
chatgpt( .llm, .model = "gpt-4o", .max_tokens = 1024, .temperature = NULL, .top_p = NULL, .top_k = NULL, .frequency_penalty = NULL, .presence_penalty = NULL, .api_url = "https://api.openai.com/", .timeout = 60, .verbose = FALSE, .json = FALSE, .stream = FALSE, .dry_run = FALSE )
.llm |
An LLMMessage object containing the message history. |
.model |
A character string specifying the model to use. |
.max_tokens |
An integer specifying the maximum number of tokens in the response (default: 1024), mapped to the corresponding openai() argument. |
.temperature |
A numeric value for controlling randomness. This is passed directly to openai(). |
.top_p |
A numeric value for nucleus sampling, indicating the top cumulative probability mass to consider. |
.top_k |
Currently unused, as it is not supported by openai(). |
.frequency_penalty |
A numeric value that penalizes new tokens based on their frequency so far. |
.presence_penalty |
A numeric value that penalizes new tokens based on whether they appear in the text so far. |
.api_url |
Character string specifying the API URL. Defaults to the OpenAI API endpoint. |
.timeout |
An integer specifying the request timeout in seconds (default: 60). |
.verbose |
Will print additional information about the request (default: FALSE). |
.json |
Should JSON mode be used? (default: FALSE) |
.stream |
Should the response be processed as a stream? (default: FALSE) |
.dry_run |
If TRUE, the request is constructed but not actually sent. Useful for debugging and testing. (default: FALSE) |
This function is deprecated and is now a wrapper around openai()
. It is
recommended to switch to using openai()
directly in future code. The
chatgpt()
function remains available to ensure backward compatibility for
existing projects.
An LLMMessage
object with the assistant's reply.
Use openai()
instead.
## Not run: 
# Using the deprecated chatgpt() function
result <- chatgpt(.llm = llm_message("Hello, how are you?"))
## End(Not run)
This function retrieves the processing status and other details of a specified Claude batch ID from the Claude API.
check_claude_batch( .llms = NULL, .batch_id = NULL, .api_url = "https://api.anthropic.com/", .dry_run = FALSE, .max_tries = 3, .timeout = 60 )
.llms |
A list of LLMMessage objects |
.batch_id |
A manually set batch ID. |
.api_url |
Character; base URL of the Claude API (default: "https://api.anthropic.com/"). |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.timeout |
Integer specifying the request timeout in seconds (default: 60). |
A tibble with information about the status of batch processing
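A minimal status-check sketch (not from the original documentation; it assumes a batch was previously submitted with send_claude_batch() so that the returned list carries a batch_id attribute):
## Not run: 
# `batch_llms` is assumed to be the list returned by send_claude_batch()
check_claude_batch(batch_llms)

# Alternatively, check by a manually supplied batch ID (placeholder value)
check_claude_batch(.batch_id = "msgbatch_abc123")
## End(Not run)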
This function retrieves the processing status and other details of a specified OpenAI batch ID from the OpenAI Batch API.
check_openai_batch( .llms = NULL, .batch_id = NULL, .dry_run = FALSE, .max_tries = 3, .timeout = 60 )
.llms |
A list of LLMMessage objects. |
.batch_id |
A manually set batch ID. |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.timeout |
Integer specifying the request timeout in seconds (default: 60). |
A tibble with information about the status of batch processing.
Interact with Claude AI models via the Anthropic API
claude( .llm, .model = "claude-3-5-sonnet-20241022", .max_tokens = 1024, .temperature = NULL, .top_k = NULL, .top_p = NULL, .metadata = NULL, .stop_sequences = NULL, .tools = NULL, .api_url = "https://api.anthropic.com/", .verbose = FALSE, .max_tries = 3, .timeout = 60, .stream = FALSE, .dry_run = FALSE )
.llm |
An LLMMessage object containing the conversation history and system prompt. |
.model |
Character string specifying the Claude model version (default: "claude-3-5-sonnet-20241022"). |
.max_tokens |
Integer specifying the maximum number of tokens in the response (default: 1024). |
.temperature |
Numeric between 0 and 1 controlling response randomness. |
.top_k |
Integer controlling diversity by limiting the top K tokens. |
.top_p |
Numeric between 0 and 1 for nucleus sampling. |
.metadata |
List of additional metadata to include with the request. |
.stop_sequences |
Character vector of sequences that will halt response generation. |
.tools |
List of additional tools or functions the model can use. |
.api_url |
Base URL for the Anthropic API (default: "https://api.anthropic.com/"). |
.verbose |
Logical; if TRUE, displays additional information about the API call (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.timeout |
Integer specifying the request timeout in seconds (default: 60). |
.stream |
Logical; if TRUE, streams the response piece by piece (default: FALSE). |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
A new LLMMessage object containing the original messages plus Claude's response.
## Not run: 
# Basic usage
msg <- llm_message("What is R programming?")
result <- claude(msg)

# With custom parameters
result2 <- claude(msg,
                  .temperature = 0.7,
                  .max_tokens = 1000)
## End(Not run)
This function takes a data frame and converts it into an LLMMessage object
representing a conversation history. The data frame should contain specific
columns (role
and content
) with each row representing a message in the
conversation.
df_llm_message(.df)
.df |
A data frame with at least two rows and the columns role and content. |
An LLMMessage object containing the structured messages as per the input data frame.
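A minimal sketch of the expected input (the data frame below is illustrative):
# Build a conversation history from a data frame with `role` and `content` columns
chat_df <- data.frame(
  role    = c("user", "assistant", "user"),
  content = c("What is tidy data?",
              "Tidy data has one observation per row and one variable per column.",
              "Can you give an example?")
)
msg <- df_llm_message(chat_df)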
This function retrieves the results of a completed Claude batch and updates
the provided list of LLMMessage
objects with the responses. It aligns each
response with the original request using the custom_id
s generated in send_claude_batch()
.
fetch_claude_batch( .llms, .batch_id = NULL, .api_url = "https://api.anthropic.com/", .dry_run = FALSE, .max_tries = 3, .timeout = 60 )
.llms |
A list of LLMMessage objects. |
.batch_id |
Character; the unique identifier for the batch. By default this is NULL and the function will attempt to use the batch_id attribute from the .llms list. |
.api_url |
Character; the base URL for the Claude API (default: "https://api.anthropic.com/"). |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.max_tries |
Integer; maximum number of retries if the request fails (default: 3). |
.timeout |
Integer; request timeout in seconds (default: 60). |
A list of updated LLMMessage
objects, each with the assistant's response added if successful.
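A sketch of the full batch workflow (prompts and timing are illustrative; a valid Anthropic API key is assumed):
## Not run: 
# Prepare several message histories
prompts <- c("Summarize the iris dataset.", "Summarize the mtcars dataset.")
llms <- lapply(prompts, llm_message)

# Submit the batch, check its status, and fetch results once processing has ended
llms <- send_claude_batch(llms)
check_claude_batch(llms)
llms <- fetch_claude_batch(llms)
lapply(llms, get_reply)
## End(Not run)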
This function retrieves the results of a completed OpenAI batch and updates
the provided list of LLMMessage
objects with the responses. It aligns each
response with the original request using the custom_id
s generated in send_openai_batch()
.
fetch_openai_batch( .llms, .batch_id = NULL, .dry_run = FALSE, .max_tries = 3, .timeout = 60 )
.llms |
A list of LLMMessage objects. |
.batch_id |
Character; the unique identifier for the batch. By default this is NULL and the function will attempt to use the batch_id attribute from the .llms list. |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.max_tries |
Integer; maximum number of retries if the request fails (default: 3). |
.timeout |
Integer; request timeout in seconds (default: 60). |
A list of updated LLMMessage
objects, each with the assistant's response added if successful.
This function generates a callback function that processes streaming responses
from different language model APIs. The callback function is specific to the
API provided (claude
, ollama
, "mistral"
, or openai
) and processes incoming data streams,
printing the content to the console and updating a global environment for further use.
generate_callback_function(.api)
.api |
A character string indicating the API type. Supported values are "claude", "ollama", "mistral", "openai", and "groq". |
For Claude API: The function processes event and data lines, and handles the message_start
and message_stop
events to control streaming flow.
For Ollama API: The function directly parses the stream content as JSON and extracts the
message$content
field.
For OpenAI, Mistral and Groq: The function handles JSON data streams and processes content deltas.
It stops processing when the [DONE]
message is encountered.
A function that serves as a callback to handle streaming responses
from the specified API. The callback function processes the raw data, updates
the .tidyllm_stream_env$stream
object, and prints the streamed content to the console.
The function returns TRUE
if streaming should continue, and FALSE
when
streaming is finished.
Retrieves the assistant's reply as plain text from an LLMMessage
object at a specified index.
get_reply(.llm, .index = NULL)
.llm |
An LLMMessage object containing the message history. |
.index |
A positive integer for the assistant reply index to retrieve, defaulting to the last reply. |
Plain text content of the assistant's reply, or NA_character_
if no reply is available.
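For example (a sketch that assumes an OpenAI API key is configured):
## Not run: 
msg <- llm_message("Name three R packages for data wrangling.") |>
  openai()

# Text of the latest assistant reply
get_reply(msg)

# Text of the first assistant reply in the history
get_reply(msg, .index = 1)
## End(Not run)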
Retrieves and parses the assistant's reply as JSON from an LLMMessage
object at a specified index.
If the reply is not marked as JSON, attempts to extract JSON content from text.
get_reply_data(.llm, .index = NULL)
.llm |
An LLMMessage object containing the message history. |
.index |
A positive integer for the assistant reply index to retrieve, defaulting to the last reply. |
Parsed data content of the assistant's reply, or NULL
if parsing fails.
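A sketch combining JSON mode with structured retrieval (the requested fields are illustrative):
## Not run: 
msg <- llm_message("Return the three largest countries by area as JSON with fields name and area_km2.") |>
  openai(.json = TRUE)

# Parse the JSON reply into an R list
get_reply_data(msg)
## End(Not run)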
Extracts the content of a user's message from an LLMMessage
object at a specific index.
get_user_message(.llm, .index = NULL)
.llm |
An LLMMessage object containing the message history. |
.index |
A positive integer indicating which user message to retrieve. Defaults to the last user message. |
Returns the content of the user's message at the specified index. If no messages are found, returns NULL
.
This function sends a message history to the Groq Chat API and returns the assistant's reply.
groq( .llm, .model = "llama-3.2-11b-vision-preview", .max_tokens = 1024, .temperature = NULL, .top_p = NULL, .frequency_penalty = NULL, .presence_penalty = NULL, .stop = NULL, .seed = NULL, .api_url = "https://api.groq.com/", .json = FALSE, .timeout = 60, .verbose = FALSE, .stream = FALSE, .dry_run = FALSE, .max_tries = 3 )
.llm |
An LLMMessage object containing the message history. |
.model |
The identifier of the model to use (default: "llama-3.2-11b-vision-preview"). |
.max_tokens |
The maximum number of tokens that can be generated in the response (default: 1024). |
.temperature |
Controls the randomness in the model's response. Values between 0 and 2 are allowed, where higher values increase randomness (optional). |
.top_p |
Nucleus sampling parameter that controls the proportion of probability mass considered. Values between 0 and 1 are allowed (optional). |
.frequency_penalty |
Number between -2.0 and 2.0. Positive values penalize repeated tokens, reducing likelihood of repetition (optional). |
.presence_penalty |
Number between -2.0 and 2.0. Positive values encourage new topics by penalizing tokens that have appeared so far (optional). |
.stop |
One or more sequences where the API will stop generating further tokens. Can be a string or a list of strings (optional). |
.seed |
An integer for deterministic sampling. If specified, attempts to return the same result for repeated requests with identical parameters (optional). |
.api_url |
Base URL for the Groq API (default: "https://api.groq.com/"). |
.json |
Whether the response should be structured as JSON (default: FALSE). |
.timeout |
Request timeout in seconds (default: 60). |
.verbose |
If TRUE, displays additional information after the API call, including rate limit details (default: FALSE). |
.stream |
Logical; if TRUE, streams the response piece by piece (default: FALSE). |
.dry_run |
If TRUE, performs a dry run and returns the constructed request object without executing it (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
A new LLMMessage
object containing the original messages plus the assistant's response.
## Not run: 
# Basic usage
msg <- llm_message("What is Groq?")
result <- groq(msg)

# With custom parameters
result2 <- groq(msg,
                .model = "llama-3.2-vision",
                .temperature = 0.5,
                .max_tokens = 512)
## End(Not run)
This function reads an audio file and sends it to the Groq transcription API for transcription.
groq_transcribe( .audio_file, .model = "whisper-large-v3", .language = NULL, .prompt = NULL, .temperature = 0, .api_url = "https://api.groq.com/openai/v1/audio/transcriptions", .dry_run = FALSE, .verbose = FALSE, .max_tries = 3 )
.audio_file |
The path to the audio file (required). Supported formats include flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. |
.model |
The model to use for transcription (default: "whisper-large-v3"). |
.language |
The language of the input audio, in ISO-639-1 format (optional). |
.prompt |
A prompt to guide the transcription style. It should match the audio language (optional). |
.temperature |
Sampling temperature, between 0 and 1, with higher values producing more randomness (default: 0). |
.api_url |
Base URL for the API (default: "https://api.groq.com/openai/v1/audio/transcriptions"). |
.dry_run |
Logical; if TRUE, performs a dry run and returns the request object without making the API call (default: FALSE). |
.verbose |
Logical; if TRUE, rate limiting info is displayed after the API request (default: FALSE). |
.max_tries |
Maximum retries to perform the request (default: 3). |
A character vector containing the transcription.
## Not run: 
# Basic usage
groq_transcribe(.audio_file = "example.mp3")
## End(Not run)
This function initializes a named environment for storing rate limit information specific to an API. It ensures that each API's rate limit data is stored separately.
initialize_api_env(.api_name)
.api_name |
The name of the API for which to initialize or retrieve the environment |
A wrapper around get_reply()
to retrieve the most recent assistant text reply.
last_reply(.llm)
.llm |
An LLMMessage object containing the message history. |
Returns the plain text content of the most recent assistant reply, or NA_character_ if no reply is available.
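In a pipeline, last_reply() is typically the final step (a sketch assuming API access):
## Not run: 
llm_message("Explain the difference between a list and a vector in R.") |>
  claude() |>
  last_reply()
## End(Not run)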
A wrapper around get_reply_data()
to retrieve structured data from the most recent assistant reply.
last_reply_data(.llm)
.llm |
An LLMMessage object containing the message history. |
Returns the parsed data content of the most recent assistant reply, or NULL if parsing fails.
A wrapper around get_user_message()
to retrieve the most recent user message.
last_user_message(.llm)
.llm |
An LLMMessage object containing the message history. |
The content of the last user message.
Retrieves batch request details from the Claude API.
list_claude_batches( .api_url = "https://api.anthropic.com/", .limit = 20, .max_tries = 3, .timeout = 60 )
.api_url |
Base URL for the Claude API (default: "https://api.anthropic.com/"). |
.limit |
Maximum number of batches to retrieve (default: 20). |
.max_tries |
Maximum retry attempts for requests (default: 3). |
.timeout |
Request timeout in seconds (default: 60). |
A tibble with batch details: batch ID, status, creation time, expiration time, and request counts (succeeded, errored, expired, canceled).
Retrieves batch request details from the OpenAI Batch API.
list_openai_batches(.limit = 20, .max_tries = 3, .timeout = 60)
.limit |
Maximum number of batches to retrieve (default: 20). |
.max_tries |
Maximum retry attempts for requests (default: 3). |
.timeout |
Request timeout in seconds (default: 60). |
A tibble with batch details: batch ID, status, creation time, expiration time, and request counts (total, completed, failed).
This function allows the creation of a new LLMMessage object or the updating of an existing one. It can handle the addition of text prompts and various media types such as images, PDFs, text files, or plots. The function includes input validation to ensure that all provided parameters are in the correct format.
llm_message( .llm = NULL, .prompt = NULL, .role = "user", .system_prompt = "You are a helpful assistant", .imagefile = NULL, .pdf = NULL, .textfile = NULL, .capture_plot = FALSE, .f = NULL )
.llm |
An existing LLMMessage object or an initial text prompt. |
.prompt |
Text prompt to add to the message history. |
.role |
The role of the message sender, typically "user" or "assistant". |
.system_prompt |
Default system prompt if a new LLMMessage needs to be created. |
.imagefile |
Path to an image file to be attached (optional). |
.pdf |
Path to a PDF file to be attached (optional). Can be a character vector of length one (file path), or a list with |
.textfile |
Path to a text file to be read and attached (optional). |
.capture_plot |
Boolean to indicate whether a plot should be captured and attached as an image (optional). |
.f |
An R function or an object coercible to a function via |
Returns an updated or new LLMMessage object.
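A sketch of common usage patterns (file paths are placeholders):
## Not run: 
# Plain text prompt
msg <- llm_message("Describe the tidyverse in one sentence.")

# Attach an image or a PDF
msg_img <- llm_message("What is shown in this figure?", .imagefile = "figure.png")
msg_pdf <- llm_message("Summarize this document.", .pdf = "report.pdf")

# Capture the current plot and attach it as an image
plot(mtcars$wt, mtcars$mpg)
msg_plot <- llm_message("Interpret this scatterplot.", .capture_plot = TRUE)
## End(Not run)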
Large Language Model Message Class
This class manages a history of messages and media interactions intended for use with large language models. It allows for adding messages, converting messages for API usage, and printing the history in a structured format.
message_history
List to store all message interactions.
system_prompt
The system prompt used for a conversation
new()
Initializes the LLMMessage object with an optional system prompt.
LLMMessage$new(system_prompt = "You are a helpful assistant")
system_prompt
A string that sets the initial system prompt.
A new LLMMessage object.
Deep Clone of LLMMessage Object
This method creates a deep copy of the LLMMessage
object. It ensures that
all internal states, including message histories and settings, are copied
so that the original object remains unchanged when mutations are applied
to the copy. This is particularly useful for maintaining immutability in
a tidyverse-like functional programming context where functions should
not have side effects on their inputs.
clone_deep()
LLMMessage$clone_deep()
A new LLMMessage
object that is a deep copy of the original.
Add a message
Adds a message to the history. Optionally includes media.
add_message()
LLMMessage$add_message(role, content, media = NULL, json = FALSE)
role
The role of the message sender (e.g., "user", "assistant").
content
The textual content of the message.
media
Optional; media content to attach to the message.
json
Is the message a raw string that contains a JSON response?
Convert to API format
Converts the message history to a format suitable for various API calls.
to_api_format()
LLMMessage$to_api_format( api_type, cgpt_image_detail = "auto", no_system = FALSE )
api_type
The type of API (e.g., "claude","groq","openai").
cgpt_image_detail
Specific option for the ChatGPT API controlling image detail (default: "auto")
no_system
Without system prompt (default: FALSE)
A message history in the target API format.
Check whether the message history contains an image
A simple helper function to determine whether the message history contains an image. We call this function whenever we use models that do not support images, so we can post a warning to the user that images were found but not sent to the model.
has_image()
LLMMessage$has_image()
Returns TRUE if the message history contains images.
Remove a Message by Index
Removes a message from the message history at the specified index.
remove_message()
LLMMessage$remove_message(index)
index
A positive integer indicating the position of the message to remove.
The LLMMessage
object, invisibly.
print()
Prints the current message history in a structured format.
LLMMessage$print()
clone()
The objects of this class are cloneable with this method.
LLMMessage$clone(deep = FALSE)
deep
Whether to make a deep clone.
Send LLMMessage to Mistral API
mistral( .llm, .model = "mistral-large-latest", .stream = FALSE, .seed = NULL, .json = FALSE, .temperature = 0.7, .top_p = 1, .stop = NULL, .safe_prompt = FALSE, .timeout = 120, .max_tries = 3, .max_tokens = 1024, .min_tokens = NULL, .dry_run = FALSE, .verbose = FALSE )
.llm |
An LLMMessage object containing the message history. |
.model |
The model identifier to use (default: "mistral-large-latest"). |
.stream |
Whether to stream back partial progress to the console (default: FALSE). |
.seed |
The seed to use for random sampling. If set, different calls will generate deterministic results (optional). |
.json |
Whether the output should be in JSON mode (default: FALSE). |
.temperature |
Sampling temperature to use; higher values increase randomness (default: 0.7). |
.top_p |
Nucleus sampling parameter, between 0 and 1 (default: 1). |
.stop |
Stop generation if this token is detected, or if one of these tokens is detected when providing a list (optional). |
.safe_prompt |
Whether to inject a safety prompt before all conversations (default: FALSE). |
.timeout |
When the connection should time out, in seconds (default: 120). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.max_tokens |
The maximum number of tokens to generate in the completion (default: 1024). |
.min_tokens |
The minimum number of tokens to generate in the completion (optional). |
.dry_run |
If TRUE, perform a dry run and return the request object without sending it (default: FALSE). |
.verbose |
Should additional information be shown after the API call? (default: FALSE) |
Returns an updated LLMMessage
object.
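A basic usage sketch (assumes a Mistral API key is configured):
## Not run: 
msg <- llm_message("What are the advantages of the tidyverse?")
result <- mistral(msg, .temperature = 0.3, .max_tokens = 512)
last_reply(result)
## End(Not run)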
Generate Embeddings Using Mistral API
mistral_embedding( .llm, .model = "mistral-embed", .timeout = 120, .max_tries = 3, .dry_run = FALSE )
.llm |
An existing LLMMessage object (or a character vector of texts to embed) |
.model |
The embedding model identifier (default: "mistral-embed"). |
.timeout |
Timeout for the API request in seconds (default: 120). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.dry_run |
If TRUE, perform a dry run and return the request object. |
A matrix where each column corresponds to the embedding of a message in the message history.
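A sketch of embedding a character vector of texts (assumes a Mistral API key is configured):
## Not run: 
texts <- c("R is a language for statistical computing.",
           "The tidyverse is a collection of R packages.")
emb <- mistral_embedding(texts)
dim(emb)  # one column per input text
## End(Not run)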
Interact with local AI models via the Ollama API
ollama( .llm, .model = "gemma2", .stream = FALSE, .seed = NULL, .json = FALSE, .temperature = NULL, .num_ctx = 2048, .num_predict = NULL, .top_k = NULL, .top_p = NULL, .min_p = NULL, .mirostat = NULL, .mirostat_eta = NULL, .mirostat_tau = NULL, .repeat_last_n = NULL, .repeat_penalty = NULL, .tfs_z = NULL, .stop = NULL, .ollama_server = "http://localhost:11434", .timeout = 120, .keep_alive = NULL, .dry_run = FALSE )
.llm |
An LLMMessage object containing the conversation history and system prompt. |
.model |
Character string specifying the Ollama model to use (default: "gemma2") |
.stream |
Logical; whether to stream the response (default: FALSE) |
.seed |
Integer; seed for reproducible generation (default: NULL) |
.json |
Logical; whether to format response as JSON (default: FALSE) |
.temperature |
Float between 0-2; controls randomness in responses (default: NULL) |
.num_ctx |
Integer; sets the context window size (default: 2048) |
.num_predict |
Integer; maximum number of tokens to predict (default: NULL) |
.top_k |
Integer; controls diversity by limiting top tokens considered (default: NULL) |
.top_p |
Float between 0-1; nucleus sampling threshold (default: NULL) |
.min_p |
Float between 0-1; minimum probability threshold (default: NULL) |
.mirostat |
Integer (0,1,2); enables Mirostat sampling algorithm (default: NULL) |
.mirostat_eta |
Float; Mirostat learning rate (default: NULL) |
.mirostat_tau |
Float; Mirostat target entropy (default: NULL) |
.repeat_last_n |
Integer; tokens to look back for repetition (default: NULL) |
.repeat_penalty |
Float; penalty for repeated tokens (default: NULL) |
.tfs_z |
Float; tail free sampling parameter (default: NULL) |
.stop |
Character; custom stop sequence(s) (default: NULL) |
.ollama_server |
String; Ollama API endpoint (default: "http://localhost:11434") |
.timeout |
Integer; API request timeout in seconds (default: 120) |
.keep_alive |
Character; how long the Ollama model should be kept in memory after the request (default: NULL, which keeps it loaded for 5 minutes) |
.dry_run |
Logical; if TRUE, returns request object without execution (default: FALSE) |
The function provides extensive control over the generation process through various parameters:
Temperature (0-2): Higher values increase creativity, lower values make responses more focused
Top-k/Top-p: Control diversity of generated text
Mirostat: Advanced sampling algorithm for maintaining consistent complexity
Repeat penalties: Prevent repetitive text
Context window: Control how much previous conversation is considered
A new LLMMessage object containing the original messages plus the model's response
## Not run: 
llm <- llm_message("Hello, how are you?")
response <- ollama(llm, .model = "gemma2", .temperature = 0.7)

# With custom parameters
response <- ollama(llm,
                   .model = "llama2",
                   .temperature = 0.8,
                   .top_p = 0.9,
                   .num_ctx = 4096)
## End(Not run)
This function sends a request to the Ollama API to download a specified model. It can operate in a streaming mode where it provides live updates of the download status and progress, or a single response mode.
ollama_download_model(.model, .ollama_server = "http://localhost:11434")
.model |
The name of the model to download. |
.ollama_server |
The base URL of the Ollama API (default is "http://localhost:11434"). |
Generate Embeddings Using Ollama API
ollama_embedding( .llm, .model = "all-minilm", .truncate = TRUE, .ollama_server = "http://localhost:11434", .timeout = 120, .dry_run = FALSE )
.llm |
An existing LLMMessage object (or a character vector of texts to embed) |
.model |
The embedding model identifier (default: "all-minilm"). |
.truncate |
Whether to truncate inputs to fit the model's context length (default: TRUE). |
.ollama_server |
The URL of the Ollama server to be used (default: "http://localhost:11434"). |
.timeout |
Timeout for the API request in seconds (default: 120). |
.dry_run |
If TRUE, perform a dry run and return the request object. |
A matrix where each column corresponds to the embedding of a message in the message history.
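A sketch for a locally running Ollama server with the all-minilm model pulled:
## Not run: 
texts <- c("Ollama runs models locally.",
           "Embeddings map text to numeric vectors.")
emb <- ollama_embedding(texts, .model = "all-minilm")
dim(emb)  # one column per input text
## End(Not run)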
This function connects to the Ollama API and retrieves information about available models, returning it as a tibble.
ollama_list_models(.ollama_server = "http://localhost:11434")
.ollama_server |
The URL of the Ollama server to be used. |
A tibble containing model information, or NULL if no models are found.
This function sends a message history to the OpenAI Chat Completions API and returns the assistant's reply.
openai( .llm, .model = "gpt-4o", .max_completion_tokens = NULL, .frequency_penalty = NULL, .logit_bias = NULL, .logprobs = FALSE, .top_logprobs = NULL, .presence_penalty = NULL, .seed = NULL, .stop = NULL, .stream = FALSE, .temperature = NULL, .top_p = NULL, .api_url = "https://api.openai.com/", .timeout = 60, .verbose = FALSE, .json = FALSE, .json_schema = NULL, .max_tries = 3, .dry_run = FALSE, .compatible = FALSE, .api_path = "/v1/chat/completions" )
.llm |
An LLMMessage object containing the message history. |
.model |
The identifier of the model to use (default: "gpt-4o"). |
.max_completion_tokens |
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. |
.frequency_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far. |
.logit_bias |
A named list modifying the likelihood of specified tokens appearing in the completion. |
.logprobs |
Whether to return log probabilities of the output tokens (default: FALSE). |
.top_logprobs |
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. |
.presence_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far. |
.seed |
If specified, the system will make a best effort to sample deterministically. |
.stop |
Up to 4 sequences where the API will stop generating further tokens. |
.stream |
If set to TRUE, the answer will be streamed to console as it comes (default: FALSE). |
.temperature |
What sampling temperature to use, between 0 and 2. Higher values make the output more random. |
.top_p |
An alternative to sampling with temperature, called nucleus sampling. |
.api_url |
Base URL for the API (default: "https://api.openai.com/"). |
.timeout |
Request timeout in seconds (default: 60). |
.verbose |
Should additional information be shown after the API call (default: FALSE). |
.json |
Should output be in JSON mode (default: FALSE). |
.json_schema |
A JSON schema object as R list to enforce the output structure (If defined has precedence over JSON mode). |
.max_tries |
Maximum retries to perform the request (default: 3). |
.dry_run |
If TRUE, perform a dry run and return the request object (default: FALSE). |
.compatible |
If TRUE, skip API and rate-limit checks for OpenAI compatible APIs (default: FALSE). |
.api_path |
The path relative to the base .api_url where the chat completions endpoint is located (default: "/v1/chat/completions"). |
A new LLMMessage
object containing the original messages plus the assistant's response.
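A sketch of basic and schema-constrained usage (assumes an OpenAI API key is configured; the schema fields are illustrative):
## Not run: 
# Basic usage
msg <- llm_message("What is R programming?")
result <- openai(msg)

# Enforce structured output with a schema
schema <- tidyllm_schema(name = "Answer", Summary = "character")
result2 <- openai(msg, .json_schema = schema, .temperature = 0)
get_reply_data(result2)
## End(Not run)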
Generate Embeddings Using OpenAI API
openai_embedding( .llm, .model = "text-embedding-3-small", .truncate = TRUE, .timeout = 120, .dry_run = FALSE, .max_tries = 3 )
.llm |
An existing LLMMessage object (or a character vector of texts to embed) |
.model |
The embedding model identifier (default: "text-embedding-3-small"). |
.truncate |
Whether to truncate inputs to fit the model's context length (default: TRUE). |
.timeout |
Timeout for the API request in seconds (default: 120). |
.dry_run |
If TRUE, perform a dry run and return the request object. |
.max_tries |
Maximum retry attempts for requests (default: 3). |
A matrix where each column corresponds to the embedding of a message in the message history.
This internal function parses duration strings as returned by the OpenAI API
parse_duration_to_seconds(.duration_str)
.duration_str |
A duration string. |
A numeric number of seconds
This function processes a PDF file page by page. For each page, it extracts the text and converts the page into an image. It creates a list of LLMMessage objects with the text and the image for multimodal processing. Users can specify a range of pages to process and provide a custom function to generate prompts for each page.
pdf_page_batch( .pdf, .general_prompt, .system_prompt = "You are a helpful assistant", .page_range = NULL, .prompt_fn = NULL )
.pdf |
Path to the PDF file. |
.general_prompt |
A default prompt that is applied to each page if .prompt_fn is not provided. |
.system_prompt |
Optional system prompt to initialize the LLMMessage (default is "You are a helpful assistant"). |
.page_range |
A vector of two integers specifying the start and end pages to process. If NULL, all pages are processed. |
.prompt_fn |
An optional custom function that generates a prompt for each page. The function takes the page text as input and returns a string. If NULL, .general_prompt is used for all pages. |
A list of LLMMessage objects, each containing the text and image for a page.
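A sketch of processing a PDF page by page and sending the resulting messages as a batch (the file path is a placeholder):
## Not run: 
page_msgs <- pdf_page_batch(
  .pdf = "paper.pdf",
  .general_prompt = "Summarize the key points on this page.",
  .page_range = c(1, 5)
)

# Send the per-page messages as a Claude batch; fetch once processing has ended
page_msgs <- send_claude_batch(page_msgs)
page_msgs <- fetch_claude_batch(page_msgs)
lapply(page_msgs, get_reply)
## End(Not run)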
Perform an API request to interact with language models
perform_api_request( .request, .api, .stream = FALSE, .timeout = 60, .max_tries = 3, .parse_response_fn = NULL, .dry_run = FALSE )
.request |
The httr2 request object. |
.api |
The API identifier (e.g., "claude", "openai"). |
.stream |
Stream the response if TRUE. |
.timeout |
Request timeout in seconds. |
.max_tries |
Maximum retry attempts for requests (default: 3). |
.parse_response_fn |
A function to parse the assistant's reply. |
.dry_run |
If TRUE, perform a dry run and return the request object. |
A list containing the assistant's reply and response headers.
This function retrieves the rate limit details for the specified API, or for all APIs stored in the .tidyllm_rate_limit_env if no API is specified.
rate_limit_info(.api_name = NULL)
.api_name |
(Optional) The name of the API whose rate limit info you want to get. If not provided, the rate limit info for all APIs in the environment is returned. |
A tibble containing the rate limit information.
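A sketch of inspecting rate limits after a verbose call (the API name "openai" is assumed to match the identifier used internally):
## Not run: 
llm_message("Hello!") |> openai(.verbose = TRUE)

# Rate limit details for all APIs used so far
rate_limit_info()

# Or for a specific API
rate_limit_info("openai")
## End(Not run)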
Extract rate limit information from API response headers
ratelimit_from_header(.response_headers, .api)
.response_headers |
Headers from the API response |
.api |
The API type ("claude", "openai","groq") |
A list containing rate limit information
This function creates and submits a batch of messages to the Claude API for asynchronous processing.
send_claude_batch( .llms, .model = "claude-3-5-sonnet-20241022", .max_tokens = 1024, .temperature = NULL, .top_k = NULL, .top_p = NULL, .stop_sequences = NULL, .api_url = "https://api.anthropic.com/", .verbose = FALSE, .dry_run = FALSE, .overwrite = FALSE, .max_tries = 3, .timeout = 60, .id_prefix = "tidyllm_claude_req_" )
.llms |
A list of LLMMessage objects containing conversation histories. |
.model |
Character string specifying the Claude model version (default: "claude-3-5-sonnet-20241022"). |
.max_tokens |
Integer specifying the maximum tokens per response (default: 1024). |
.temperature |
Numeric between 0 and 1 controlling response randomness. |
.top_k |
Integer for diversity by limiting the top K tokens. |
.top_p |
Numeric between 0 and 1 for nucleus sampling. |
.stop_sequences |
Character vector of sequences that halt response generation. |
.api_url |
Base URL for the Claude API (default: "https://api.anthropic.com/"). |
.verbose |
Logical; if TRUE, prints a message with the batch ID (default: FALSE). |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.overwrite |
Logical; if TRUE, allows overwriting an existing batch ID associated with the request (default: FALSE). |
.max_tries |
Maximum number of retries to perform the request. |
.timeout |
Integer specifying the request timeout in seconds (default: 60). |
.id_prefix |
Character string to specify a prefix for generating custom IDs when names in .llms are missing. |
An updated and named list of .llms
with identifiers that align with batch responses, including a batch_id
attribute.
This function creates and submits a batch of messages to the OpenAI Batch API for asynchronous processing.
send_openai_batch( .llms, .model = "gpt-4o", .max_completion_tokens = NULL, .frequency_penalty = NULL, .logit_bias = NULL, .logprobs = FALSE, .top_logprobs = NULL, .presence_penalty = NULL, .seed = NULL, .stop = NULL, .temperature = NULL, .top_p = NULL, .dry_run = FALSE, .overwrite = FALSE, .json_schema = NULL, .max_tries = 3, .timeout = 60, .verbose = FALSE, .id_prefix = "tidyllm_openai_req_" )
.llms |
A list of LLMMessage objects containing conversation histories. |
.model |
Character string specifying the OpenAI model version (default: "gpt-4o"). |
.max_completion_tokens |
Integer specifying the maximum tokens per response (default: NULL). |
.frequency_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far. |
.logit_bias |
A named list modifying the likelihood of specified tokens appearing in the completion. |
.logprobs |
Whether to return log probabilities of the output tokens (default: FALSE). |
.top_logprobs |
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. |
.presence_penalty |
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far. |
.seed |
If specified, the system will make a best effort to sample deterministically. |
.stop |
Up to 4 sequences where the API will stop generating further tokens. |
.temperature |
What sampling temperature to use, between 0 and 2. Higher values make the output more random. |
.top_p |
An alternative to sampling with temperature, called nucleus sampling. |
.dry_run |
Logical; if TRUE, returns the prepared request object without executing it (default: FALSE). |
.overwrite |
Logical; if TRUE, allows overwriting an existing batch ID associated with the request (default: FALSE). |
.json_schema |
A JSON schema object as R list to enforce the output structure (default: NULL). |
.max_tries |
Maximum number of retries to perform the request (default: 3). |
.timeout |
Integer specifying the request timeout in seconds (default: 60). |
.verbose |
Logical; if TRUE, additional info about the requests is printed (default: FALSE). |
.id_prefix |
Character string to specify a prefix for generating custom IDs when names in .llms are missing. |
An updated and named list of .llms
with identifiers that align with batch responses, including a batch_id
attribute.
This function creates a JSON schema suitable for use with the API functions in tidyllm.
tidyllm_schema(name, ...)
name |
A character vector specifying the schema name. This serves as an identifier for the schema. |
... |
Named arguments where each name represents a field in the schema and each value specifies the type. Supported types include R data types such as "character", factor types written as "factor(Level1, Level2)", and arrays of a type written by appending [] (e.g., "character[]"). |
The tidyllm_schema() function is designed to make defining JSON schemas for tidyllm more concise and user-friendly. It maps R-like types to JSON schema types and validates inputs to enforce tidy data principles. Nested structures are not allowed to maintain compatibility with tidy data conventions.
A list representing the JSON schema with the specified fields and types, suitable for passing to openai()'s .json_schema parameter.
Factor types (factor(...)) are treated as enumerations in JSON and are limited to a set of allowable string values. Arrays of a given type can be specified by appending [] to the type.
## Not run: 
# Define a schema with tidy data principles
json_schema <- tidyllm_schema(
  name = "DocumentAnalysisSchema",
  Title = "character",
  Authors = "character[]",
  SuggestedFilename = "character",
  Type = "factor(Policy, Research)",
  Answer_Q1 = "character",
  Answer_Q2 = "character",
  Answer_Q3 = "character",
  Answer_Q4 = "character",
  KeyCitations = "character[]"
)

# Pass the schema to openai()
result <- openai(
  .llm = msg,
  .json_schema = json_schema
)
## End(Not run)
This function stores rate limit information from API functions for future use, initializing the API environment if needed.
update_rate_limit(.api_name, .response_object)
.api_name |
The name of the API for which to initialize or retrieve the environment. |
.response_object |
A pre-parsed response object containing info on remaining requests, tokens, and reset times |