Sentiment analysis in R with an LLM and ‘tidyprompt’

This vignette shows how to use the tidyprompt package to perform sentiment analysis in R with an LLM (large language model).

We first define a small dataset of sentences for which we want sentiment scores.

sentences_df <- data.frame(
  sentence = c(
    "I love this product!",
    "This product is terrible",
    "The customer service was excellent",
    "I am very disappointed with this product",
    "The delivery was fast and efficient",
    "I would not recommend this product to anyone",
    "It was not bad, not great either",
    "Meh",
    "It felt like walking up a mountain",
    "I am angry!!!"
  )
)

Next, we create a connection to a locally running LLM (using Ollama).

library(tidyprompt)

ollama <- llm_provider_ollama()
ollama$parameters$model <- "llama3.1:8b"

For every sentence, we now prompt the LLM for a sentiment score. We use map_int() from ‘purrr’ to iterate over the sentences. Within each iteration, we build a prompt and pass it through answer_as_integer(), which forces the LLM to answer with an integer (here between 1 and 100) and extracts that integer from its response. send_prompt() then sends the prompt to the LLM and returns the extracted value.

library(purrr)

sentences_df$sentiment_score <- map_int(
  sentences_df$sentence,
  function(sentence) {
    paste0(
      "Please provide a sentiment score for the following sentence:\n\n",
      sentence
    ) |>
      answer_as_integer(min = 1, max = 100) |>
      send_prompt(ollama, verbose = FALSE)
  }
)
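To make the "answer as an integer" idea more concrete, here is a rough base R sketch of the kind of check that answer_as_integer(min = 1, max = 100) implies: the reply must be a whole number inside the allowed range. The helper parse_integer_reply() is hypothetical and written only for illustration; it is not tidyprompt's implementation and says nothing about how tidyprompt reacts to an invalid reply.

```r
# Hypothetical helper, NOT part of tidyprompt: check that a raw LLM reply is a
# whole number within [min, max]. Returns the integer, or NA_integer_ otherwise.
parse_integer_reply <- function(reply, min = 1L, max = 100L) {
  reply <- trimws(reply)
  # Accept only an optional minus sign followed by digits, nothing else
  if (!grepl("^-?[0-9]+$", reply)) {
    return(NA_integer_)
  }
  # Very large numbers overflow to NA (with a warning we silence here)
  value <- suppressWarnings(as.integer(reply))
  if (is.na(value) || value < min || value > max) {
    return(NA_integer_)
  }
  value
}

parse_integer_reply("95")            # a valid score
parse_integer_reply(" 20\n")         # surrounding whitespace is ignored
parse_integer_reply("Very positive") # not a number
parse_integer_reply("250")           # outside the allowed range
```

In this sketch an invalid reply simply becomes NA_integer_, which keeps the example short; a real pipeline might ask the model again instead.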

sentences_df
#>                                        sentence sentiment_score
#> 1                          I love this product!              95
#> 2                      This product is terrible               3
#> 3            The customer service was excellent              95
#> 4      I am very disappointed with this product              20
#> 5           The delivery was fast and efficient              95
#> 6  I would not recommend this product to anyone               3
#> 7              It was not bad, not great either              46
#> 8                                           Meh              50
#> 9            It felt like walking up a mountain              25
#> 10                                I am angry!!!              10
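The loop above assumes that every call yields exactly one integer; map_int() stops with an error as soon as a single iteration fails or returns something else. A minimal sketch of one way to guard against that is shown below. It assumes a hypothetical stand-in, score_sentence(), in place of the real LLM call so that it runs offline, and it uses vapply() only to avoid dependencies; map_int(sentences, safe_score) would work the same way.

```r
# Hypothetical stand-in for the LLM call, so this sketch runs offline.
# It fails on purpose for one input to demonstrate the fallback.
score_sentence <- function(sentence) {
  if (sentence == "Meh") stop("no valid integer returned")
  nchar(sentence) # placeholder value, not a real sentiment score
}

# Wrap the scorer: any error, or any result that is not a single number,
# becomes NA_integer_ instead of aborting the whole loop.
safe_score <- function(sentence) {
  result <- tryCatch(score_sentence(sentence), error = function(e) NULL)
  if (is.numeric(result) && length(result) == 1) {
    as.integer(result)
  } else {
    NA_integer_
  }
}

sentences <- c("I love this product!", "Meh")
scores <- vapply(sentences, safe_score, integer(1))
```

Sentences whose score comes back as NA can then be inspected or re-sent individually, rather than rerunning the whole dataset.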

Let’s plot the results!

library(ggplot2)

ggplot(sentences_df, aes(x = sentiment_score, y = reorder(sentence, sentiment_score))) +
  geom_col(aes(fill = sentiment_score)) +
  scale_fill_gradient(low = "red", high = "green") +
  theme_minimal() +
  labs(
    title = "Sentiment scores for each sentence",
    x = "Sentiment score",
    y = "Sentence"
  )

Plot of sentiment scores