Title: | Interpreting Latent Variables with AI |
---|---|
Description: | A small package for interpreting continuous and categorical latent variables. You provide a data set containing a latent variable you want to understand, along with some explanatory variables. The package describes the latent variable in terms of the explanatory variables and suggests a name for it. |
Authors: | Nel Hervé [aut], Sébastien Lê [aut, cre] |
Maintainer: | Sébastien Lê <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.1 |
Built: | 2024-12-12 06:45:41 UTC |
Source: | CRAN |
These data were collected after a Q-method-like survey on students' expectations of agribusiness studies. Participants had to rank how much they agreed with 38 statements about possible benefits from agribusiness studies; then, they were asked personal questions.
agri_studies
A data frame with 53 rows (participants) and 42 columns (questions):
columns 1-38: statements about agribusiness studies
columns 39-42: personal information
Juliette LE COLLONNIER and Lou ROBERT, students at l'Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(agri_studies)

res_mca_agri <- FactoMineR::MCA(agri_studies, quali.sup = 39:42,
                                level.ventil = 0.05, graph = FALSE)
agri_work <- res_mca_agri$ind$coord |> as.data.frame()
agri_work <- agri_work[,1] |> cbind(agri_studies)

intro_agri <- "These data were collected after a survey on students' expectations of agribusiness studies. Participants had to rank how much they agreed with 38 statements about possible benefits from agribusiness studies; then, they were asked personal questions."
intro_agri <- gsub('\n', ' ', intro_agri) |> stringr::str_squish()

res_agri <- nail_condes(agri_work, num.var = 1, introduction = intro_agri)
cat(res_agri$response)
## End(Not run)
People think they need to make big changes to change the course of their lives. But in James Clear's book, Atomic Habits, they will discover that the smallest of changes, coupled with a good knowledge of psychology and neuroscience, can have a revolutionary effect on their lives and relationships. To understand this concept of atomic habits, we interviewed 167 people and asked them if they were able to never take their car alone again, to buy local products... We also asked them how restrictive they found this and why.
atomic_habit
A data frame with 167 rows and 50 columns:
columns 1-10, do you feel able to...
columns 11-20, from 0 to 5 how restrictive...
columns 21-30, is it restrictive, yes or no...
columns 31-40, justify your answers
columns 41-50, a combination of able and restrictive
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(FactoMineR)
library(NaileR)
data(atomic_habit)

res_mfa <- MFA(atomic_habit[,1:30], group = c(10,10,10),
               type = c("n","s","n"), num.group.sup = 3,
               name.group = c("capable","restrictive", "restrictive binary"),
               graph = FALSE)

plot.MFA(res_mfa, choix = "ind", invisible = c("quali","quali.sup"),
         lab.ind = FALSE,
         title = "MFA based on being capable and restrictiveness data")

res_hcpc <- HCPC(res_mfa, nb.clust = 3, graph = FALSE)
plot.HCPC(res_hcpc, choice = "map", draw.tree = FALSE, ind.names = FALSE,
          title = "Atomic habits - typology")
summary(res_hcpc$data.clust)
## End(Not run)
People think they need to make big changes to change the course of their lives. But in James Clear's book, Atomic Habits, they will discover that the smallest of changes, coupled with a good knowledge of psychology and neuroscience, can have a revolutionary effect on their lives and relationships. To understand this concept of atomic habits, we interviewed 167 people and asked them if they were able to never take their car alone again, to buy local products... We also asked them how restrictive they found this and why.
atomic_habit_clust
A data frame with 167 rows and 51 columns:
columns 1-10, do you feel able to...
columns 11-20, from 0 to 5 how restrictive...
columns 21-30, is it restrictive, yes or no...
columns 31-40, justify your answers
columns 41-50, a combination of able and restrictive
column 51, cluster variable based on MFA (first 20 variables)
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(FactoMineR)
library(NaileR)
data(atomic_habit_clust)

catdes(atomic_habit_clust, num.var = 51)
## End(Not run)
These data refer to 8 types of beards. Each beard was evaluated by 62 assessors (except beard 8 which only had 60 evaluations).
beard
A data frame with 494 rows and 2 columns:
the types of beards;
the words used to describe them.
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

data(beard)
beard[1:8,]
## End(Not run)
These data refer to 8 types of beards. Each beard was evaluated by 62 assessors (except beard 8 which only had 60 evaluations).
beard_cont
A contingency table (data frame) with 8 rows and 337 columns:
rows are the types of beards;
columns are the words used at least once to describe them.
Applied mathematics department, Institut Agro Rennes-Angers
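The layout described above (one row per beard, one column per word, counts in the cells) can be illustrated with base R. The following is only a sketch on made-up data: the toy values, the `;` separator and the object names `toy_long`, `toy_pairs` and `toy_cont` are assumptions for illustration, and this is not necessarily how `beard_cont` was actually derived from `beard`.

```r
# Toy long-format data (hypothetical values, not taken from `beard`):
# one row per evaluation, with free-text words separated by ";".
toy_long <- data.frame(
  beard = c("B1", "B1", "B2", "B2"),
  words = c("long; bushy", "bushy", "short; neat", "neat; short"),
  stringsAsFactors = FALSE
)

# Split each description into individual words and trim whitespace.
word_list <- lapply(strsplit(toy_long$words, ";", fixed = TRUE), trimws)

# Repeat each beard label once per word, then cross-tabulate.
toy_pairs <- data.frame(
  beard = rep(toy_long$beard, lengths(word_list)),
  word  = unlist(word_list),
  stringsAsFactors = FALSE
)
toy_cont <- as.data.frame.matrix(table(toy_pairs$beard, toy_pairs$word))

# Rows are beards, columns are words used at least once, cells are counts.
toy_cont
```

A table of this shape is what functions such as `FactoMineR::descfreq()` take as input, as in the example that follows.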
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(beard_cont)

FactoMineR::descfreq(beard_cont)

intro_beard <- 'A survey was conducted about beards and 8 types of beards were described. In the data that follow, beards are named B1 to B8.'
intro_beard <- gsub('\n', ' ', intro_beard) |> stringr::str_squish()

req_beard <- 'Please give a name to each beard and summarize what makes this beard unique.'
req_beard <- gsub('\n', ' ', req_beard) |> stringr::str_squish()

res_beard <- nail_descfreq(beard_cont, introduction = intro_beard,
                           request = req_beard)
cat(res_beard$response)
## End(Not run)
These data refer to 8 types of beards. They come from a subset of the original "beard" dataset. Each beard was evaluated by 62 assessors (except beard 8 which only had 60 evaluations).
beard_wide
A data frame with 8 rows and 24 columns:
rows are the types of beards;
columns are the assessors' opinions.
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(beard_wide)

intro_beard <- "As a barber, you make recommendations based on consumers comments. Examples of consumers descriptions of beards are as follows."
intro_beard <- gsub('\n', ' ', intro_beard) |> stringr::str_squish()

res <- nail_sort(beard_wide[,1:5], name_size = 3, stimulus_id = "beard",
                 introduction = intro_beard,
                 measure = 'the description was')

res$dta_sort
cat(res$prompt_llm[[1]])
## End(Not run)
These data were collected after a Q-method-like survey on participants' perception of an "ideal boss". Participants had to rank how much they agreed with 30 statements about an ideal boss; then, they were asked personal questions.
boss
A data frame with 73 rows (participants) and 39 columns (questions):
columns 1-30: statements about the ideal boss
columns 31-39: personal information
Florian LECLERE and Marianne ANDRE, students at l'Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(FactoMineR)
library(NaileR)
data(boss)

res_mca_boss <- MCA(boss, quali.sup = 31:39, ncp = 30,
                    level.ventil = 0.05, graph = FALSE)
res_hcpc_boss <- HCPC(res_mca_boss, nb.clust = 4, graph = FALSE)
don_clust_boss <- res_hcpc_boss$data.clust

intro_boss <- 'A study on "the ideal boss" was led on 73 participants. The study had 2 parts. In the first part, participants were given statements about the ideal boss (starting with "My ideal boss..."). They had to rate, on a scale from 1 to 5, how much they agreed with the statements; 1 being "Strongly disagree", 3 being "neutral" and 5 being "Strongly agree". In the second part, they were asked for personal information: work experience, age, etc. Participants were then split into groups based on their answers.'
intro_boss <- gsub('\n', ' ', intro_boss) |> stringr::str_squish()

req_boss <- "Please describe, for each group, their ideal boss. Then, give each group a new name, based on your conclusions."
req_boss <- gsub('\n', ' ', req_boss) |> stringr::str_squish()

# generate = TRUE is required to obtain a response:
# with the default (generate = FALSE), only the prompt is returned.
res_boss <- nail_catdes(don_clust_boss, num.var = 40,
                        introduction = intro_boss, request = req_boss,
                        isolate.groups = FALSE, drop.negative = TRUE,
                        generate = TRUE)
res_boss$response |> cat()
## End(Not run)
People think they need to make big changes to change the course of their lives. But in James Clear's book, Atomic Habits, they will discover that the smallest of changes, coupled with a good knowledge of psychology and neuroscience, can have a revolutionary effect on their lives and relationships. To understand this concept of atomic habits, we interviewed 167 people and asked them if they were able to never take their car alone again, to buy local products... We also asked them how restrictive they found this and why.
car_alone
column 1, a combination of being able and feeling restrictive
column 2, justify your answer
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(FactoMineR)
library(NaileR)
library(dplyr)
data(car_alone)

sampled_car_alone <- car_alone %>%
  group_by(car_alone_capable_restrictive) %>%
  sample_frac(0.5)
sampled_car_alone <- as.data.frame(sampled_car_alone)

intro_car <- "Knowing the impact on the climate, I have made these choices based on the following benefits and constraints..."
intro_car <- gsub('\n', ' ', intro_car) |> stringr::str_squish()

res_nail_textual <- nail_textual(sampled_car_alone, num.var = 1,
                                 num.text = 2, introduction = intro_car,
                                 request = NULL, model = 'llama3',
                                 isolate.groups = TRUE, generate = TRUE)

res_nail_textual[[1]]$response |> cat()
res_nail_textual[[3]]$response |> cat()
res_nail_textual[[2]]$response |> cat()
res_nail_textual[[4]]$response |> cat()
## End(Not run)
Compute a distance matrix between randomly-generated responses to an LLM prompt.
dist_mat_llm(ppt, n, per_miss = 0)
ppt | an LLM prompt. |
n | the number of responses to be generated. |
per_miss | the proportion of missing values in the final matrix (between 0 and 1; 0 by default). |
The final percentage of missing values might differ from the per_miss parameter value; rather than a percentage of values being turned to NA, each value has a per_miss probability of being NA.
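To see why the realised proportion can differ from `per_miss`, here is a toy base R sketch. It uses a stand-in distance matrix built from random numbers rather than LLM responses, and the names `toy_dist`, `idx` and `is_missing` are hypothetical; it illustrates independent per-value masking in general and is not the package's internal code.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

n <- 5           # number of responses, as in the `n` argument
per_miss <- 0.3  # requested proportion of missing values

# Stand-in symmetric distance matrix (random numbers, not LLM distances).
toy_dist <- as.matrix(dist(matrix(rnorm(n * 2), nrow = n)))

# Each pair of responses is set to NA independently with probability per_miss.
idx <- which(upper.tri(toy_dist), arr.ind = TRUE)
is_missing <- runif(nrow(idx)) < per_miss
toy_dist[idx[is_missing, , drop = FALSE]] <- NA
toy_dist[idx[is_missing, 2:1, drop = FALSE]] <- NA  # mirror to stay symmetric

# Realised proportion of missing pairs: it varies from run to run
# around per_miss instead of matching it exactly.
realised <- mean(is_missing)
realised
```

Because each value is drawn independently, small matrices in particular can end up with noticeably more or fewer missing values than requested.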
A list containing:
a list of the LLM results for each iteration;
a distance matrix.
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

data(iris)

intro_iris <- "A study measured various parts of iris flowers from 3 different species: setosa, versicolor and virginica. I will give you the results from this study. You will have to identify what sets these flowers apart."
intro_iris <- gsub('\n', ' ', intro_iris) |> stringr::str_squish()

req_iris <- "Please explain what makes each species distinct. Also, tell me which species has the biggest flowers, and which species has the smallest."
req_iris <- gsub('\n', ' ', req_iris) |> stringr::str_squish()

res_iris <- nail_catdes(iris, num.var = 5,
                        introduction = intro_iris, request = req_iris)

dist_mat_llm(res_iris$prompt, n = 5, per_miss = 0)
## End(Not run)
Compute distances between an LLM response of interest and some other responses to the same prompt.
dist_ref_llm(ppt, ref, n)
ppt | an LLM prompt. |
ref | the reference response. |
n | the number of new responses to be generated. |
A list containing:
a list with the newly-generated responses;
a vector of distances to the reference response.
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

data(iris)

intro_iris <- "A study measured various parts of iris flowers from 3 different species: setosa, versicolor and virginica. I will give you the results from this study. You will have to identify what sets these flowers apart."
intro_iris <- gsub('\n', ' ', intro_iris) |> stringr::str_squish()

req_iris <- "Please explain what makes each species distinct. Also, tell me which species has the biggest flowers, and which species has the smallest."
req_iris <- gsub('\n', ' ', req_iris) |> stringr::str_squish()

# generate = TRUE is required so that res_iris$response exists:
# with the default (generate = FALSE), only the prompt is returned.
res_iris <- nail_catdes(iris, num.var = 5,
                        introduction = intro_iris, request = req_iris,
                        generate = TRUE)

dist_ref_llm(res_iris$prompt, res_iris$response, n = 5)
## End(Not run)
This dataset was initially collected to understand the free jar data.
fabric
A data frame with 567 rows and 4 columns:
The ID of the judge
The product
The reason why the product was liked or disliked
0 if the product was disliked, 1 otherwise
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(fabric)

intro_car <- "For this consumer study, a car seat fabric was evaluated by consumers. Some of them didn't like it (group '0'), others liked it (group '1'). The consumers gave their reasons for disliking or liking the fabric."
intro_car <- gsub('\n', ' ', intro_car) |> stringr::str_squish()

request_car <- "Based on the comments provided by the consumers, please explain the reasons why the fabric was not appreciated for group '0', and the reasons why the fabric was appreciated for group '1'. In other words, what are the drivers for disliking and liking this fabric."
request_car <- gsub('\n', ' ', request_car) |> stringr::str_squish()

fabric_A <- droplevels(fabric[fabric$Fabric=="A",])

# First call: build the prompt only (generate = FALSE).
res_nail_textual_fabric <- nail_textual(fabric_A, num.var = 4, num.text = 3,
                                        introduction = intro_car,
                                        request = request_car,
                                        model = 'llama3',
                                        isolate.groups = FALSE,
                                        generate = FALSE)
cat(res_nail_textual_fabric$prompt)

# Second call: also generate the LLM response (generate = TRUE).
res_nail_textual_fabric <- nail_textual(fabric_A, num.var = 4, num.text = 3,
                                        introduction = intro_car,
                                        request = request_car,
                                        model = 'llama3',
                                        isolate.groups = FALSE,
                                        generate = TRUE)
cat(res_nail_textual_fabric$response)
## End(Not run)
These data were collected after a Q-method-like survey on participants' feelings about speaking in public. Participants had to rank how much they agreed with 25 descriptions of speaking in public; then, they were asked personal questions.
glossophobia
A data frame with 139 rows (participants) and 41 columns (questions):
columns 1-25: descriptions of speaking in public
columns 26-41: personal information
Elina BIAU and Théo LEDAIN, students at l'Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(glossophobia)

res_mca_phobia <- FactoMineR::MCA(glossophobia, quali.sup = 26:41,
                                  level.ventil = 0.05, graph = FALSE)
phobia_work <- res_mca_phobia$ind$coord |> as.data.frame()
phobia_work <- phobia_work[,1] |> cbind(glossophobia)

intro_phobia <- "These data were collected after a survey on participants' feelings about speaking in public. Participants had to rank how much they agreed with 25 descriptions of speaking in public; then, they were asked personal questions."
intro_phobia <- gsub('\n', ' ', intro_phobia) |> stringr::str_squish()

res_phobia <- nail_condes(phobia_work, num.var = 1,
                          introduction = intro_phobia)
cat(res_phobia$response)
## End(Not run)
These data were collected after a Q-method-like survey on sustainable food systems. Participants had to rank how acceptable they found 45 statements about a sustainable food system; then, they were asked if they agreed with 14 other statements.
local_food
A data frame with 573 rows (participants) and 63 columns (questions):
columns 1-45: statements about food systems
columns 46-59: opinions
columns 60-63: personal information
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(FactoMineR)
library(NaileR)
data(local_food)

res_mca_food <- MCA(local_food, quali.sup = 46:63, ncp = 100,
                    level.ventil = 0.05, graph = FALSE)
res_hcpc_food <- HCPC(res_mca_food, nb.clust = 3, graph = FALSE)
don_clust_food <- res_hcpc_food$data.clust

intro_food <- 'A study on sustainable food systems was led on several French participants. This study had 2 parts. In the first part, participants had to rate how acceptable "a food system that..." (e.g, "a food system that only uses renewable energy") was to them. In the second part, they had to say if they agreed or disagreed with some statements.'
intro_food <- gsub('\n', ' ', intro_food) |> stringr::str_squish()

req_food <- 'I will give you the answers from one group. Please explain who the individuals of this group are, what their beliefs are. Then, give this group a new name, and explain why you chose this name. Do not use 1st person ("I", "my"...) in your answer.'
req_food <- gsub('\n', ' ', req_food) |> stringr::str_squish()

# generate = TRUE is required to obtain a response:
# with the default (generate = FALSE), only the prompt is returned.
res_food <- nail_catdes(don_clust_food, num.var = 64,
                        introduction = intro_food, request = req_food,
                        isolate.groups = TRUE, drop.negative = TRUE,
                        generate = TRUE)
res_food[[1]]$response |> cat()
## End(Not run)
Generate an LLM response to analyze a categorical latent variable.
nail_catdes(
  dataset,
  num.var,
  introduction = NULL,
  request = NULL,
  model = "llama3",
  isolate.groups = FALSE,
  drop.negative = FALSE,
  proba = 0.05,
  row.w = NULL,
  generate = FALSE
)
dataset | a data frame made up of at least one categorical variable and a set of quantitative variables and/or categorical variables. |
num.var | the index of the variable to be characterized. |
introduction | the introduction for the LLM prompt. |
request | the request made to the LLM. |
model | the model name ('llama3' by default). |
isolate.groups | a boolean that indicates whether to give the LLM a single prompt, or one prompt per category. Recommended with long catdes results. |
drop.negative | a boolean that indicates whether to drop negative v.test values for interpretation (keeping only positive v.tests). Recommended with long catdes results. |
proba | the significance threshold considered to characterize the categories (by default 0.05). |
row.w | an optional vector of integer row weights (by default, a vector of 1s for uniform row weights). |
generate | a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
This function sends a prompt directly to an LLM. Therefore, to get a consistent answer, we highly recommend customizing the introduction and request parameters and adding all relevant information about your data for the LLM. We also recommend renaming the columns with clear, unabbreviated and unambiguous names.
Additionally, if isolate.groups = TRUE, you will need an introduction and a request that take into account the fact that only one group is analyzed at a time.
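As an illustration of the column-renaming advice, the sketch below relabels the built-in iris data in base R before any call to the function. The labels and the object name `iris_clear` are hypothetical; adapt the wording to your own data.

```r
data(iris)
iris_clear <- iris

# Hypothetical descriptive labels replacing the abbreviated originals
# (iris measurements are recorded in centimetres).
names(iris_clear) <- c(
  "Sepal length (cm)",
  "Sepal width (cm)",
  "Petal length (cm)",
  "Petal width (cm)",
  "Species"
)

# Renaming does not move columns: Species is still column 5,
# so num.var = 5 would still designate it.
str(iris_clear)
```

The renamed data frame would then be passed as `dataset` in place of the original; only the labels the LLM sees in the prompt are meant to change.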
A data frame, or a list of data frames, containing the LLM's prompt and response (if generate = TRUE).
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

### Example 1: Fisher's iris ###

library(NaileR)
data(iris)

intro_iris <- "A study measured various parts of iris flowers from 3 different species: setosa, versicolor and virginica. I will give you the results from this study. You will have to identify what sets these flowers apart."
intro_iris <- gsub('\n', ' ', intro_iris) |> stringr::str_squish()

req_iris <- "Please explain what makes each species distinct. Also, tell me which species has the biggest flowers, and which species has the smallest."
req_iris <- gsub('\n', ' ', req_iris) |> stringr::str_squish()

res_iris <- nail_catdes(iris, num.var = 5,
                        introduction = intro_iris, request = req_iris,
                        generate = TRUE)
cat(res_iris$response)

### Example 2: food waste dataset ###

library(FactoMineR)
data(waste)
waste <- waste[-14]    # no variability on this question

set.seed(1)
res_mca_waste <- MCA(waste, quali.sup = c(1,2,50:76), ncp = 35,
                     level.ventil = 0.05, graph = FALSE)
plot.MCA(res_mca_waste, choix = "ind",
         invisible = c("var", "quali.sup"), label = "none")
res_hcpc_waste <- HCPC(res_mca_waste, nb.clust = 3, graph = FALSE)
plot.HCPC(res_hcpc_waste, choice = "map", draw.tree = FALSE,
          ind.names = FALSE)
don_clust_waste <- res_hcpc_waste$data.clust

intro_waste <- 'These data were collected after a survey on food waste, with participants describing their habits.'
intro_waste <- gsub('\n', ' ', intro_waste) |> stringr::str_squish()

req_waste <- 'Please summarize the characteristics of each group. Then, give each group a new name, based on your conclusions. Finally, give each group a grade between 0 and 10, based on how wasteful they are with food: 0 being "not at all", 10 being "absolutely".'
req_waste <- gsub('\n', ' ', req_waste) |> stringr::str_squish()

res_waste <- nail_catdes(don_clust_waste,
                         num.var = ncol(don_clust_waste),
                         introduction = intro_waste, request = req_waste,
                         drop.negative = TRUE, generate = TRUE)
cat(res_waste$response)

### Example 3: local_food dataset ###

data(local_food)

set.seed(1)
res_mca_food <- MCA(local_food, quali.sup = 46:63, ncp = 100,
                    level.ventil = 0.05, graph = FALSE)
plot.MCA(res_mca_food, choix = "ind",
         invisible = c("var", "quali.sup"), label = "none")
res_hcpc_food <- HCPC(res_mca_food, nb.clust = 3, graph = FALSE)
plot.HCPC(res_hcpc_food, choice = "map", draw.tree = FALSE,
          ind.names = FALSE)
don_clust_food <- res_hcpc_food$data.clust

intro_food <- 'A study on sustainable food systems was led on several French participants. This study had 2 parts. In the first part, participants had to rate how acceptable "a food system that..." (e.g, "a food system that only uses renewable energy") was to them. In the second part, they had to say if they agreed or disagreed with some statements.'
intro_food <- gsub('\n', ' ', intro_food) |> stringr::str_squish()

req_food <- 'I will give you the answers from one group. Please explain who the individuals of this group are, what their beliefs are. Then, give this group a new name, and explain why you chose this name. Do not use 1st person ("I", "my"...) in your answer.'
req_food <- gsub('\n', ' ', req_food) |> stringr::str_squish()

res_food <- nail_catdes(don_clust_food, num.var = 64,
                        introduction = intro_food, request = req_food,
                        isolate.groups = TRUE, drop.negative = TRUE,
                        generate = TRUE)

res_food[[1]]$response |> cat()
res_food[[2]]$response |> cat()
res_food[[3]]$response |> cat()
## End(Not run)
Generate an LLM response to analyze a continuous latent variable.
nail_condes( dataset, num.var, introduction = NULL, request = NULL, model = "llama3", quanti.threshold = 0, quanti.cat = c("Significantly above average", "Significantly below average", "Average"), weights = NULL, proba = 0.05, generate = FALSE )
dataset |
a data frame made up of at least one quantitative variable and a set of quantitative variables and/or categorical variables. |
num.var |
the index of the variable to be characterized. |
introduction |
the introduction for the LLM prompt. |
request |
the request made to the LLM. |
model |
the model name ('llama3' by default). |
quanti.threshold |
the threshold above (resp. below) which a scaled variable is considered significantly above (resp. below) the average. Used when converting continuous variables to categorical ones. |
quanti.cat |
a vector of the 3 possible categories for continuous variables converted to categorical ones according to the threshold. Default is "Significantly above average", "Significantly below average" and "Average". |
weights |
weights for the individuals (see |
proba |
the significance threshold considered to characterize the category (by default 0.05). |
generate |
a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
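The conversion controlled by quanti.threshold and quanti.cat can be pictured with a minimal base-R sketch. The helper categorize_scaled() below is hypothetical and not part of NaileR; it assumes that a variable is standardized first and that values beyond plus or minus the threshold receive the first two labels, which may differ from the package's internal rule.

```r
# Hypothetical helper (not a NaileR function): illustrates the idea only.
categorize_scaled <- function(x, threshold = 0,
                              labels = c("Significantly above average",
                                         "Significantly below average",
                                         "Average")) {
  z <- as.numeric(scale(x))          # center and scale to unit variance
  out <- rep(labels[3], length(z))   # start with the "average" label
  out[z > threshold] <- labels[1]    # assumed rule: above +threshold
  out[z < -threshold] <- labels[2]   # assumed rule: below -threshold
  factor(out, levels = labels)
}

cats <- categorize_scaled(c(1, 2, 3, 4, 10), threshold = 1)
table(cats)
```

A larger threshold sends more individuals to the third category, so the prompt would mention fewer, more extreme characteristics.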
This function directly sends a prompt to an LLM. Therefore, to get a consistent answer, we highly recommend customizing the introduction and request parameters and adding all relevant information about your data for the LLM. We also recommend renaming the columns with clear, unabbreviated and unambiguous names.
A data frame containing the LLM's prompt and response (if generate = TRUE).
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

### Example 1: decathlon dataset ###
library(FactoMineR)
data(decathlon)
names(decathlon) <- c('Time taken to complete the 100m',
                      'Distance reached for the long jump',
                      'Distance reached for the shot put',
                      'Height reached for the high jump',
                      'Time taken to complete the 400m',
                      'Time taken to complete the 110m hurdle',
                      'Distance reached for the discus',
                      'Height reached for the pole vault',
                      'Distance reached for the javeline',
                      'Time taken to complete the 1500 m',
                      'Rank/Counter-performance indicator',
                      'Points', 'Competition')

res_pca_deca <- FactoMineR::PCA(decathlon, quanti.sup = 11:12,
                                quali.sup = 13, graph = FALSE)
plot.PCA(res_pca_deca, choix = 'var')
deca_work <- res_pca_deca$ind$coord |> as.data.frame()
deca_work <- deca_work[,1] |> cbind(decathlon)

intro_deca <- "A study was led on athletes participating in a decathlon event. Their performance was assessed on each part of the decathlon, and they were all placed on an unidimensional scale."
intro_deca <- gsub('\n', ' ', intro_deca) |> stringr::str_squish()

res_deca <- nail_condes(deca_work, num.var = 1,
                        quanti.threshold = 1,
                        quanti.cat = c('High', 'Low', 'Average'),
                        introduction = intro_deca,
                        generate = TRUE)
cat(res_deca$response)

### Example 2: agri_studies dataset ###
data(agri_studies)

set.seed(1)
res_mca_agri <- FactoMineR::MCA(agri_studies, quali.sup = 39:42,
                                level.ventil = 0.05, graph = FALSE)
plot.MCA(res_mca_agri, choix = 'ind',
         invisible = c('var', 'quali.sup'), label = 'none')
agri_work <- res_mca_agri$ind$coord |> as.data.frame()
agri_work <- agri_work[,1] |> cbind(agri_studies)

intro_agri <- "These data were collected after a survey on students' expectations of agribusiness studies. Participants had to rank how much they agreed with 38 statements about possible benefits from agribusiness studies; then, they were asked personal questions."
intro_agri <- gsub('\n', ' ', intro_agri) |> stringr::str_squish()

res_agri <- nail_condes(agri_work, num.var = 1,
                        introduction = intro_agri,
                        generate = TRUE)
cat(res_agri$response)

### Example 3: glossophobia dataset ###
data(glossophobia)

set.seed(1)
res_mca_phobia <- FactoMineR::MCA(glossophobia, quali.sup = 26:41,
                                  level.ventil = 0.05, graph = FALSE)
plot.MCA(res_mca_phobia, choix = 'ind',
         invisible = c('var', 'quali.sup'), label = 'none')
phobia_work <- res_mca_phobia$ind$coord |> as.data.frame()
phobia_work <- phobia_work[,1] |> cbind(glossophobia)

intro_phobia <- "These data were collected after a survey on participants' feelings about speaking in public. Participants had to rank how much they agreed with 25 descriptions of speaking in public; then, they were asked personal questions."
intro_phobia <- gsub('\n', ' ', intro_phobia) |> stringr::str_squish()

res_phobia <- nail_condes(phobia_work, num.var = 1,
                          introduction = intro_phobia,
                          generate = TRUE)
cat(res_phobia$response)

### Example 4: beard_cont dataset ###
data(beard_cont)

set.seed(1)
res_ca_beard <- FactoMineR::CA(beard_cont, graph = FALSE)
plot.CA(res_ca_beard, invisible = 'col')
beard_work <- res_ca_beard$row$coord |> as.data.frame()
beard_work <- beard_work[,1] |> cbind(beard_cont)

intro_beard <- "These data refer to 8 types of beards. Each beard was evaluated by 62 assessors."
intro_beard <- gsub('\n', ' ', intro_beard) |> stringr::str_squish()

req_beard <- "Please explain what differentiates beards on both sides of the scale. Then, give the scale a name."
req_beard <- gsub('\n', ' ', req_beard) |> stringr::str_squish()

res_beard <- nail_condes(beard_work, num.var = 1,
                         quanti.threshold = 0.5,
                         quanti.cat = c('Very often used', 'Never used', 'Sometimes used'),
                         introduction = intro_beard,
                         request = req_beard)
res_beard

ppt <- stringr::str_replace_all(res_beard, 'observations', 'beards')
cat(ppt)

res_beard <- ollamar::generate(model = 'llama3', prompt = ppt, output = 'text')
cat(res_beard)
## End(Not run)
Describes the rows of a contingency table. For each row, this description is based on the columns of the contingency table that are significantly related to it.
nail_descfreq( dataset, introduction = NULL, request = NULL, model = "llama3", isolate.groups = FALSE, by.quali = NULL, proba = 0.05, generate = FALSE )
dataset |
a data frame corresponding to a contingency table. |
introduction |
the introduction for the LLM prompt. |
request |
the request made to the LLM. |
model |
the model name ('llama3' by default). |
isolate.groups |
a boolean that indicates whether to give the LLM a single prompt (FALSE), or one prompt per row (TRUE). TRUE is recommended if the contingency table has a large number of rows. |
by.quali |
a factor used to merge the data from different rows of the contingency table; by default NULL and each row is characterized. |
proba |
the significance threshold considered to characterize the category (by default 0.05). |
generate |
a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
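To make the role of by.quali more concrete, here is a minimal base-R sketch of merging rows of a contingency table according to a factor. It assumes that merging means summing the counts of the rows sharing a level, which is an assumption about the package's behavior; the toy table and the grouping factor are invented for illustration.

```r
# Toy contingency table: 4 beards (rows) described by 2 words (columns).
tab <- data.frame(soft = c(10, 2, 7, 1),
                  long = c(3, 8, 4, 9),
                  row.names = c("B1", "B2", "B3", "B4"))

# Hypothetical grouping factor, one value per row of the table.
grp <- factor(c("short", "full", "short", "full"))

# Assumed merge rule: sum the counts of all rows sharing the same level.
merged <- rowsum(tab, group = grp)
merged
```

With such a factor, the description would then bear on the merged rows ("full", "short") rather than on each original row.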
This function directly sends a prompt to an LLM. Therefore, to get a consistent answer, we highly recommend customizing the introduction and request parameters and adding all relevant information about your data for the LLM.
Additionally, if isolate.groups = TRUE, you will need an introduction and a request that take into account the fact that only one group is analyzed at a time.
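The reason the introduction and request must be reworded when isolate.groups = TRUE can be seen in a small sketch of how prompts might be assembled. The function build_prompts() is hypothetical and does not reproduce NaileR's internal code; the "## Group" header is only modeled on the prompts shown in the examples.

```r
# Hypothetical prompt builder (not a NaileR function).
build_prompts <- function(descriptions, introduction, request,
                          isolate.groups = FALSE) {
  # One text block per group, introduced by a header.
  blocks <- paste0("## Group ", names(descriptions), "\n", descriptions)
  if (isolate.groups) {
    # One prompt per group: each prompt only ever sees a single block.
    lapply(blocks, function(b) paste(introduction, b, request, sep = "\n\n"))
  } else {
    # A single prompt containing every block.
    paste(introduction, paste(blocks, collapse = "\n\n"), request, sep = "\n\n")
  }
}

desc <- c(B1 = "soft, short", B2 = "long, bushy")
single   <- build_prompts(desc, "Eight beards were described.",
                          "Please name each beard.")
isolated <- build_prompts(desc, "I will give you one beard.",
                          "Please name this beard.", isolate.groups = TRUE)
length(isolated)
```

In the isolated case each prompt contains one group only, so a request such as "compare the groups" could not be answered.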
A data frame, or a list of data frames, containing the LLM's prompt and response (if generate = TRUE).
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

### Example 1: beard dataset ###
data(beard_cont)

intro_beard_iso <- 'A survey was conducted about beards and 8 types of beards were described. I will give you the results for one type of beard.'
intro_beard_iso <- gsub('\n', ' ', intro_beard_iso) |> stringr::str_squish()

req_beard_iso <- 'Please give a name to this beard and summarize what makes this beard unique.'
req_beard_iso <- gsub('\n', ' ', req_beard_iso) |> stringr::str_squish()

res_beard <- nail_descfreq(beard_cont,
                           introduction = intro_beard_iso,
                           request = req_beard_iso,
                           isolate.groups = TRUE,
                           generate = FALSE)
res_beard[[1]]
res_beard[[2]]

intro_beard <- 'A survey was conducted about beards and 8 types of beards were described. In the data that follow, beards are named B1 to B8.'
intro_beard <- gsub('\n', ' ', intro_beard) |> stringr::str_squish()

req_beard <- 'Please give a name to each beard and summarize what makes this beard unique.'
req_beard <- gsub('\n', ' ', req_beard) |> stringr::str_squish()

res_beard <- nail_descfreq(beard_cont,
                           introduction = intro_beard,
                           request = req_beard,
                           generate = TRUE)
cat(res_beard$response)

text <- res_beard$response
titles <- stringr::str_extract_all(text, "\\*\\*B[0-9]+: [^\\*\\*]+\\*\\*")[[1]]
titles

# for the following code to work, the response must have the beards'
# new names with this format: **B1: The Nice beard**, etc.
titles <- stringr::str_replace_all(titles, "\\*\\*", "") # remove asterisks
names <- stringr::str_extract(titles, ": .+")
names <- stringr::str_replace_all(names, ": ", "") # remove the colon and space
rownames(beard_cont) <- names

library(FactoMineR)
res_ca_beard <- CA(beard_cont, graph = F)
plot.CA(res_ca_beard, invisible = "col")

### Example 2: children dataset ###
data(children)
children <- children[1:14, 1:5] |> t() |> as.data.frame()
rownames(children) <- c('No education', 'Elementary school',
                        'Middle school', 'High school', 'University')

intro_children <- 'The data used here is a contingency table that summarizes the answers given by different categories of people to the following question: "according to you, what are the reasons that can make a woman of a couple hesitate to have children?". Each row corresponds to a level of education, and columns are reasons.'
intro_children <- gsub('\n', ' ', intro_children) |> stringr::str_squish()

req_children <- "Please explain the main differences between more educated and less educated couples, when it comes to hesitating to have children."
req_children <- gsub('\n', ' ', req_children) |> stringr::str_squish()

res_children <- nail_descfreq(children,
                              introduction = intro_children,
                              request = req_children,
                              generate = TRUE)
cat(res_children$response)
## End(Not run)
Generate an LLM response to analyze QDA (quantitative descriptive analysis) data.
nail_qda( dataset, formul, firstvar, lastvar = length(colnames(dataset)), introduction = NULL, request = NULL, model = "llama3", isolate.groups = FALSE, drop.negative = FALSE, proba = 0.05, generate = FALSE )
dataset |
a data frame made up of at least two categorical variables (product, panelist) and a set of quantitative variables (sensory attributes). |
formul |
the analysis of variance model to be evaluated for each sensory attribute. |
firstvar |
the index of the first sensory attribute. |
lastvar |
the index of the last sensory attribute. |
introduction |
the introduction for the LLM prompt. |
request |
the request for the LLM prompt. |
model |
the model name ('llama3' by default). |
isolate.groups |
a boolean that indicates whether to give the LLM a single prompt (FALSE), or one prompt per product (TRUE). |
drop.negative |
a boolean that indicates whether to drop negative v.test values for interpretation (keeping only positive v.tests). |
proba |
the significance threshold considered to characterize the products (by default 0.05). |
generate |
a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
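As an illustration of how formul, firstvar and lastvar fit together, the sketch below fits the same analysis of variance model to each sensory attribute of a small simulated data set, using base R's aov(). This only mirrors the idea of "one model per attribute"; it is not NaileR's internal code, and the toy data and attribute names are invented.

```r
set.seed(1)

# Simulated QDA data: 3 products rated by 6 panelists on 2 attributes.
toy <- expand.grid(Product = c("A", "B", "C"),
                   Panelist = paste0("P", 1:6),
                   stringsAsFactors = TRUE)
product_effect <- c(A = 2, B = 5, C = 8)
toy$Sweet  <- unname(product_effect[as.character(toy$Product)]) +
  rnorm(nrow(toy), sd = 0.5)          # strong product effect by construction
toy$Bitter <- rnorm(nrow(toy))        # no product effect by construction

formul   <- "~Product+Panelist"       # right-hand side, as in nail_qda()
firstvar <- 3                         # index of the first sensory attribute
lastvar  <- ncol(toy)                 # index of the last sensory attribute

# Fit one ANOVA per attribute and keep the p-value of the Product effect.
p_product <- sapply(names(toy)[firstvar:lastvar], function(attr) {
  fit <- aov(as.formula(paste(attr, formul)), data = toy)
  summary(fit)[[1]][["Pr(>F)"]][1]    # first term of the model: Product
})
p_product
```

Attribute names containing spaces would need backticks before being pasted into a formula; the toy names avoid that complication.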
This function directly sends a prompt to an LLM. Therefore, to get a consistent answer, we highly recommend customizing the introduction and request parameters and adding all relevant information about your data for the LLM. We also recommend renaming the columns with clear, unabbreviated and unambiguous names.
Additionally, if isolate.groups = TRUE, you will need an introduction and a request that take into account the fact that only one group is analyzed at a time.
A data frame, or a list of data frames, containing the LLM's prompt and response (if generate = TRUE).
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

### Example 1: QDA data on chocolates with isolate.groups = FALSE ###
library(NaileR)
library(SensoMineR)
data(chocolates)

intro_sensochoc <- "Six chocolates were measured according to sensory attributes by a trained panel. I will give you the results from this study. You will have to identify what sets these chocolates apart."
intro_sensochoc <- gsub('\n', ' ', intro_sensochoc) |> stringr::str_squish()

req_sensochoc <- "Please explain what makes each chocolate different and provide a sensory profile of each chocolate, as well as a name."
req_sensochoc <- gsub('\n', ' ', req_sensochoc) |> stringr::str_squish()

res_nail_qda <- nail_qda(sensochoc,
                         formul="~Product+Panelist",
                         firstvar = 5,
                         introduction = intro_sensochoc,
                         request = req_sensochoc,
                         model = 'llama3',
                         isolate.groups = FALSE,
                         drop.negative = FALSE,
                         proba = 0.05,
                         generate = TRUE)
cat(res_nail_qda$prompt)
cat(res_nail_qda$response)

### Example 2: QDA data on chocolates with isolate.groups = TRUE ###
library(NaileR)
library(SensoMineR)
data(chocolates)

intro_sensochoc <- "A chocolate was measured according to sensory attributes by a trained panel. I will give you the results from this study. You will have to identify the characteristics of this chocolate."
intro_sensochoc <- gsub('\n', ' ', intro_sensochoc) |> stringr::str_squish()

req_sensochoc <- "Please provide a detailed sensory profile for this chocolate, as well as a name."
req_sensochoc <- gsub('\n', ' ', req_sensochoc) |> stringr::str_squish()

res_nail_qda <- nail_qda(sensochoc,
                         formul="~Product+Panelist",
                         firstvar = 5,
                         introduction = intro_sensochoc,
                         request = req_sensochoc,
                         model = 'llama3',
                         isolate.groups = TRUE,
                         drop.negative = FALSE,
                         proba = 0.05,
                         generate = TRUE)
cat(res_nail_qda[[1]]$prompt)
cat(res_nail_qda[[1]]$response)
## End(Not run)
Group textual data according to their similarity, in a context in which the assessors have commented on a set of stimuli.
nail_sort( dataset, name_size = 3, stimulus_id = "stimulus", introduction = "", measure = "", nb_max = 6, generate = FALSE )
dataset |
a data frame where each row is a stimulus and each column is an assessor. |
name_size |
the maximum number of words in a group name created by the LLM. |
stimulus_id |
the nature of the stimulus. Customizing it is highly recommended. |
introduction |
the introduction to the LLM prompt. |
measure |
the type of measure used in the experiment. |
nb_max |
the maximum number of clusters the LLM can form per assessor. |
generate |
a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
This function uses a while loop to ensure that the LLM gives the right number of groups. Therefore, customizing the stimulus ID, prompt introduction and measure is highly recommended; a clear prompt can help the LLM finish its task faster.
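The retry logic mentioned above might look roughly like the following sketch. Everything in it is hypothetical: sort_with_retry() and fake_llm() are illustrative stand-ins, the validity rule (one group name per stimulus, at most nb_max distinct groups) is an assumption about what "the right number of groups" means, and the max_tries safeguard is an addition of this sketch.

```r
# Hypothetical retry loop (not NaileR's code): ask again until the answer
# has one group name per stimulus and no more than nb_max distinct groups.
sort_with_retry <- function(ask, prompt, n_stimuli, nb_max = 6, max_tries = 10) {
  tries <- 0
  groups <- character(0)
  valid <- FALSE
  while (!valid && tries < max_tries) {
    tries <- tries + 1
    groups <- ask(prompt)
    valid <- length(groups) == n_stimuli &&
      length(unique(groups)) <= nb_max
  }
  if (!valid) stop("no valid answer after ", max_tries, " tries")
  list(groups = groups, tries = tries)
}

# Stand-in for an LLM call: the first answer is too short, the second is valid.
fake_llm <- local({
  n_calls <- 0
  function(prompt) {
    n_calls <<- n_calls + 1
    if (n_calls == 1) c("soft", "long") else c("soft", "long", "soft", "neat")
  }
})

res <- sort_with_retry(fake_llm, "Sort these 4 beards.", n_stimuli = 4)
res$tries
```

Each invalid answer costs one more model call, which is why a clear prompt tends to shorten the total processing time.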
A list consisting of:
a list of prompts (one per assessor);
a data frame with the group names.
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(beard_wide)

intro_beard <- "As a barber, you make recommendations based on consumers comments. Examples of consumers descriptions of beards are as follows."
intro_beard <- gsub('\n', ' ', intro_beard) |> stringr::str_squish()

res <- nail_sort(beard_wide[,1:5], name_size = 3,
                 stimulus_id = "beard",
                 introduction = intro_beard,
                 measure = 'the description was',
                 generate = TRUE)
res$dta_sort
cat(res$prompt_llm[[1]])
## End(Not run)
Generate an LLM response to analyze a categorical latent variable, based on answers to open-ended questions.
nail_textual( dataset, num.var, num.text, introduction = NULL, request = NULL, model = "llama3", isolate.groups = TRUE, generate = FALSE )
dataset |
a data frame made up of at least one categorical variable and a textual variable. |
num.var |
the index of the categorical variable to be characterized. |
num.text |
the index of the textual variable that characterizes the categorical variable of interest. |
introduction |
the introduction for the LLM prompt. |
request |
the request made to the LLM. |
model |
the model name ('llama3' by default). |
isolate.groups |
a boolean that indicates whether to give the LLM a single prompt (FALSE), or one prompt per category (TRUE, the default). TRUE is recommended with long catdes results. |
generate |
a boolean that indicates whether to generate the LLM response. If FALSE, the function only returns the prompt. |
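The relationship between num.var and num.text can be sketched in base R: the textual column is split according to the categorical column, and each category gets its own block of comments. This is only an illustration of the idea with invented data; the "## Group" header is modeled on the prompts shown in the examples, and the exact layout produced by nail_textual() may differ.

```r
# Toy data: a categorical variable (column 1) and a textual variable (column 2).
toy <- data.frame(group = factor(c("0", "1", "0", "1")),
                  comment = c("too rough", "soft touch",
                              "dull colour", "nice pattern"))
num.var  <- 1
num.text <- 2

# Split the comments by category.
by_group <- split(toy[[num.text]], toy[[num.var]])

# Build one text block per category (assumed layout).
blocks <- vapply(names(by_group), function(g) {
  paste0("## Group ", g, "\n", paste("-", by_group[[g]], collapse = "\n"))
}, character(1))

cat(blocks, sep = "\n\n")
```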
This function directly sends a prompt to an LLM. Therefore, to get a consistent answer, we highly recommend customizing the introduction and request parameters and adding all relevant information about your data for the LLM. We also recommend renaming the columns with clear, unabbreviated and unambiguous names.
Additionally, if isolate.groups = TRUE, you will need an introduction and a request that take into account the fact that only one group is analyzed at a time.
A data frame, or a list of data frames, containing the LLM's prompt and response (if generate = TRUE).
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

### Example 1: Car alone survey ###
library(NaileR)
library(dplyr)
data(car_alone)

sampled_car_alone <- car_alone %>%
  group_by(car_alone_capable_restrictive) %>%
  dplyr::sample_frac(0.5)
sampled_car_alone <- as.data.frame(sampled_car_alone)

intro_car <- "Knowing the impact on the climate, I have made these choices based on the following benefits and constraints..."
intro_car <- gsub('\n', ' ', intro_car) |> stringr::str_squish()

res_nail_textual <- nail_textual(sampled_car_alone, num.var = 1,
                                 num.text = 2,
                                 introduction = intro_car,
                                 request = NULL,
                                 model = 'llama3',
                                 isolate.groups = TRUE,
                                 generate = TRUE)
res_nail_textual[[1]]$response |> cat()
res_nail_textual[[3]]$response |> cat()
res_nail_textual[[2]]$response |> cat()
res_nail_textual[[4]]$response |> cat()

### Example 2: Atomic habits survey ###
library(NaileR)
library(dplyr)
data(atomic_habit_clust)

intro_atomic <- "These data were collected after a survey on atomic habits: we asked what people were prepared to change about their daily habits to make the world a better place, what habits they felt able to adopt, what habits were restrictive."
intro_atomic <- gsub('\n', ' ', intro_atomic) |> stringr::str_squish()

dta_plane <- atomic_habit_clust[,c(32,51)] %>%
  filter(never_plane_text != 'THAT')
sampled_dta_plane <- dta_plane %>%
  group_by(clust) %>%
  dplyr::sample_frac(0.75)
sampled_dta_plane <- as.data.frame(sampled_dta_plane)
summary(sampled_dta_plane)

res_nail_textual_plane <- nail_textual(sampled_dta_plane, num.var = 2,
                                       num.text = 1,
                                       introduction = intro_atomic,
                                       request = NULL,
                                       model = 'llama3',
                                       isolate.groups = TRUE,
                                       generate = TRUE)
cat(res_nail_textual_plane[[1]]$prompt)
cat(res_nail_textual_plane[[1]]$response)
cat(res_nail_textual_plane[[2]]$prompt)
cat(res_nail_textual_plane[[2]]$response)
cat(res_nail_textual_plane[[3]]$prompt)
cat(res_nail_textual_plane[[3]]$response)

res_nail_textual_plane <- nail_textual(sampled_dta_plane, num.var = 2,
                                       num.text = 1,
                                       introduction = intro_atomic,
                                       request = NULL,
                                       model = 'llama3',
                                       isolate.groups = FALSE,
                                       generate = TRUE)
cat(res_nail_textual_plane$prompt)
cat(res_nail_textual_plane$response)

### Example 3: Car seat fabrics ###

# Drivers of liking and disliking
# isolate.groups = F
intro_car <- "In this consumer study, a number of car seat fabrics were rated by consumers who gave their reasons for liking or disliking the fabrics. Reasons for disliking the fabrics were reported in group '0', while reasons for liking the fabrics were reported in group '1'."
intro_car <- gsub('\n', ' ', intro_car) |> stringr::str_squish()

request_car <- "Based on the comments provided by the consumers, please explain the reasons why the fabrics were not appreciated (group '0'), and the reasons why fabrics were appreciated (group '1'). In other words, what are the drivers for disliking and liking the fabrics."
request_car <- gsub('\n', ' ', request_car) |> stringr::str_squish()

res_nail_textual_fabric <- nail_textual(fabric, num.var = 4,
                                        num.text = 3,
                                        introduction = intro_car,
                                        request = request_car,
                                        model = 'llama3',
                                        isolate.groups = FALSE,
                                        generate = TRUE)
cat(res_nail_textual_fabric$response)

# Drivers of disliking with a specific prompt
# isolate.groups = T
intro_car_disliking <- "In this consumer study, a range of car seat fabrics were rated by consumers who gave their reasons for disliking the fabrics. In these data, only the reasons for disliking the fabrics were reported."
intro_car_disliking <- gsub('\n', ' ', intro_car_disliking) |> stringr::str_squish()

request_car_disliking <- "Based on the comments provided by the consumers, please explain the reasons why the fabrics were not appreciated. In other words, what are the drivers for disliking the fabrics."
request_car_disliking <- gsub('\n', ' ', request_car_disliking) |> stringr::str_squish()

res_nail_textual_fabric <- nail_textual(fabric, num.var = 4,
                                        num.text = 3,
                                        introduction = intro_car_disliking,
                                        request = request_car_disliking,
                                        model = 'llama3',
                                        isolate.groups = TRUE,
                                        generate = FALSE)
ppt <- res_nail_textual_fabric[1]
cat(ppt)
res_disliking <- ollamar::generate(model = 'llama3', prompt = ppt, output = "df")
cat(res_disliking$response)

# Drivers of liking with a specific prompt
# isolate.groups = T
intro_car_liking <- "In this consumer study, a range of car seat fabrics were rated by consumers who gave their reasons for liking the fabrics. In these data, only the reasons for liking the fabrics were reported."
intro_car_liking <- gsub('\n', ' ', intro_car_liking) |> stringr::str_squish()

request_car_liking <- "Based on the comments provided by the consumers, please explain the reasons why the fabrics were appreciated. In other words, what are the drivers for liking the fabrics."
request_car_liking <- gsub('\n', ' ', request_car_liking) |> stringr::str_squish()

res_nail_textual_fabric <- nail_textual(fabric, num.var = 4,
                                        num.text = 3,
                                        introduction = intro_car_liking,
                                        request = request_car_liking,
                                        model = 'llama3',
                                        isolate.groups = TRUE,
                                        generate = FALSE)
ppt <- res_nail_textual_fabric[2]
cat(ppt)
res_liking <- ollamar::generate(model = 'llama3', prompt = ppt, output = "df")
cat(res_liking$response)

### Example 4: Rorschach inkblots ###

# Description of each inkblot
# isolate.groups = TRUE
intro_rorschach <- "For this study, we asked sixty people to briefly describe one of the inkblots of the Rorschach test."
intro_rorschach <- gsub('\n', ' ', intro_rorschach) |> stringr::str_squish()

request_rorschach <- "Based on the comments of the 60 people, please give me a description of that inkblot in terms of how it was perceived. Tell me if it was a rather positive or negative perception."
request_rorschach <- gsub('\n', ' ', request_rorschach) |> stringr::str_squish()

res_nail_textual_rorschach <- nail_textual(rorschach, num.var = 2,
                                           num.text = 5,
                                           introduction = intro_rorschach,
                                           request = request_rorschach,
                                           model = 'llama3',
                                           isolate.groups = TRUE,
                                           generate = FALSE)

cat(res_nail_textual_rorschach[[10]])
ppt <- gsub("## Group", "## Stimulus", res_nail_textual_rorschach[[10]])
cat(ppt)
res_inkblot_10 <- ollamar::generate(model = 'llama3', prompt = ppt, output = "df")
cat(res_inkblot_10$response)

cat(res_nail_textual_rorschach[[5]])
ppt <- gsub("## Group", "## Stimulus", res_nail_textual_rorschach[[5]])
cat(ppt)
res_inkblot_5 <- ollamar::generate(model = 'llama3', prompt = ppt, output = "df")
cat(res_inkblot_5$response)

#Comparison of panels
rorschach_10 <- droplevels(rorschach[rorschach$Inkblot=="10",])

intro_rorschach <- "For this study, we asked sixty people to briefly describe one of the inkblots of the Rorschach test. The sixty people belonged to three different panels, with 20 people per panel."
intro_rorschach <- gsub('\n', ' ', intro_rorschach) |> stringr::str_squish() request_rorschach <- "Based on the comments of the 60 people, please tell me what is common from panel to panel and what is specific to each panel in terms of the perception of the inkblot." request_rorschach <- gsub('\n', ' ', request_rorschach) |> stringr::str_squish() res_nail_textual_rorschach <- nail_textual(rorschach_10, num.var = 1, num.text = 5, introduction = intro_rorschach, request = request_rorschach, model = 'llama3', isolate.groups = FALSE, generate = TRUE) cat(res_nail_textual_rorschach$prompt) cat(res_nail_textual_rorschach$response) ## End(Not run)
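Examples 1 and 2 above subsample the comments within each group before calling nail_textual, presumably to keep the prompt short. The following is a minimal base R sketch of that per-group sampling step, assuming a mock data frame: the columns `group` and `text` and the helper `stratified_sample` are hypothetical and are not part of NaileR or dplyr.

```r
# Mock data: one grouping factor and one free-text comment per respondent.
# These columns are illustrative only; they are not the car_alone data.
set.seed(42)
mock <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), times = c(10, 20, 30))),
  text  = paste("comment", 1:60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: draw the same fraction of rows inside every group.
stratified_sample <- function(df, group_col, frac) {
  row_ids <- split(seq_len(nrow(df)), df[[group_col]])
  keep <- lapply(row_ids, function(ids) {
    ids[sample.int(length(ids), size = round(length(ids) * frac))]
  })
  df[sort(unlist(keep, use.names = FALSE)), , drop = FALSE]
}

sampled <- stratified_sample(mock, "group", 0.5)
table(sampled$group)
```

Sampling inside each group, rather than over the whole table, keeps the relative group sizes unchanged, so small groups are not emptied by chance.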
These data were collected after a survey on the nutri-score. Participants were asked various questions about their views on the nutri-score, and about their eating habits.
nutriscore
A data frame with 112 rows (participants) and 36 columns (questions).
Anaëlle YANNIC and Jessie PICOT, students at l'Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
library(FactoMineR)
data(nutriscore)

res_mca_nutriscore <- MCA(nutriscore, quali.sup = 17:36,
                          ncp = 15, level.ventil = 0.05, graph = FALSE)
res_hcpc_nutriscore <- HCPC(res_mca_nutriscore, nb.clust = 3, graph = FALSE)
don_clust_nutriscore <- res_hcpc_nutriscore$data.clust

intro_nutri <- 'These data were collected after a survey on the nutri-score.
Participants were asked various questions about their views on the nutri-score,
and about their eating habits.
Participants were split into groups according to their answers.'
intro_nutri <- gsub('\n', ' ', intro_nutri) |>
  stringr::str_squish()

req_nutri <- 'Please summarize the characteristics of each group.
Then, give each group a new name, based on your conclusions.'
req_nutri <- gsub('\n', ' ', req_nutri) |>
  stringr::str_squish()

res_nutriscore <- nail_catdes(don_clust_nutriscore, num.var = 37,
                              introduction = intro_nutri,
                              request = req_nutri,
                              drop.negative = TRUE)
cat(res_nutriscore$response)

## End(Not run)
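In the example above, num.var = 37 points at the cluster column: the data set has 36 columns, and the clustered table carries the cluster membership as one extra, final column. The sketch below illustrates this with simulated answers and a base R stand-in (hclust and cutree) instead of MCA and HCPC; the data values and the names `mock`, `mock_clust` and `clust_position` are assumptions made for illustration.

```r
set.seed(1)

# Simulated survey: 112 respondents, 36 categorical questions.
# Only the dimensions come from the data set description; values are random.
mock <- as.data.frame(
  replicate(36, sample(c("yes", "no", "maybe"), 112, replace = TRUE)),
  stringsAsFactors = TRUE
)

# Stand-in for MCA + HCPC: dummy-code the answers, build a hierarchical
# tree on the dummy variables, and cut it into 3 clusters.
dummies <- model.matrix(~ . - 1, data = mock)
tree <- hclust(dist(dummies), method = "ward.D2")
mock_clust <- cbind(mock, clust = factor(cutree(tree, k = 3)))

# The cluster variable is the last column, i.e. column 36 + 1.
clust_position <- which(names(mock_clust) == "clust")
clust_position
```

Writing num.var = ncol(don_clust_nutriscore) would select the same column without hard-coding the index.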
These data were collected after a study on the perception of food quality. Participants were given 9 French logos; they had to rate, on a scale from 0 (not at all) to 10 (absolutely), how much a product bearing them aligned with their own perception of quality.
quality
A data frame with 55 rows and 9 columns. Here is the list of logos:
AB: organic;
Label Rouge: superior quality (in terms of taste, process, packaging...);
FairTrade: decent wages and working conditions for the producers;
Bleu Blanc Coeur: diverse and balanced diet for the livestock;
AOC: controlled designation of origin;
Produit en Bretagne: processed in Brittany;
Viandes de France: livestock born, raised and slaughtered in France, with respectful living conditions;
Nourri sans OGM: no GMOs in livestock food;
Médailles Agro: a prize won at a yearly contest based on taste.
Sébastien Lê, applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(quality)

colnames(quality) <- c("Agriculture biologique", "Label Rouge",
                       "FairTrade", "Bleu Blanc Coeur",
                       "Appellation d'origine contrôlée",
                       "Produit en Bretagne", "Viandes de France",
                       "Nourri sans OGM", "Médailles Agro")

res_pca_quality <- FactoMineR::PCA(quality, graph = FALSE)
quali_work <- res_pca_quality$ind$coord |> as.data.frame()
quali_work <- quali_work[,1] |> cbind(quality)

intro_quali <- "These data were collected after a study
on the perception of food quality.
Participants were given 9 French logos; they had to rate,
on a scale from 0 (not at all) to 10 (absolutely),
how much a product bearing them aligned
with their own perception of quality."
intro_quali <- gsub('\n', ' ', intro_quali) |>
  stringr::str_squish()

res_quality <- nail_condes(quali_work, num.var = 1,
                           quanti.cat = c('Higher quality',
                                          'Lower quality',
                                          'Neutral'),
                           introduction = intro_quali,
                           generate = FALSE)

ppt <- gsub('characteristics', 'opinions', res_quality$prompt)
res_quality <- ollamar::generate('llama3', ppt, output = 'df')
cat(res_quality$response)

## End(Not run)
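The data preparation in this example takes each participant's coordinate on the first principal component and binds it in front of the ratings, so that the latent variable sits in column 1 (hence num.var = 1). The sketch below reproduces that layout with simulated ratings and base R's prcomp as a stand-in for FactoMineR::PCA. The column names `logo_1` to `logo_9` and `dim1` are hypothetical, and the coordinates are not claimed to match FactoMineR's output exactly.

```r
set.seed(7)

# Simulated ratings: 55 participants x 9 logos, integer scores from 0 to 10.
ratings <- as.data.frame(
  matrix(sample(0:10, 55 * 9, replace = TRUE), nrow = 55)
)
names(ratings) <- paste0("logo_", 1:9)

# Stand-in for the PCA step: standardized PCA, then first-component scores.
pca <- prcomp(ratings, center = TRUE, scale. = TRUE)
latent <- pca$x[, 1]

# Put the latent variable first, followed by the explanatory variables.
work <- cbind(dim1 = latent, ratings)
str(work[, 1:3])
```

With this layout, the first column is the variable to interpret and the remaining nine columns are the explanatory variables.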
This dataset was initially collected to understand the perception of the Rorschach test.
rorschach
A data frame with 600 rows and 5 columns:
The Panel effect (3 panels)
The Inkblot effect (10 inkblots)
The Panelist effect (20 panelists per panel)
The interaction Panel and Panelist
The perception of the inkblot
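The 600 rows follow from the design: 3 panels, 20 panelists per panel and 10 inkblots, with one description per panelist and inkblot. The sketch below rebuilds that crossing on mock factors. Only the names `Panel` and `Inkblot` and the level "A" appear in the package examples; the panel levels "B" and "C" and the names `Panelist` and `Panel_Panelist` are assumptions.

```r
# Full crossing of panel, panelist (within panel) and inkblot.
design <- expand.grid(
  Panelist = factor(1:20),
  Inkblot  = factor(1:10),
  Panel    = factor(c("A", "B", "C"))
)

# Panelist numbers repeat across panels, so a panelist is only identified
# by the Panel x Panelist interaction.
design$Panel_Panelist <- interaction(design$Panel, design$Panelist)

nrow(design)
nlevels(design$Panel_Panelist)
table(design$Panel)
```

This is why the data set stores the Panel and Panelist interaction as its own column: it distinguishes the 60 individual respondents.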
Applied mathematics department, Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
data(rorschach)

### Example 1: perception of the inkblots for one panel ###

intro_rorschach <- "For this study, we asked 20 people to briefly describe
the 10 inkblots of the Rorschach test."
intro_rorschach <- gsub('\n', ' ', intro_rorschach) |>
  stringr::str_squish()

request_rorschach <- "Based on the comments of the 20 people,
please give me a description of each inkblot in terms of how it was perceived.
Tell me if it was a rather positive or negative perception."
request_rorschach <- gsub('\n', ' ', request_rorschach) |>
  stringr::str_squish()

rorschach_A <- droplevels(rorschach[rorschach$Panel=="A",])

res_nail_textual_rorschach <- nail_textual(rorschach_A, num.var = 2,
                                           num.text = 5,
                                           introduction = intro_rorschach,
                                           request = request_rorschach,
                                           model = 'llama3',
                                           isolate.groups = FALSE,
                                           generate = FALSE)

cat(res_nail_textual_rorschach$prompt)
ppt <- gsub("## Group", "## Inkblot", res_nail_textual_rorschach$prompt)
cat(ppt)
res_inkblot <- ollamar::generate(model = 'llama3', prompt = ppt,
                                 output = "df")
cat(res_inkblot$response)

## End(Not run)
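With generate = FALSE, the example above edits the prompt before sending it to the model, replacing the generic "## Group" headings with "## Inkblot". The sketch below shows that substitution on a mock prompt. The content of `mock_prompt` is invented for illustration and is not what nail_textual actually produces.

```r
# Mock prompt with two group headings (invented text).
mock_prompt <- paste(
  "For this study, we asked 20 people to describe the inkblots.",
  "## Group 1",
  "a bat; a butterfly",
  "## Group 2",
  "two people dancing",
  sep = "\n"
)

# fixed = TRUE treats the pattern as a literal string, not a regular
# expression, which is the safer choice when relabeling headings.
edited <- gsub("## Group", "## Inkblot", mock_prompt, fixed = TRUE)
cat(edited)
```

Relabeling the headings tells the model what each block of comments represents without changing the comments themselves.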
Compute a similarity score, on a scale ranging from 0 (totally different) to 100 (the exact same), between two character strings.
sim_llm(textA, textB)
textA, textB | two character strings. |
The similarity score is generated by an LLM. Therefore, the result might vary if the function is run several times.
An integer between 0 and 100.
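Because the score can change from run to run, one option is to call the scoring function several times and summarize the results. The sketch below is one way to do this, under stated assumptions: `average_similarity` is a hypothetical helper that is not part of NaileR, and `noisy_stub` is a random stand-in used so the code runs without an LLM.

```r
# Hypothetical helper: run a scorer n_runs times and summarize the scores.
# `scorer` is any function of two strings returning a number in [0, 100].
average_similarity <- function(textA, textB, scorer, n_runs = 5) {
  scores <- vapply(
    seq_len(n_runs),
    function(i) as.numeric(scorer(textA, textB)),
    numeric(1)
  )
  c(mean = mean(scores), sd = stats::sd(scores),
    min = min(scores), max = max(scores))
}

# Stand-in scorer that mimics run-to-run variation; it ignores its inputs.
noisy_stub <- function(textA, textB) sample(70:80, 1)

set.seed(123)
res <- average_similarity("first text", "second text",
                          scorer = noisy_stub, n_runs = 10)
res
```

With the package installed and a model available, the stub could be replaced by `scorer = sim_llm`. A large standard deviation would suggest that the score for that pair of texts is unstable.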
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

textA <- "Participant A was described as a nice, outgoing man,
with a friendly attitude."
textB <- "Participant A was an extroverted and caring individual."

sim_llm(textA, textB)

## End(Not run)
These data were collected after a survey on food waste, with participants describing their habits.
waste
A data frame with 180 rows (participants) and 77 columns (questions).
Héloïse BILLES and Amélie RATEAU, students at l'Institut Agro Rennes-Angers
## Not run:
# Processing time is often longer than ten seconds
# because the function uses a large language model.

library(NaileR)
library(FactoMineR)
data(waste)
waste <- waste[-14]

res_mca_waste <- MCA(waste, quali.sup = c(1,2,50:76),
                     ncp = 35, level.ventil = 0.05, graph = FALSE)
res_hcpc_waste <- HCPC(res_mca_waste, nb.clust = 3, graph = FALSE)
don_clust_waste <- res_hcpc_waste$data.clust

intro_waste <- 'These data were collected after a survey on food waste,
with participants describing their habits.'
intro_waste <- gsub('\n', ' ', intro_waste) |>
  stringr::str_squish()

req_waste <- 'Please summarize the characteristics of each group.
Then, give each group a new name, based on your conclusions.
Finally, give each group a grade between 0 and 10,
based on how wasteful they are with food:
0 being "not at all", 10 being "absolutely".'
req_waste <- gsub('\n', ' ', req_waste) |>
  stringr::str_squish()

res_waste <- nail_catdes(don_clust_waste,
                         num.var = ncol(don_clust_waste),
                         introduction = intro_waste,
                         request = req_waste,
                         drop.negative = TRUE)
cat(res_waste$response)

## End(Not run)
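The examples clean every introduction and request with gsub('\n', ' ', ...) followed by stringr::str_squish(), which lets the text be written over several lines in the script and still reach the model as a single line. The sketch below shows the effect using base R only. It assumes that collapsing whitespace runs with a regular expression and trimming with trimws gives the same result as str_squish for ordinary spaces and line breaks.

```r
# A request written over several indented lines, as in a script.
raw_intro <- "These data were collected
              after a survey on food waste,
              with participants describing their habits."

# Collapse every run of whitespace (including line breaks) to one space,
# then drop leading and trailing whitespace.
clean_intro <- trimws(gsub("\\s+", " ", raw_intro))
clean_intro
```

The cleaned string contains no line breaks or repeated spaces, so the indentation used in the script does not leak into the prompt.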