This package is considered a duplicate. The official version of this package is available at: https://psychbruce.r-universe.dev/PsychWordVec

Package: PsychWordVec 2023.9

Han-Wu-Shuang Bao

PsychWordVec: Word Embedding Research Framework for Psychological Science

An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arxiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arxiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').
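To make item (3) concrete, the following is a minimal base-R sketch of the Word Embedding Association Test effect size and an exact permutation p-value, following the definitions in <doi:10.1126/science.aal4230>. It does not call PsychWordVec and is not the package's implementation: the helper names (`cosine_sim`, `assoc`, `weat_effect`, `weat_perm_p`), the word lists, and the two-dimensional toy vectors are all hypothetical and made up for illustration. The standardization uses R's sample standard deviation, which is an assumption of this sketch.

```r
# Toy 2-dimensional "embeddings" (made-up values, for illustration only).
emb <- rbind(
  rose  = c(1.00, 0.10),
  tulip = c(0.90, 0.20),
  ant   = c(0.10, 1.00),
  wasp  = c(0.20, 0.90),
  love  = c(1.00, 0.00),
  joy   = c(0.95, 0.10),
  hate  = c(0.00, 1.00),
  pain  = c(0.10, 0.95)
)

# Cosine similarity between two numeric vectors.
cosine_sim <- function(v1, v2) sum(v1 * v2) / sqrt(sum(v1^2) * sum(v2^2))

# Association of word w with attribute set A relative to attribute set B:
# mean cosine similarity to A minus mean cosine similarity to B.
assoc <- function(emb, w, A, B) {
  mean(sapply(A, function(a) cosine_sim(emb[w, ], emb[a, ]))) -
    mean(sapply(B, function(b) cosine_sim(emb[w, ], emb[b, ])))
}

# Effect size: difference in mean association between the two target sets,
# divided by the standard deviation of associations over all target words.
weat_effect <- function(emb, targets_x, targets_y, attrs_a, attrs_b) {
  s <- sapply(c(targets_x, targets_y), function(w) assoc(emb, w, attrs_a, attrs_b))
  (mean(s[targets_x]) - mean(s[targets_y])) / sd(s)
}

# One-sided permutation p-value by enumerating every equal-size split of the
# pooled target words (feasible only for small word lists).
weat_perm_p <- function(emb, targets_x, targets_y, attrs_a, attrs_b) {
  words <- c(targets_x, targets_y)
  s <- sapply(words, function(w) assoc(emb, w, attrs_a, attrs_b))
  stat <- function(x_words) sum(s[x_words]) - sum(s[setdiff(words, x_words)])
  observed <- stat(targets_x)
  splits <- combn(words, length(targets_x), simplify = FALSE)
  mean(sapply(splits, stat) >= observed)
}

flowers    <- c("rose", "tulip")
insects    <- c("ant", "wasp")
pleasant   <- c("love", "joy")
unpleasant <- c("hate", "pain")

print(weat_effect(emb, flowers, insects, pleasant, unpleasant))
print(weat_perm_p(emb, flowers, insects, pleasant, unpleasant))
```

With only two words per target set there are just six possible splits, so the p-value cannot fall below 1/6; realistic word lists are much longer, and for those a random sample of splits would likely replace the full enumeration.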

Authors: Han-Wu-Shuang Bao [aut, cre]

PsychWordVec_2023.9.tar.gz
PsychWordVec_2023.9.tar.gz (r-4.5-noble)
PsychWordVec_2023.9.tar.gz (r-4.4-noble)
PsychWordVec_2023.9.tgz (r-4.4-emscripten)
PsychWordVec_2023.9.tgz (r-4.3-emscripten)
PsychWordVec.pdf | PsychWordVec.html
PsychWordVec/json (API)
NEWS

# Install 'PsychWordVec' in R:

install.packages('PsychWordVec', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/psychbruce/psychwordvec/issues

Pkgdown site: https://psychbruce.github.io

Uses libs:

openjdk: OpenJDK Java runtime, using HotSpot JIT

Datasets:

demodata - Demo data (pre-trained using word2vec on Google News; 8000 vocab, 300 dims).

On CRAN:

1.70 score, 10 scripts, 459 downloads, 34 exports, 234 dependencies

Last updated 1 year ago from: 17d922c38c. Checks: 2 OK. Indexed: no.

Target	Result	Latest binary
Doc / Vignettes	OK	Feb 26 2025
R-4.5-linux	OK	Feb 26 2025

Exports: as_embed as_wordvec cc cos_dist cos_sim cosine_similarity data_transform data_wordvec_load data_wordvec_subset dict_expand dict_reliability get_wordvec load_embed load_wordvec most_similar normalize orth_procrustes pair_similarity pattern plot_network plot_similarity plot_wordvec plot_wordvec_tSNE sum_wordvec tab_similarity test_RND test_WEAT text_init text_model_download text_model_remove text_to_vec text_unmask tokenize train_wordvec

Dependencies: abind afex askpass backports base64enc bayestestR bit bit64 boot broom broom.mixed bruceR bslib cachem car carData cellranger checkmate class cli clipr clock cluster coda codetools colorspace commonmark corpcor corrplot cowplot cpp11 crayon curl data.table datawizard Deriv diagram dials DiceDesign digest doBy doFuture dplyr effectsize effsize emmeans estimability evaluate fansi farver fastmap fastTextR fdrtool float fontawesome forcats foreach foreign Formula fs furrr future future.apply generics ggplot2 ggrepel ggwordcloud glasso globals glue gower GPArotation GPfit gridExtra gridtext gtable gtools hardhat haven here highr Hmisc hms htmlTable htmltools htmlwidgets httr igraph insight interactions ipred isoband ISOcodes iterators jpeg jquerylib jsonlite jtools KernSmooth knitr labeling lattice lava lavaan lgr lhs lifecycle listenv lme4 lmerTest lpSolve lubridate magrittr mallet markdown MASS Matrix MatrixExtra MatrixModels mediation memoise mgcv microbenchmark mime minqa mlapi mnormt modelenv modelr munsell mvtnorm ngram nlme nloptr nnet numDeriv openssl pander parallelly parameters parsnip pbapply pbivnorm pbkrtest performance pillar pkgconfig plyr png prettyunits prodlim progress progressr psych purrr qgraph quadprog quantreg R.methodsS3 R.oo R.utils R6 rappdirs rbibutils RColorBrewer Rcpp RcppArmadillo RcppEigen RcppProgress RcppTOML Rdpack readr readxl recipes reformulas rematch reshape2 reticulate rgl RhpcBLASctl rio rJava rlang rmarkdown rpart rprojroot rsample rsparse RSpectra rstudioapi Rtsne sandwich sass scales sfd shape slam slider SparseM sparsevctrs SQUAREM stopwords stringi stringr survival sys texreg text text2vec textmineR tibble tidyr tidyselect timechange timeDate tinytex topics tune tzdb utf8 vctrs viridis viridisLite vroom warp withr word2vec workflows writexl xfun xml2 yaml yardstick zoo

Help page	Topics
Word vectors data class: 'wordvec' and 'embed'.	as_embed as_wordvec pattern [.embed
Cosine similarity/distance between two vectors.	cosine_similarity cos_dist cos_sim
Transform plain text of word vectors into 'wordvec' (data.table) or 'embed' (matrix), saved in a compressed ".RData" file.	data_transform
Load word vectors data ('wordvec' or 'embed') from ".RData" file.	data_wordvec_load load_embed load_wordvec
Extract a subset of word vectors data (with S3 methods).	data_wordvec_subset subset.embed subset.wordvec
Demo data (pre-trained using word2vec on Google News; 8000 vocab, 300 dims).	demodata
Expand a dictionary from the most similar words.	dict_expand
Reliability analysis and PCA of a dictionary.	dict_reliability
Extract word vector(s).	get_wordvec
Find the Top-N most similar words.	most_similar
Normalize all word vectors to the unit length 1.	normalize
Orthogonal Procrustes rotation for matrix alignment.	orth_procrustes
Compute a matrix of cosine similarity/distance of word pairs.	pair_similarity
Visualize a (partial correlation) network graph of words.	plot_network
Visualize cosine similarity of word pairs.	plot_similarity
Visualize word vectors.	plot_wordvec
Visualize word vectors with dimensionality reduced using t-SNE.	plot_wordvec_tSNE
Calculate the sum vector of multiple words.	sum_wordvec
Tabulate cosine similarity/distance of word pairs.	tab_similarity
Relative Norm Distance (RND) analysis.	test_RND
Word Embedding Association Test (WEAT) and Single-Category WEAT.	test_WEAT
Install required Python modules in a new conda environment and initialize the environment, necessary for all 'text_*' functions designed for contextualized word embeddings.	text_init
Download pre-trained language models from HuggingFace.	text_model_download
Remove downloaded models from the local .cache folder.	text_model_remove
Extract contextualized word embeddings from transformers (pre-trained language models).	text_to_vec
<Deprecated> Fill in the blank mask(s) in a query (sentence).	text_unmask
Tokenize raw text for training word embeddings.	tokenize
Train static word embeddings using the Word2Vec, GloVe, or FastText algorithm.	train_wordvec
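
Several of the help topics above (cosine similarity/distance, normalization to unit length, and Top-N most similar words) rest on the same small piece of arithmetic. The base-R sketch below illustrates that arithmetic on a toy matrix. It does not use PsychWordVec, and the helper names (`cosine_sim`, `unit_rows`, `top_similar`) and vector values are hypothetical, chosen so they are not mistaken for the package's own functions or data.

```r
# Toy embedding matrix: one row per word (made-up values, for illustration only).
emb <- rbind(
  king  = c(0.9, 0.1, 0.4),
  queen = c(0.8, 0.3, 0.5),
  apple = c(0.1, 0.9, 0.2)
)

# Cosine similarity between two numeric vectors.
cosine_sim <- function(v1, v2) {
  sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2)))
}

# Scale every row to unit length, so a plain dot product between two rows
# equals their cosine similarity.
unit_rows <- function(m) m / sqrt(rowSums(m^2))

emb_unit <- unit_rows(emb)

# Rank all other words by cosine similarity to a target word.
top_similar <- function(m_unit, word, topn = 2) {
  sims <- drop(m_unit %*% m_unit[word, ])
  sims <- sims[names(sims) != word]
  head(sort(sims, decreasing = TRUE), topn)
}

print(cosine_sim(emb["king", ], emb["queen", ]))      # cosine similarity
print(1 - cosine_sim(emb["king", ], emb["queen", ]))  # cosine distance
print(top_similar(emb_unit, "king"))
```

Normalizing once up front is a common design choice for large vocabularies, because every later similarity query then reduces to a single matrix-vector product.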
