Package: tmfast 0.1.1

D. Hicks

tmfast: Fast Topic Models Using Varimax

Fits topic models using varimax-rotated principal component analysis (PCA), following the "vintage factor analysis" approach of Rohe & Zeng (2020) <doi:10.48550/arXiv.2004.05387>. Leverages truncated PCA via 'irlba' for sparse matrices, enabling fast model fitting on large corpora. Includes an information-theoretic approach to vocabulary selection, 'broom'-compatible tidiers for extracting word-topic and topic-document matrices into a tidy data workflow, and samplers for constructing simulated corpora for benchmarking and method evaluation.
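
The general idea of varimax-rotated PCA can be sketched in base R. This is only an illustration under stated assumptions: it uses `stats::prcomp()` and `stats::varimax()` on a tiny made-up dense matrix rather than 'irlba' on a sparse one, and it does not call or reproduce any 'tmfast' function. The toy data and the object names (`dtm`, `k`, `rotated_loadings`, `rotated_scores`) are hypothetical.

```r
# Toy document-term count matrix: 6 documents x 5 terms (made-up data).
dtm <- matrix(
  c(5, 4, 0, 0, 1,
    4, 5, 1, 0, 0,
    6, 3, 0, 1, 0,
    0, 1, 5, 4, 0,
    1, 0, 4, 6, 1,
    0, 0, 6, 5, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("doc", 1:6),
                  c("cat", "dog", "tax", "law", "the"))
)

k <- 2  # number of components to keep (assumed for this example)

# Step 1: PCA on the centered counts, truncated to the first k components.
pca <- prcomp(dtm, center = TRUE, scale. = FALSE, rank. = k)

# Step 2: varimax rotation of the term loadings.
vm <- varimax(pca$rotation, normalize = FALSE)

# Rotated term loadings (terms x k) and document scores (documents x k).
rotated_loadings <- unclass(vm$loadings)
rotated_scores <- pca$x %*% vm$rotmat

print(round(rotated_loadings, 2))
print(round(rotated_scores, 2))
```

Because the rotation matrix is orthogonal, the rotated scores and loadings reproduce the same rank-k approximation of the centered matrix as the unrotated PCA; the rotation only changes which axes are used to describe it.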

Authors: D. Hicks [aut, cre, cph]

tmfast_0.1.1.tar.gz
tmfast_0.1.1.tar.gz (r-4.7-any)
tmfast_0.1.1.tar.gz (r-4.6-any)
tmfast_0.1.1.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
tmfast/json (API)
NEWS

# Install 'tmfast' in R:
install.packages('tmfast', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/dhicks/tmfast/issues

Pkgdown/docs site: https://dhicks.github.io

3.00 score | 25 exports | 33 dependencies

Last updated from: 1342f37e06. Checks: 4 OK. Indexed: no.

Target | Result | Time
linux-devel-x86_64 | OK | 297
source / vignettes | OK | 479
linux-release-x86_64 | OK | 317
wasm-release | OK | 263

Exports: build_matrix, compare_betas, draw_corpus, entropy, expected_entropy, fit_varimax, hellinger, insert_topics, journal_specific, loadings, ndH, ndR, peak_alpha, rdirichlet, renorm, rotation, scores, solve_power, target_power, tidy, tidy_all, tmfast, tsne, umap, varimax_irlba

Dependencies: assertthat, cli, cpp11, dplyr, generics, glue, GPArotation, irlba, janeaustenr, lattice, lifecycle, magrittr, Matrix, mnormt, nlme, pillar, pkgconfig, psych, purrr, R6, Rcpp, rlang, SnowballC, stringi, stringr, tibble, tidyr, tidyselect, tidytext, tokenizers, utf8, vctrs, withr

Fast topic modeling with real books

Rendered from realbooks.Rmd using knitr::rmarkdown on May 30 2026.

Last update: 2026-05-30
Started: 2026-05-30

Fitting topic models (and simulating text data) with tmfast

Rendered from simulated.Rmd using knitr::rmarkdown on May 30 2026.

Last update: 2026-05-30
Started: 2026-05-30

Readme and manuals

Help Manual

Help page | Topics
Fitting "topic models" with PCA+varimax | tmfast-package
Convert a long dataframe to a wide (sparse) matrix | build_matrix
Compare topic-word distributions using Hellinger distance | compare_betas
Draw a collection of documents | draw_corpus
Entropy of a distribution | entropy
Expected entropy for samples from a Dirichlet distribution | expected_entropy
Given a (rank 'n') PCA fit, return a rank 'k < n' varimax fit | fit_varimax
Hellinger distances | hellinger, hellinger.data.frame, hellinger.Matrix, hellinger.matrix
Insert a topic model into a fitted 'tmfast' | insert_topics
"Journal-specific" simulation scenario | journal_specific
Extract a PCA/varimax loadings matrix | loadings, loadings.default
Information gain (uniform distribution) | ndH
Information gain (length-proportional distribution) | ndR
Alpha parameter with a single peak | peak_alpha
Project new data into PCA score space | predict.varimaxes
Sample from the Dirichlet distribution | rdirichlet
Renormalize tidied distributions | renorm
Extract varimax rotation | rotation
Extract item scores from a fitted PCA/varimax model | scores
Solve the equation to find the desired exponent | solve_power
Find target power for renormalization | target_power
Extract gamma or beta matrices for all topics | tidy_all
Extract beta and gamma matrices from 'tmfast' objects | tidy.tmfast
Fit a topic model using PCA+varimax | tmfast
Discursive space using t-SNE | tsne, tsne.data.frame, tsne.STM, tsne.tmfast
Discursive space using UMAP | umap, umap.matrix, umap.STM, umap.tmfast
Fit a varimax-rotated PCA using irlba | varimax_irlba
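
Several help topics above concern entropy and Hellinger distance between probability distributions. As a rough illustration of those two quantities, here is a minimal base R sketch. The helper names `entropy_bits()` and `hellinger_dist()` are hypothetical and are not the package's `entropy()` or `hellinger()` functions, whose arguments and conventions (for example, the logarithm base) may differ; consult the manual for the actual interfaces.

```r
# Shannon entropy in bits (base-2 logarithm is an assumption of this sketch).
# Zero-probability entries contribute nothing, so they are dropped.
entropy_bits <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Hellinger distance between two discrete distributions of equal length:
# H(p, q) = sqrt( sum( (sqrt(p) - sqrt(q))^2 ) / 2 ), which lies in [0, 1].
hellinger_dist <- function(p, q) {
  stopifnot(length(p) == length(q))
  sqrt(sum((sqrt(p) - sqrt(q))^2) / 2)
}

p <- c(0.5, 0.25, 0.25)
q <- c(0.25, 0.25, 0.5)

entropy_bits(p)                      # 1.5
entropy_bits(rep(1 / 4, 4))          # 2 (uniform over four outcomes)
hellinger_dist(p, p)                 # 0 (identical distributions)
hellinger_dist(c(1, 0), c(0, 1))     # 1 (disjoint support)
hellinger_dist(p, q)                 # strictly between 0 and 1
```

Hellinger distance is symmetric and bounded, which makes it convenient for comparing topic-word distributions fitted by different models.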