Package: textTinyR 1.1.8
textTinyR: Text Processing for Small or Big Data Files
It offers functions for splitting, parsing, tokenizing and creating a vocabulary for big text data files. It also includes functions for building a document-term matrix and extracting information from it (term associations, most frequent terms), for calculating token statistics (collocations, look-up tables, string dissimilarities), and for working with sparse matrices. Lastly, it includes functions for Word Vector Representations (e.g. 'GloVe', 'fasttext') and for calculating (pairwise) text document dissimilarities. The source code is written in 'C++11' and exported to R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
textTinyR_1.1.8.tar.gz
textTinyR_1.1.8.tar.gz (r-4.5-noble), textTinyR_1.1.8.tar.gz (r-4.4-noble)
textTinyR_1.1.8.tgz (r-4.4-emscripten), textTinyR_1.1.8.tgz (r-4.3-emscripten)
textTinyR.pdf | textTinyR.html
textTinyR/json (API)
NEWS
# Install 'textTinyR' in R:
install.packages('textTinyR', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/mlampros/texttinyr/issues
Last updated 12 months ago from: 2662a61144. Checks: OK: 2. Indexed: no.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Oct 30 2024 |
R-4.5-linux-x86_64 | OK | Oct 30 2024 |
Exports: batch_compute, big_tokenize_transform, bytes_converter, cluster_frequency, COS_TEXT, cosine_distance, Count_Rows, dense_2sparse, dice_distance, dims_of_word_vecs, Doc2Vec, JACCARD_DICE, levenshtein_distance, load_sparse_binary, matrix_sparsity, read_characters, read_rows, save_sparse_binary, select_predictors, sparse_Means, sparse_Sums, sparse_term_matrix, TEXT_DOC_DISSIM, text_file_parser, text_intersect, token_stats, tokenize_transform_text, tokenize_transform_vec_docs, utf_locale, vocabulary_parser
Dependencies: BH, data.table, lattice, Matrix, R6, Rcpp, RcppArmadillo
Functionality of the textTinyR package
Rendered from functionality_of_textTinyR_package.Rmd using knitr::rmarkdown on Oct 30 2024.
Last update: 2021-10-13
Started: 2017-01-07
Word vectors - doc2vec - text clustering
Rendered from word_vectors_doc2vec.Rmd using knitr::rmarkdown on Oct 30 2024.
Last update: 2021-10-13
Started: 2018-04-03