NEWS

topicmodels 0.2-17 (2024-08-14)

Internal changes to the C++ code to make use of R_NO_REMAP by prepending Rf_ to the R API function names.

topicmodels 0.2-16 (2024-01-09)

corpus.JSS.papers was removed as a suggested package and its dataset was added to topicmodels to ensure that the vignette code can be executed successfully even if corpus.JSS.papers is not available for installation.

topicmodels 0.2-15 (2023-11-27)

Clarified the documentation for class "LDA_Gibbscontrol" to indicate that iter refers to the number of iterations in addition to the burnin iterations specified.
Typo fixed in ctm.c, which referred to long int where only int is used.
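
The iteration accounting described for "LDA_Gibbscontrol" can be sketched as follows. This is only an illustration: the helper names are hypothetical and not part of the package, and the rule that one draw is kept every thin sweeps after burn-in is an assumption made for the example.

```c
/* Hypothetical helpers, not part of topicmodels. */

/* Total number of Gibbs sweeps when iter is counted in addition to burnin. */
static int total_sweeps(int burnin, int iter)
{
    return burnin + iter;
}

/* Assumption: one draw is kept every thin sweeps after burn-in,
 * so integer division gives the number of kept draws. */
static int kept_draws(int iter, int thin)
{
    return thin > 0 ? iter / thin : 0;
}
```

Under these assumptions, burnin = 1000 and iter = 2000 give 3000 sweeps in total, and thin = 100 keeps 20 draws.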

topicmodels 0.2-14 (2023-03-31)

System requirement C++11 removed from the DESCRIPTION file.

topicmodels 0.2-13 (2022-12-06)

sprintf replaced by snprintf in external code to avoid warnings.
build_graph removed due to archival of lasso2 on CRAN.
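
As a minimal sketch of the sprintf to snprintf change mentioned above (not the package's actual code; the function name format_name and the file name pattern are made up for illustration), bounded formatting with a truncation check could look like this:

```c
#include <stdio.h>

/* Hypothetical helper: format "<prefix>-<index>.dat" into buf, which has
 * room for size bytes. Returns 0 on success and -1 if the result did not
 * fit or formatting failed. */
static int format_name(char *buf, size_t size, const char *prefix, int index)
{
    int n = snprintf(buf, size, "%s-%d.dat", prefix, index);
    /* snprintf returns the length the complete string would have had;
     * a value >= size means the output was truncated. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Unlike sprintf, snprintf never writes more than size bytes, so a name that is too long is reported to the caller instead of overflowing the buffer.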

topicmodels 0.2-12 (2021-01-29)

Maintainer e-mail changed.

topicmodels 0.2-11 (2020-04-19)

The limit on the length of file names in the external code was extended from 100 to 260 and a check for the length of the prefix was added. Thanks to Julia Silge for pointing the issue out.
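
A prefix length check of the kind described above can be sketched as follows. The function name prefix_fits and the amount of space reserved for suffixes are assumptions for illustration; only the limit of 260 is taken from the entry.

```c
#include <string.h>

#define MAX_FILENAME 260  /* limit stated in the entry above */
#define SUFFIX_ROOM  32   /* assumed space reserved for appended suffixes */

/* Hypothetical check: returns 1 if prefix plus the reserved suffix space
 * (and the terminating NUL) fits within MAX_FILENAME, 0 otherwise. */
static int prefix_fits(const char *prefix)
{
    return strlen(prefix) + (size_t)SUFFIX_ROOM < (size_t)MAX_FILENAME;
}
```

Rejecting an overlong prefix before any file name is assembled lets the error be reported early rather than after a truncated name has been produced.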

topicmodels 0.2-10

The symbols and functions exposed by the external code were reduced. Thanks to Prof. Ripley for pointing out the problem.

topicmodels 0.2-9 (2019-12-03)

Issues concerning installation failures with gcc trunk aka 10 were fixed. Thanks to Prof. Ripley for pointing out the problem.
The authors field was extended to also include the contributors of external code.

topicmodels 0.2-8 (2018-12-21)

Protection of R objects in the external code improved.

topicmodels 0.2-7 (2017-11-03)

The code shown for obtaining the data, either by installing package corpus.JSS.papers and loading the data from it or by using package OAIHarvester, is now the same as the code actually used in the vignette.

topicmodels 0.2-6 (2017-04-18)

Protections for R objects added in the external code. Thanks to Tomas Kalibera for pointing out the potential problems.

topicmodels 0.2-5 (2017-02-28)

C++11 added to SystemRequirements. Thanks to Prof. Ripley for pointing out the problem.
The vignette was slightly modified with respect to the retrieval of topics for Volume 24. This was necessary because JSS now uses different identifiers due to the change in its web page. Thanks to Bruce Spencer for pointing the problem out.
Package now uses registration for native (C) routines.

topicmodels 0.2-4 (2016-05-23)

Issues concerning memory deallocation in the C++ code were fixed. Thanks to Prof. Ripley for pointing the problem out and providing log files to help to identify the problem.

topicmodels 0.2-3 (2016-02-17)

Issues concerning memory deallocation in the C and C++ code and inclusion of headers were fixed. Thanks to Prof. Ripley for pointing the problem out, giving advice on how to fix the issues and providing log files to help to identify the problems.

topicmodels 0.2-2 (2015-07-01)

A bug in the CTM implementation which led to unnecessary use of memory was fixed. Thanks to Florian Schwendinger for pointing the issue out.
Functions from package stats are now correctly imported before being used.

topicmodels 0.2-1 (2014-06-11)

tm version >=0.6 is required.
The data set AssociatedPress as well as the code checking document-term matrices now conform to the data structure of document-term matrices in tm version >=0.6.

topicmodels 0.2-0 (2014-04-29)

The specification of a seed for Gibbs sampling now leads to a call to set.seed and the external code used for fitting accesses the state of the R random number generator. The seed can also be set to NA (default) in order to not change the seed of the R random number generator when fitting the model.
The Gibbs sampling method for fitting the LDA model now also returns the current topic assignments for all words, which allows initializing Gibbs sampling either using the current term distribution of topics or these assignments.
The Gibbs sampling method for fitting the LDA model now allows specifying seed words, i.e., assigning higher a priori weights to some words for some topics.
The word assignment matrix contained in the fitted models no longer has any dimnames.
Package corpus.JSS.papers is now listed in the DESCRIPTION file together with the information that it is available from the additional repository https://datacube.wu.ac.at.

topicmodels 0.1-12 (2013-08-20)

Package topicmodels now depends on package methods instead of importing it.

topicmodels 0.1-11 (2013-06-18)

Package SnowballC is now suggested instead of Snowball.

topicmodels 0.1-10

A check was added to ensure that no empty documents are in the data. Thanks to Terry Therneau for pointing the problem out.
The first argument in the functions printf_vector and printf_matrix defined in the C code for the CTM was corrected to be const char *. Thanks to Murray Stokely for providing the patch.
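
The const char * correction above can be illustrated with a small sketch. The function below is a hypothetical analogue, not the package's printf_vector, and its signature is an assumption. Because the label is only read, it is declared const char *, so string literals can be passed without discarding qualifiers.

```c
#include <stdio.h>

/* Hypothetical vector printing helper: writes "name: v0 v1 ..." followed
 * by a newline to out and returns the number of characters written. */
static int print_vector(const char *name, const double *v, int n, FILE *out)
{
    int written = fprintf(out, "%s:", name);
    for (int i = 0; i < n; i++)
        written += fprintf(out, " %g", v[i]);
    written += fprintf(out, "\n");
    return written;
}
```

With a non-const char * parameter, a call such as print_vector("lambda", v, n, stdout) can trigger compiler warnings in C and is ill-formed in C++, which is why the const qualifier matters.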

topicmodels 0.1-9 (2012-12-17)

A bug in function posterior was fixed where the rownames of the wrong object were used. Thanks to Benjamin S. Porter for pointing the problem out.
Dependency structure changed such that some packages are now only imported.
The information printed during the VEM algorithm when verbose is larger than 0 was improved.

topicmodels 0.1-8 (2012-11-07)

The code in the vignette for removing HTML markup was modified due to changes in package XML.

topicmodels 0.1-7 (2012-08-23)

A memory leak in the code of the fit function for LDA with method "VEM" was corrected. Thanks to Ramis Yamilov for pointing the problem out.

topicmodels 0.1-6 (2012-06-15)

The included dataset AssociatedPress had row names which were of type integer and not of type character. The object was re-saved omitting the row names.

topicmodels 0.1-5 (2012-04-17)

Vignettes moved from /inst/doc to /vignettes.
The source code for fitting the model using Gibbs sampling was modified because the code did not compile on Solaris. Thanks to Prof. Brian D. Ripley for pointing the problem out.
dtm2ldaformat() was modified to ensure that the resulting matrices for the documents contain integers. In addition dtm2ldaformat() and ldaformat2dtm() were changed to also work for document-term matrices containing empty documents and an argument was introduced to indicate if empty documents should be removed. Thanks to Eu Jin Lok for pointing the problems out.

topicmodels 0.1-4 (2011-12-27)

Missing 'Suggests' entries added in the DESCRIPTION file. Thanks to Prof. Brian D. Ripley for pointing the problem out.

topicmodels 0.1-3 (2011-10-23)

Name tags for Rd files changed to not contain slashes. Thanks to Prof. Brian D. Ripley for pointing the problem out as indicated in bug PR14707.

topicmodels 0.1-2 (2011-10-02)

A small bug was fixed in saving interim results when fitting an LDA model using Gibbs sampling. Thanks to Nicholas Switanek for pointing the problem out.

topicmodels 0.1-1 (2011-09-06)

Makevars.win changed due to changes on CRAN for making libgsl for Windows. Thanks to Prof. Brian D. Ripley for pointing that out.

topicmodels 0.1-0 (2011-05-09)

The package vignette has been published in the Journal of Statistical Software, Volume 40, Issue 13 (doi:10.18637/jss.v040.i13), and the paper should be used as citation for the package; run citation("topicmodels") for details.

topicmodels 0.0-11 (2011-04-27)

C code changed to allow the package to compile on Solaris systems. Thanks to Prof. Brian D. Ripley for pointing the problems out and recommending suitable changes.

topicmodels 0.0-10 (2011-04-24)

C code changed to avoid warnings of unused variables.

topicmodels 0.0-9 (2011-04-15)

The slots for document and term names are no longer restricted to be of class "vector" to allow for document-term matrices where no row and/or column names are provided.

topicmodels 0.0-8 (2011-04-07)

A function perplexity() added for model validation and selection.
The input data for LDA() and CTM() can now either be a "DocumentTermMatrix" with term-frequency weighting or an object coercible to a "simple_triplet_matrix" with integer entries.
A bug in the random number generation of the C++ Gibbs sampling code was fixed. Thanks to Uwe Ligges for pointing the problem out, which he noted when checking the package for the Windows platform.
New control arguments added for keeping intermediate log-likelihood values during estimation and performing repeated runs with random initialization. In addition, the number of iterations made is now saved with the fitted model.
Functions ldaformat2dtm() and dtm2ldaformat() added to transform data from the lda package into a "DocumentTermMatrix" object and vice versa.
Bug fixed in rctm.c where one EM step was performed even for estimate.beta = FALSE.

topicmodels 0.0-7 (2010-09-22)

The control for topic models now also has a seed argument to ensure reproducibility of results and an estimate.beta argument which can be used to fix the term distribution over topics after initialization.
The control for Gibbs sampling allows returning repeated draws in a list, specified using arguments burnin, thin and iter.
In slot beta of class "TopicModel" the parameters are stored on the log scale to obtain higher accuracy in the VEM code if parameter values are close to zero.
Call to assert removed in C code to avoid termination of R.
Class "TopicModel" now has a slot loglikelihood. For models fitted using Gibbs sampling this contains the log-likelihood of the corpus; for models fitted using VEM it contains the vector of log-likelihoods for each document separately.

topicmodels 0.0-6 (2010-05-03)

Memory bug fixed in returnObjectGibbsLDA.
A slot save was added to the control objects to specify whether, and with which step size, intermediate results are saved to files.

topicmodels 0.0-5 (2010-04-06)

Header files changed in utilities.cpp following advice from Prof. Brian D. Ripley.

topicmodels 0.0-4 (2010-03-10)

Code for installing the package corpus.JSS.papers in the vignette improved.
dir.create() now called with showWarnings = FALSE.
Bug fixed in get_most_likely() for maximum possible k.
First version released on CRAN: 0.0-3.