sentopicmodel {sentopics} | R Documentation |
Create a sentopic model
Description
The functions LDA(), JST(), rJST() and sentopicmodel() are all wrappers
around a unified C++ routine, and each attempts to replicate its
corresponding model. This function is the lower-level wrapper to the
C++ routine.
Usage
sentopicmodel(
  x,
  lexicon = NULL,
  L1 = 5,
  L2 = 3,
  L1prior = 1,
  L2prior = 5,
  beta = 0.01,
  L1cycle = 0,
  L2cycle = 0,
  reversed = TRUE
)
Arguments
x | tokens object containing the texts. A coercion will be attempted if |
lexicon | a |
L1 | the number of labels in the first document mixture layer |
L2 | the number of labels in the second document mixture layer |
L1prior | the first layer hyperparameter of document mixtures |
L2prior | the second layer hyperparameter of document mixtures |
beta | the hyperparameter of vocabulary distribution |
L1cycle | integer specifying the cycle size between two updates of the hyperparameter L1prior |
L2cycle | integer specifying the cycle size between two updates of the hyperparameter L2prior |
reversed | indicates on which dimension should |
Value
An S3 list containing the model parameters and the estimated mixture.
This object corresponds to a Gibbs sampler estimator with zero iterations.
The MCMC can be iterated using the fit() function.
- tokens is the tokens object used to create the model
- vocabulary contains the set of words of the corpus
- it tracks the number of Gibbs sampling iterations
- za is the list of topic assignments, aligned to the tokens object with padding removed
- logLikelihood returns the measured log-likelihood at each iteration, with a breakdown of the likelihood into hierarchical components as an attribute
The topWords() function easily extracts the most probable words of each
topic/sentiment.
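The workflow described above (create a model with zero iterations, iterate the MCMC with fit(), then inspect the result with topWords()) can be sketched as follows. This is a minimal illustration, not a reference usage: it relies on the ECB_press_conferences_tokens data used in the Examples, keeps every sentopicmodel() argument at the default shown in Usage, and assumes that fit() accepts an `iterations` argument.

```r
library(sentopics)

# Create the model with the defaults from Usage (L1 = 5, L2 = 3).
# No Gibbs sampling has been run at this point.
model <- sentopicmodel(ECB_press_conferences_tokens)

# `it` tracks the number of Gibbs sampling iterations (see Value).
model$it

# Iterate the MCMC. The argument name `iterations` is an assumption
# of this sketch; check ?fit for the exact signature.
fitted_model <- fit(model, iterations = 20)
fitted_model$it

# Log-likelihood measured at each iteration (see Value).
fitted_model$logLikelihood

# Most probable words of each topic/sentiment.
topWords(fitted_model)
```

The object returned by sentopicmodel() is left untouched here and the fitted result is stored under a new name, so the zero-iteration state stays available for comparison.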
Author(s)
Olivier Delmarcelle
See Also
Fitting a model: fit(), extracting top words: topWords()
Other topic models: JST(), LDA(), rJST()
Examples
LDA(ECB_press_conferences_tokens)
rJST(ECB_press_conferences_tokens, lexicon = LoughranMcDonald)