semanticCoherenceSTS {sts}    R Documentation

Semantic Coherence

Description

Calculates the semantic coherence of each topic in an STS model.
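This page does not give the formula. As a point of reference only, and assuming the function follows the definition of Mimno et al. (2011) that semanticCoherence() in the stm package implements, the coherence of topic t with its M most probable words v_1, ..., v_M could be written as follows. Here D(v) is the number of documents containing word v and D(v, v') is the number of documents containing both v and v'.

```latex
C_t = \sum_{m=2}^{M} \sum_{l=1}^{m-1}
      \log \frac{D\left(v_m^{(t)}, v_l^{(t)}\right) + 1}{D\left(v_l^{(t)}\right)}
```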

Usage

semanticCoherenceSTS(beta, documents, vocab, M = 10)

Arguments

beta

the matrix of topic-word probabilities (beta), evaluated for a given document or at a given level of alpha

documents

the documents over which to calculate coherence

vocab

the vocabulary corresponding to the terms in the beta matrix

M

the number of top words to consider per topic

Value

a numeric vector containing semantic coherence for each topic
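To illustrate what the returned vector contains, here is a minimal base-R sketch of a per-topic coherence score. It is not the package's implementation. It assumes the co-document-frequency score of Mimno et al. (2011), a vocabulary-by-topic beta matrix as built in the Examples, and documents in the stm format (one two-row matrix per document, with vocabulary indices in row 1 and counts in row 2). The function name coherence_sketch and the toy data are hypothetical.

```r
# Toy corpus in the stm document format (an assumption of this sketch):
# row 1 = vocabulary indices, row 2 = counts.
documents <- list(
  rbind(c(1L, 2L, 3L), c(2L, 1L, 1L)),
  rbind(c(1L, 2L),     c(1L, 3L)),
  rbind(c(3L, 4L),     c(1L, 1L)),
  rbind(c(1L, 2L, 4L), c(1L, 1L, 2L))
)
vocab <- c("border", "immigr", "job", "tax")

# Hypothetical vocabulary-by-topic matrix; each column sums to 1.
beta <- cbind(c(0.5, 0.3, 0.1, 0.1),
              c(0.1, 0.1, 0.4, 0.4))

# Hypothetical helper, not part of the sts package.
coherence_sketch <- function(beta, documents, vocab, M = 10) {
  M <- min(M, length(vocab))
  # Binary document-by-word incidence matrix.
  present <- matrix(0, nrow = length(documents), ncol = length(vocab))
  for (d in seq_along(documents)) {
    present[d, documents[[d]][1, ]] <- 1
  }
  # co[i, j] = number of documents containing both word i and word j;
  # the diagonal holds each word's document frequency.
  co <- crossprod(present)
  vapply(seq_len(ncol(beta)), function(k) {
    # Indices of the M most probable words in topic k.
    top <- order(beta[, k], decreasing = TRUE)[seq_len(M)]
    score <- 0
    for (m in seq_len(M)[-1]) {
      for (l in seq_len(m - 1)) {
        score <- score + log((co[top[m], top[l]] + 1) / co[top[l], top[l]])
      }
    }
    score
  }, numeric(1))
}

scores <- coherence_sketch(beta, documents, vocab, M = 2)
```

The result has one entry per column of beta. Larger values mean that a topic's top words appear together in documents more often. The sketch assumes every top word occurs in at least one document; otherwise the denominator is zero.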

Examples

# An example using the Gadarian data from the stm package: from raw text to a
# fitted model, using textProcessor(), which relies on the tm package.
library("tm"); library("stm"); library("sts")
temp <- textProcessor(documents = gadarian$open.ended.response,
                      metadata = gadarian, verbose = FALSE)
out <- prepDocuments(temp$documents, temp$vocab, temp$meta, verbose = FALSE)
X <- model.matrix(~1 + out$meta$treatment + out$meta$pid_rep +
                    out$meta$treatment * out$meta$pid_rep)[, -1]
X_seed <- as.matrix(out$meta$treatment)
## low maximum number of iterations, just for testing
sts_estimate <- sts(X, X_seed, out, numTopics = 3, verbose = FALSE,
                    parallelize = FALSE, maxIter = 3, initialization = 'anchor')
# Topic-word weights evaluated at the column means of alpha[, 3:5]
full_beta_distn <- exp(sts_estimate$mv + sts_estimate$kappa$kappa_t +
  sts_estimate$kappa$kappa_s %*% diag(apply(sts_estimate$alpha[, 3:5], 2, mean)))
# Normalize each column so that its entries sum to 1
full_beta_distn <- t(apply(full_beta_distn, 1,
                           function(m) m / colSums(full_beta_distn)))
semanticCoherenceSTS(full_beta_distn, out$documents, out$vocab)

[Package sts version 1.0 Index]