tfidf {rscc} | R Documentation |
Computes the term frequency–inverse document frequency uses tha cosine of the angles between the documents as similarity measure. Since R source code is provided no stemming or stop words are applied.
tfidf(docs)
docs |
document object |
similarity matrix
files <- list.files(system.file("examples", package="rscc"), "*.R$", full.names = TRUE)
prgs <- sourcecode(files, basename=TRUE, silent=TRUE)
docs <- documents(prgs)
tfidf(docs)
# further steps
# m <- tfidf(docs)
# df <- matrix2dataframe(m)
# head(df, n=20)
# browse(prgs, df, n=5)