dtm_sample {udpipe} | R Documentation |
Sample the specified number of rows from the document-term matrix, either with or without replacement.
dtm_sample(dtm, size = nrow(dtm), replace = FALSE, prob = NULL)
dtm | a document term matrix of class dgCMatrix (which can be an object returned by
size | a positive number, the number of rows to sample
replace | should sampling be with replacement
prob | a vector of probability weights, one for each row of dtm
A document term matrix with as many rows as specified in size.
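The arguments mirror those of base R's sample.int(), so the row sampling described here can be illustrated with a minimal sketch. This is not udpipe's implementation: `sample_rows` and the `toy` matrix are hypothetical names introduced only for illustration, and the sketch assumes that sampling rows amounts to drawing row indices and subsetting the sparse matrix.

```r
library(Matrix)

# Hypothetical toy matrix standing in for a document-term matrix:
# 3 documents (rows) by 2 terms (columns), stored as a sparse dgCMatrix.
toy <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 1), x = c(2, 1, 1),
                    dims = c(3, 2),
                    dimnames = list(c("doc1", "doc2", "doc3"), c("aa", "bb")))

# Hypothetical helper (an assumption, not part of udpipe):
# draw row indices with sample.int(), then subset the matrix by those indices.
sample_rows <- function(m, size = nrow(m), replace = FALSE, prob = NULL) {
  idx <- sample.int(nrow(m), size = size, replace = replace, prob = prob)
  m[idx, , drop = FALSE]
}

set.seed(42)  # fix the random number generator so the draw is repeatable
# Ask for more rows than the matrix has, so replace = TRUE is needed;
# prob gives one weight per row, making doc3 unlikely to be drawn.
s <- sample_rows(toy, size = 5, replace = TRUE, prob = c(1, 1, 0.01))
dim(s)
```

Under these assumptions the result has `size` rows and the same columns as the input. With base R's sample.int(), asking for more rows than the matrix contains while `replace = FALSE` raises an error, which is why the larger sample uses `replace = TRUE`.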
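library(udpipe)

# Five toy documents; doc3 and doc5 are empty, doc2 contains "" and doc4 contains NA
x <- list(doc1 = c("aa", "bb", "cc", "aa", "b"),
          doc2 = c("bb", "bb", "dd", ""),
          doc3 = character(),
          doc4 = c("cc", NA),
          doc5 = character())
dtm <- document_term_matrix(x)

# Sample without replacement (the default); results vary between calls
dtm_sample(dtm, size = 2)
dtm_sample(dtm, size = 3)
dtm_sample(dtm, size = 2)

# Sample 8 rows with replacement
dtm_sample(dtm, size = 8, replace = TRUE)

# Weighted sampling: one probability weight per row of dtm
dtm_sample(dtm, size = 8, replace = TRUE, prob = c(1, 1, 0.01, 0.5, 0.01))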