rslp_doc {rslp} | R Documentation |
Apply the Stemming Algorithm for the Portuguese Language to a vector of documents. Words are extracted from each document using the regex "\b[:alpha:]\b".
rslp_doc(
docs,
steprules = readRDS(system.file("steprules.rds", package = "rslp"))
)
docs |
Character vector of documents |
steprules |
Stemming rules, as obtained from the function extract_rules. Only set this argument if you are certain about it. The default is the parsed version of the rules installed with the package. |
V. Orengo, C. Huyck, "A Stemming Algorithm for the Portuguese Language", String Processing and Information Retrieval, International Symposium on (SPIRE), 2001, pp. 0186, doi:10.1109/SPIRE.2001.10024
docs <- c("coma frutas pois elas fazem bem para.")
rslp_doc(docs)
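
The word-extraction step mentioned in the description can be illustrated with a rough base R sketch. It assumes the pattern is meant to match runs of alphabetic characters between word boundaries. The helper `extract_words` and the base R pattern are hypothetical and are not the package's implementation. Handling of accented characters depends on the regex engine and locale, so treat this as an illustration only.

```r
# Hypothetical helper, not part of rslp: split each document into words.
extract_words <- function(docs) {
  # Assumed pattern: one or more alphabetic characters between word boundaries.
  pattern <- "\\b[[:alpha:]]+\\b"
  # gregexpr() locates every match; regmatches() returns them,
  # giving one character vector of words per document.
  regmatches(docs, gregexpr(pattern, docs, perl = TRUE))
}

docs <- c("coma frutas pois elas fazem bem para.")
words <- extract_words(docs)
print(words[[1]])
```

Each element of `words` holds the words of the corresponding document with punctuation dropped. In `rslp_doc`, the stemming rules are then applied to words extracted in this way.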