token_morph {RmecabKo} | R Documentation |
These tokenizer functions split text into all morphemes, selected morphemes, or nouns only.
token_morph(phrase, strip_punct = FALSE, strip_numeric = FALSE)
token_words(phrase, strip_punct = FALSE, strip_numeric = FALSE)
token_nouns(phrase, strip_punct = FALSE, strip_numeric = FALSE)
phrase |
A character vector or a list of character vectors to be tokenized into morphemes.
If |
strip_punct |
Logical. Set to TRUE to remove punctuation from the phrase. |
strip_numeric |
Logical. Set to TRUE to remove numbers from the phrase. |
A list of character vectors containing the tokens, with one element in the list.
See the examples on GitHub.
## Not run:
txt <- "안녕하세요. 한국어 문장입니다."  # Any Korean sentence; this one is only a placeholder
token_morph(txt)
token_words(txt, strip_punct = FALSE)
token_nouns(txt, strip_numeric = TRUE)
## End(Not run)
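Because the return value is described as a list of character vectors, base R is enough to inspect it. The following is a minimal sketch that does not call RmecabKo at all: `tokens` is a hand-written stand-in with made-up values, assumed only to have the same shape as a tokenizer result.

```r
# Stand-in for a tokenizer result: a list of character vectors.
# These values are invented for illustration; they are NOT real RmecabKo output.
tokens <- list(
  c("tok_a", "tok_b", "tok_a"),
  c("tok_c", "tok_b")
)

# Number of tokens in each list element
n_tokens <- lengths(tokens)

# Flatten every element into one character vector and count occurrences
all_tokens <- unlist(tokens)
freq <- sort(table(all_tokens), decreasing = TRUE)
```

The same calls should apply to an actual result from `token_morph()`, `token_words()`, or `token_nouns()`, provided it has the list-of-character-vectors shape stated above.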