.mp_tokenize_word_lookup {morphemepiece}    R Documentation

Tokenize a Word Including Lookup

Description

Look up a word in the lookup table; if the word is not found there, tokenize it with the fall-back algorithm instead.

Usage

.mp_tokenize_word_lookup(word, vocab, lookup, unk_token, max_chars)

Arguments

word

Character scalar; word to tokenize.

vocab

A morphemepiece vocabulary.

lookup

A morphemepiece lookup table.

unk_token

Token to represent unknown words.

max_chars

Maximum length (in characters) of a word that will be recognized.

Value

Input word, broken into tokens.
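
Examples

The lookup-then-fall-back behaviour described above can be illustrated with a minimal base R sketch. This is not the package's implementation: toy_lookup, toy_fallback, and sketch_tokenize_word_lookup are hypothetical names, the lookup format (a named character vector of space-separated breakdowns) and the "##" token notation are assumptions, the fall-back is a trivial stand-in, and the vocab argument is left out for brevity.

```r
# Toy lookup table (assumed format): names are words, values are
# space-separated token breakdowns.
toy_lookup <- c(cats = "cat ##s", unhappiness = "un## happy ##ness")

# Hypothetical stand-in for the fall-back algorithm: words longer than
# max_chars become the unknown token; other words are split into
# single characters.
toy_fallback <- function(word, unk_token, max_chars) {
  if (nchar(word) > max_chars) {
    return(unk_token)
  }
  strsplit(word, "")[[1]]
}

# Sketch of the control flow: use the stored breakdown when the word is
# in the table, otherwise defer to the fall-back.
sketch_tokenize_word_lookup <- function(word, lookup, unk_token, max_chars,
                                        fallback = toy_fallback) {
  if (word %in% names(lookup)) {
    return(strsplit(lookup[[word]], " ", fixed = TRUE)[[1]])
  }
  fallback(word, unk_token, max_chars)
}

in_table  <- sketch_tokenize_word_lookup("cats", toy_lookup, "[UNK]", 100)
not_found <- sketch_tokenize_word_lookup("dog", toy_lookup, "[UNK]", 100)
too_long  <- sketch_tokenize_word_lookup("dog", toy_lookup, "[UNK]", 2)
```

In this sketch, in_table holds the stored breakdown ("cat", "##s"), not_found holds the stand-in fall-back result ("d", "o", "g"), and too_long holds only the unknown token because the word exceeds max_chars.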


[Package morphemepiece version 1.2.3 Index]