Class TokenInfoDictionaryBuilder
java.lang.Object
org.apache.lucene.analysis.ja.dict.TokenInfoDictionaryBuilder
-
Field Summary
Fields
Modifier and Type    Field    Description
private final String    encoding
private final DictionaryBuilder.DictionaryFormat    format
private final Normalizer.Form    normalForm
private int    offset    Internal word id, incrementally assigned as entries are read and added.
-
Constructor Summary
Constructors
Constructor    Description
TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, String encoding, boolean normalizeEntries)
-
Method Summary
Modifier and Type    Method    Description
private TokenInfoDictionaryWriter    buildDictionary(List<Path> csvFiles)
private String[]    formatEntry(String[] features)
-
Field Details
-
encoding
-
normalForm
-
format
-
offset
private int offset
Internal word id, incrementally assigned as entries are read and added. This will be the byte offset in the dictionary file.
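The idea of an id that doubles as a byte offset can be illustrated with a minimal sketch. It assumes each entry is serialized into a byte buffer and that an entry's id is the buffer position at which it starts. The class `OffsetSketch`, its `put` method, and the one-byte length-prefixed layout are hypothetical and do not reflect Lucene's actual binary format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not Lucene's actual implementation.
final class OffsetSketch {
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

  // Next word id; equals the number of bytes written so far.
  private int offset = 0;

  // Appends one entry and returns its word id (the byte offset where it starts).
  // Assumption: entries are shorter than 256 bytes, so one length byte suffices.
  int put(String entry) {
    byte[] bytes = entry.getBytes(StandardCharsets.UTF_8);
    if (bytes.length > 255) {
      throw new IllegalArgumentException("entry too long for this sketch");
    }
    int wordId = offset;
    buffer.write(bytes.length);           // one-byte length prefix
    buffer.write(bytes, 0, bytes.length); // entry payload
    offset += 1 + bytes.length;           // advance by the bytes just written
    return wordId;
  }

  public static void main(String[] args) {
    OffsetSketch sketch = new OffsetSketch();
    System.out.println(sketch.put("ab"));
    System.out.println(sketch.put("c"));
  }
}
```

Under this scheme an id can be used directly to seek to the entry, at the cost of ids not being consecutive integers.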
-
Constructor Details
-
TokenInfoDictionaryBuilder
public TokenInfoDictionaryBuilder(DictionaryBuilder.DictionaryFormat format, String encoding, boolean normalizeEntries)
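The constructor takes a `normalizeEntries` flag, and the class holds a `Normalizer.Form` field named `normalForm`; this page does not state how the two relate. A minimal sketch, assuming the flag selects NFKC normalization and leaves entries untouched otherwise. The class `NormalizationSketch` and its methods are hypothetical and are not part of Lucene.

```java
import java.text.Normalizer;

// Hypothetical illustration; not Lucene's actual implementation.
final class NormalizationSketch {
  // Assumption: true selects NFKC, false disables normalization (null form).
  static Normalizer.Form chooseForm(boolean normalizeEntries) {
    return normalizeEntries ? Normalizer.Form.NFKC : null;
  }

  // Normalizes one dictionary line if a form is set; otherwise returns it unchanged.
  static String normalizeLine(String line, Normalizer.Form form) {
    return form == null ? line : Normalizer.normalize(line, form);
  }

  public static void main(String[] args) {
    // Full-width Latin letters A, B, C written as Unicode escapes.
    String fullWidth = "\uFF21\uFF22\uFF23";
    System.out.println(normalizeLine(fullWidth, chooseForm(true)));
    System.out.println(normalizeLine(fullWidth, chooseForm(false)));
  }
}
```

NFKC folds compatibility variants such as full-width letters into their canonical counterparts, which is one reason a dictionary builder might offer normalization as an option.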
-
Method Details
-
build
- Throws:
IOException
-
buildDictionary
- Throws:
IOException
-
formatEntry
-