Package | Description |
---|---|
com.atilika.kuromoji | |
com.atilika.kuromoji.ipadic | |
com.atilika.kuromoji.viterbi | |
Modifier and Type | Field and Description |
---|---|
protected TokenizerBase.Mode |
TokenizerBase.Builder.mode |
Modifier and Type | Method and Description |
---|---|
static TokenizerBase.Mode |
TokenizerBase.Mode.valueOf(String name)
Returns the enum constant of this type with the specified name.
|
static TokenizerBase.Mode[] |
TokenizerBase.Mode.values()
Returns an array containing the constants of this enum type, in
the order they are declared.
|
Modifier and Type | Method and Description |
---|---|
Tokenizer.Builder |
Tokenizer.Builder.mode(TokenizerBase.Mode mode)
Sets the tokenization mode.
The tokenization mode defines how the input text is segmented into tokens. Available modes are as follows:
NORMAL - The default mode
SEARCH - Uses a heuristic to segment compound nouns (複合名詞) into their parts
EXTENDED - Same as SEARCH, but emits unigram tokens for unknown terms
See Tokenizer.Builder.kanjiPenalty and Tokenizer.Builder.otherPenalty for how to adjust the costs used by the SEARCH and EXTENDED modes. |
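A mode name often arrives as a string, for example from a configuration file. The sketch below shows one way to turn such a string into a mode constant with `valueOf`, falling back to `NORMAL` when the name is missing or unknown. To keep the snippet compilable without Kuromoji on the classpath, it declares a local stand-in enum with the three constant names listed above; `ModeParsing` and `parseMode` are hypothetical names used only for illustration. In real code the parsed value would presumably be a `TokenizerBase.Mode` passed to `Tokenizer.Builder.mode(...)`.

```java
import java.util.Locale;

public class ModeParsing {
    // Stand-in for com.atilika.kuromoji.TokenizerBase.Mode (assumption: same
    // constant names), declared locally so this sketch has no dependencies.
    enum Mode { NORMAL, SEARCH, EXTENDED }

    // Hypothetical helper: parse a configuration value, defaulting to NORMAL.
    static Mode parseMode(String raw) {
        if (raw == null) {
            return Mode.NORMAL;
        }
        try {
            // valueOf is case-sensitive and throws IllegalArgumentException
            // for names that do not match a declared constant.
            return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Mode.NORMAL;
        }
    }

    public static void main(String[] args) {
        // values() returns the constants in declaration order.
        for (Mode m : Mode.values()) {
            System.out.println(m);
        }
        System.out.println(parseMode("search"));
        System.out.println(parseMode("not-a-mode"));
    }
}
```

Falling back silently is a design choice made for this sketch; a stricter application might prefer to let the `IllegalArgumentException` propagate so that a mistyped mode name is noticed early.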
Constructor and Description |
---|
ViterbiBuilder(DoubleArrayTrie trie,
TokenInfoDictionary dictionary,
UnknownDictionary unknownDictionary,
UserDictionary userDictionary,
TokenizerBase.Mode mode)
Constructor
|
ViterbiSearcher(TokenizerBase.Mode mode,
ConnectionCosts costs,
UnknownDictionary unknownDictionary,
List<Integer> penalties) |
Copyright © 2020. All rights reserved.