Modifier and Type | Class and Description |
---|---|
static class | TokenizerBase.Builder: Abstract Builder shared by all tokenizers. |
static class | TokenizerBase.Mode |

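
To illustrate how an abstract builder shared by several tokenizers can pair with a configure(builder) hook, here is a minimal, self-contained sketch. It is not the library's source: SketchTokenizerBase, SketchTokenizer, the delimiterRegex option, and the build() method are hypothetical stand-ins chosen for illustration, and the whitespace splitting stands in for real tokenization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tokenizer base class; not the library's source.
abstract class SketchTokenizerBase {
    // Hypothetical setting, copied from the builder in configure().
    protected String delimiterRegex;

    // Abstract builder shared by every concrete tokenizer in this sketch.
    abstract static class Builder {
        protected String delimiterRegex = "\\s+"; // assumed default

        Builder delimiterRegex(String regex) {
            this.delimiterRegex = regex;
            return this;
        }

        // Each concrete tokenizer supplies its own build().
        abstract SketchTokenizerBase build();
    }

    // Mirrors the shape of configure(TokenizerBase.Builder builder).
    protected void configure(Builder builder) {
        this.delimiterRegex = builder.delimiterRegex;
    }

    // Splits on the configured delimiter and drops empty pieces.
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split(delimiterRegex)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }
}

final class SketchTokenizer extends SketchTokenizerBase {
    // Concrete builder: creates the tokenizer, then hands itself to configure().
    static final class Builder extends SketchTokenizerBase.Builder {
        @Override
        SketchTokenizer build() {
            SketchTokenizer tokenizer = new SketchTokenizer();
            tokenizer.configure(this);
            return tokenizer;
        }
    }
}

class BuilderSketchDemo {
    public static void main(String[] args) {
        SketchTokenizerBase tokenizer =
                new SketchTokenizer.Builder().delimiterRegex(",").build();
        System.out.println(tokenizer.tokenize("a,b,,c")); // prints [a, b, c]
    }
}
```

Keeping the settings on the builder and copying them in one configure call means a concrete tokenizer only has to implement build(); the base class owns the shared state.
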
Modifier and Type | Field and Description |
---|---|
protected EnumMap<ViterbiNode.Type,Dictionary> | dictionaryMap |
protected TokenFactory | tokenFactory |

Constructor and Description |
---|
TokenizerBase() |

Modifier and Type | Method and Description |
---|---|
protected void | configure(TokenizerBase.Builder builder) |
protected <T extends TokenBase> List<T> | createTokenList(String text): Tokenizes the provided text and returns a list of tokens with various feature information. This method is thread safe. |
void | debugLattice(OutputStream outputStream, String text): Writes the Viterbi lattice for the provided text to an output stream. The output is written in DOT format. |
void | debugTokenize(OutputStream outputStream, String text): Tokenizes the provided text and outputs the corresponding Viterbi lattice and the Viterbi path to the provided output stream. The output is written in DOT format. |
List<? extends TokenBase> | tokenize(String text) |

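
This page marks createTokenList as thread safe and the two debug methods as not thread safe, with debug output written in DOT format to an OutputStream. The sketch below shows one way a caller might respect that split: share one instance for tokenization, but serialize debug calls and capture their output in memory. StandInTokenizer is a hypothetical stand-in that only copies the method shapes from this page; it splits on whitespace and emits a placeholder DOT graph, not a real Viterbi lattice. TokenizerUsageSketch, DEBUG_LOCK, and both helper methods are also assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in with the same method shapes as this page.
class StandInTokenizer {
    // Stateless, so it is safe to call from several threads at once.
    List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Writes a placeholder DOT graph that chains the tokens in order.
    void debugLattice(OutputStream outputStream, String text) throws IOException {
        StringBuilder dot = new StringBuilder("digraph lattice {\n");
        String previous = "BOS";
        for (String token : tokenize(text)) {
            dot.append("  \"").append(previous).append("\" -> \"")
               .append(token).append("\";\n");
            previous = token;
        }
        dot.append("}\n");
        outputStream.write(dot.toString().getBytes(StandardCharsets.UTF_8));
    }
}

class TokenizerUsageSketch {
    private static final Object DEBUG_LOCK = new Object();

    // Shares one tokenizer across the threads of a parallel stream
    // and returns the total number of tokens.
    static int countTokensConcurrently(StandInTokenizer tokenizer, List<String> texts) {
        return texts.parallelStream()
                .mapToInt(text -> tokenizer.tokenize(text).size())
                .sum();
    }

    // Serializes debug calls, since the page marks them as not thread safe,
    // and returns the captured DOT text.
    static String latticeAsDot(StandInTokenizer tokenizer, String text) {
        synchronized (DEBUG_LOCK) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try {
                tokenizer.debugLattice(buffer, text);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        StandInTokenizer tokenizer = new StandInTokenizer();
        System.out.println(
                countTokensConcurrently(tokenizer, Arrays.asList("a b", "c d e"))); // prints 5
        System.out.print(latticeAsDot(tokenizer, "a b"));
    }
}
```

The captured DOT text can be written to a file and rendered with Graphviz. With a real tokenizer the same calling pattern should apply, though the lock is only needed if debug calls can actually overlap.
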

protected TokenFactory tokenFactory

protected EnumMap<ViterbiNode.Type,Dictionary> dictionaryMap

protected void configure(TokenizerBase.Builder builder)

protected <T extends TokenBase> List<T> createTokenList(String text)
Tokenizes the provided text and returns a list of tokens with various feature information.
This method is thread safe.
Type Parameters:
T - token type
Parameters:
text - text to tokenize

public void debugTokenize(OutputStream outputStream, String text) throws IOException
Tokenizes the provided text and outputs the corresponding Viterbi lattice and the Viterbi path to the provided output stream.
The output is written in DOT format.
This method is not thread safe.
Parameters:
outputStream - output stream to write to
text - text to tokenize
Throws:
IOException - if an error occurs when writing the lattice and path

public void debugLattice(OutputStream outputStream, String text) throws IOException
Writes the Viterbi lattice for the provided text to an output stream.
The output is written in DOT format.
This method is not thread safe.
Parameters:
outputStream - output stream to write to
text - text to create lattice for
Throws:
IOException - if an error occurs when writing the lattice

Copyright © 2020. All rights reserved.