Package andyr.jtokeniser

Class Summary
BreakIteratorTokeniser The BreakIteratorTokeniser class uses a BreakIterator to find each word instance according to a specified locale.
RegexSeparatorTokeniser The RegexSeparatorTokeniser class uses regular expressions to define the separation between tokens.
RegexTokeniser The RegexTokeniser class uses regular expressions to define a word, and tokenises according to that expression.
SentenceTokeniser The SentenceTokeniser class uses a BreakIterator to find each sentence instance according to a specified locale.
StringTokeniser The StringTokeniser class uses a standard StringTokenizer to find each word in the input.
Tokeniser Tokeniser is an abstract base class for a variety of Tokenisers.
WhiteSpaceTokeniser The WhiteSpaceTokeniser class is a basic tokeniser that uses whitespace to separate tokens from the input string.
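
The summaries above name the JDK mechanism behind each tokeniser but do not show the package's own constructors or methods. The sketch below therefore illustrates only the underlying standard-library techniques (java.text.BreakIterator, java.util.regex and java.util.StringTokenizer). The class TokeniserSketch and all of its method names are hypothetical and are not part of andyr.jtokeniser; the real classes may filter or trim tokens differently.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; NOT the andyr.jtokeniser API.
final class TokeniserSketch {

    // BreakIterator (word instance): walk the locale's word boundaries and
    // keep only spans containing a letter or digit (an assumed filter).
    static List<String> breakIteratorWords(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String span = text.substring(start, end);
            if (span.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(span);
            }
        }
        return tokens;
    }

    // BreakIterator (sentence instance): one token per sentence, trimmed.
    static List<String> breakIteratorSentences(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                tokens.add(sentence);
            }
        }
        return tokens;
    }

    // Regex as separator: the pattern describes what lies BETWEEN tokens.
    static List<String> regexSeparator(String text, String separatorRegex) {
        List<String> tokens = new ArrayList<>();
        for (String piece : Pattern.compile(separatorRegex).split(text)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Regex as word definition: the pattern describes the tokens themselves.
    static List<String> regexWords(String text, String wordRegex) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(wordRegex).matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // java.util.StringTokenizer with its default delimiters (space, tab, newline, CR, form feed).
    static List<String> stringTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Whitespace splitting: any run of whitespace separates two tokens.
    static List<String> whitespace(String text) {
        return regexSeparator(text, "\\s+");
    }

    public static void main(String[] args) {
        String text = "Hello, world! This is a test.";
        System.out.println(breakIteratorWords(text, Locale.ENGLISH));
        System.out.println(breakIteratorSentences(text, Locale.ENGLISH));
        System.out.println(regexSeparator("a,b;c", "[,;]"));
        System.out.println(regexWords(text, "\\w+"));
        System.out.println(stringTokenizer(text));
        System.out.println(whitespace(text));
    }
}
```

The two regex methods show the difference between RegexSeparatorTokeniser and RegexTokeniser as described in the summary: the first treats the pattern as the gap between tokens, the second treats it as the token itself. Note also that the whitespace and StringTokenizer approaches leave punctuation attached to words ("Hello," rather than "Hello"), whereas the BreakIterator approach separates it.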