org.exist.storage.analysis
Class SimpleTokenizer
java.lang.Object
org.exist.storage.analysis.SimpleTokenizer
- All Implemented Interfaces:
- Tokenizer
- public class SimpleTokenizer
- extends java.lang.Object
- implements Tokenizer
The default tokenizer used by the full-text indexer to split
a string into words. The known token types are defined
by class Token.
- Author:
- Wolfgang Meier
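The call pattern implied by the public methods listed on this page (load text with setText, then pull tokens with nextToken until the input is exhausted) can be sketched as follows. This is a minimal, self-contained stand-in and not the eXist implementation: the class name TokenizerSketch is hypothetical, tokens are plain String values rather than TextToken objects, and returning null at end of input is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the eXist class: it only mirrors the
// setText(...) / nextToken() call pattern listed for SimpleTokenizer.
class TokenizerSketch {
    private CharSequence text = "";
    private int pos = 0;

    // Load new input and rewind to the start.
    void setText(CharSequence text) {
        this.text = text;
        this.pos = 0;
    }

    // Return the next run of letters or digits.
    // Assumption: null signals the end of the input.
    String nextToken() {
        // Skip everything that is neither a letter nor a digit.
        while (pos < text.length() && !Character.isLetterOrDigit(text.charAt(pos))) {
            pos++;
        }
        if (pos >= text.length()) {
            return null;
        }
        int start = pos;
        while (pos < text.length() && Character.isLetterOrDigit(text.charAt(pos))) {
            pos++;
        }
        return text.subSequence(start, pos).toString();
    }

    // Convenience helper: drain all tokens from one string.
    static List<String> tokenize(String input) {
        TokenizerSketch tokenizer = new TokenizerSketch();
        tokenizer.setText(input);
        List<String> words = new ArrayList<>();
        for (String token = tokenizer.nextToken(); token != null; token = tokenizer.nextToken()) {
            words.add(token);
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [The, quick, brown, fox, aged, 3, jumps]
        System.out.println(tokenize("The quick, brown fox (aged 3) jumps."));
    }
}
```

With the real class, the same loop would presumably call setText and nextToken on a SimpleTokenizer instance and inspect the returned TextToken objects. The end-of-input convention should be checked against the eXist source before relying on it.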
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SimpleTokenizer
public SimpleTokenizer()
SimpleTokenizer
public SimpleTokenizer(boolean stem)
setStemming
public void setStemming(boolean stem)
- Specified by:
setStemming
in interface Tokenizer
alpha
protected TextToken alpha(TextToken token,
boolean allowWildcards)
alphanum
protected TextToken alphanum(TextToken token,
boolean allowWildcards)
consume
protected void consume()
eof
protected TextToken eof()
getLength
public int getLength()
getText
public java.lang.String getText()
nextTerminalToken
protected TextToken nextTerminalToken(boolean wildcards)
nextToken
public TextToken nextToken()
- Specified by:
nextToken
in interface Tokenizer
nextToken
public TextToken nextToken(boolean wildcards)
- Specified by:
nextToken
in interface Tokenizer
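This page does not say what the wildcards argument does. One plausible reading, assumed here purely for illustration, is that it lets wildcard characters such as * and ? remain part of a word token instead of acting as separators. The sketch below shows that idea with a hypothetical class WildcardSketch; it is not taken from the eXist source, and the choice of wildcard characters is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, NOT the eXist implementation.
class WildcardSketch {
    // Assumption: with wildcards enabled, '*' and '?' count as word characters.
    static boolean isWordChar(char c, boolean wildcards) {
        return Character.isLetterOrDigit(c) || (wildcards && (c == '*' || c == '?'));
    }

    // Split the text into runs of word characters.
    static List<String> tokenize(CharSequence text, boolean wildcards) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            if (!isWordChar(text.charAt(pos), wildcards)) {
                pos++; // separator: skip it
                continue;
            }
            int start = pos;
            while (pos < text.length() && isWordChar(text.charAt(pos), wildcards)) {
                pos++;
            }
            tokens.add(text.subSequence(start, pos).toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("index* te?t", false)); // [index, te, t]
        System.out.println(tokenize("index* te?t", true));  // [index*, te?t]
    }
}
```

Under this reading, indexing would use the form without wildcards, while query parsing would enable them so that a pattern such as index* survives as a single token.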
number
protected TextToken number()
p
protected TextToken p()
setText
public void setText(java.lang.CharSequence text)
- Specified by:
setText
in interface Tokenizer
whitespace
protected TextToken whitespace()
Copyright (C) Wolfgang Meier. All rights reserved.