org.gcube.indexmanagement.common.linguistics.lemmatizerplugin
Interface LemmatizerPlugin

All Known Implementing Classes:
DummyLemmatizerPlugin, SnowballStemmingPlugin

public interface LemmatizerPlugin

The interface that all lemmatizer plugins implement.


Method Summary
 void add_language(Language language)
          Adds a language to the lemmatizer.
 void init(String configFile, Vector<Language> languages)
          Initialises this plugin; the configuration file is optional.
 String lemmatize_string(String stringToLemmatize, Language language)
          Lemmatizes a string of one or more space-separated words.
 String lemmatize_word(String wordToLemmatize, Language language)
          Lemmatizes a single word.
 

Method Detail

init

void init(String configFile,
          Vector<Language> languages)
          throws IndexException
Initialises this plugin; the configuration file is optional.

Parameters:
configFile - The configuration file (optional).
languages - The vector of languages that shall be supported
Throws:
IndexException - In case of failure

lemmatize_word

String lemmatize_word(String wordToLemmatize,
                      Language language)
                      throws IndexException
Lemmatizes a single word.

Parameters:
wordToLemmatize - The word to lemmatize.
language - The language to use when getting the lemmatized forms.
Returns:
The string with the lemmatized forms of the word. Forms are separated by '!', for example: house!houses
Throws:
IndexException - In case of failure
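
The '!'-separated return value can be split back into individual forms. The following is a minimal sketch only: the class and method names (WordFormsParser, parseForms) are hypothetical helpers and not part of the plugin API, and treating a null or empty result as "no forms" is an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the plugin API.
class WordFormsParser {

    // Splits a lemmatize_word result such as "house!houses" into its forms.
    // Assumption: a null or empty result means that no forms were returned.
    static List<String> parseForms(String lemmatized) {
        List<String> forms = new ArrayList<>();
        if (lemmatized == null || lemmatized.isEmpty()) {
            return forms;
        }
        for (String form : lemmatized.split("!")) {
            // Skip empty pieces that a leading or doubled '!' would produce.
            if (!form.isEmpty()) {
                forms.add(form);
            }
        }
        return forms;
    }
}
```

For the documented example, parseForms("house!houses") yields a list containing "house" and "houses".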

lemmatize_string

String lemmatize_string(String stringToLemmatize,
                        Language language)
                        throws IndexException
Lemmatizes a string of one or more space-separated words.

Parameters:
stringToLemmatize - The string to lemmatize. The string can be a single word, or several words separated by spaces.
language - The language to use when getting the lemmatized forms.
Returns:
The string with the lemmatized forms of each word. Forms of the same word are separated by '!', and '#' is used as a separator between words (each word's group of forms ends with '#'), for example: house!houses#knee!knees#
Throws:
IndexException - In case of failure
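
The two-level format ('#' between words, '!' between forms of one word) can be decoded into a list of lists. This is a sketch under assumptions: StringFormsParser and parseWords are hypothetical names, not part of the plugin API, and empty segments (such as the one after the trailing '#') are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the plugin API.
class StringFormsParser {

    // Decodes a lemmatize_string result such as "house!houses#knee!knees#"
    // into one list of forms per word, in the order the words appear.
    static List<List<String>> parseWords(String lemmatized) {
        List<List<String>> words = new ArrayList<>();
        if (lemmatized == null) {
            return words;
        }
        // String.split drops trailing empty strings, so the final '#'
        // does not produce an extra entry.
        for (String word : lemmatized.split("#")) {
            if (word.isEmpty()) {
                continue; // assumption: empty segments carry no information
            }
            words.add(Arrays.asList(word.split("!")));
        }
        return words;
    }
}
```

For the documented example, parseWords("house!houses#knee!knees#") yields two entries: the forms of "house" followed by the forms of "knee".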

add_language

void add_language(Language language)
                  throws IndexException
Adds a language to the lemmatizer.

Parameters:
language - The language to add to the lemmatizer
Throws:
IndexException - In case of failure
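
Taken together, the four methods suggest a lifecycle of init, optional add_language calls, and then lemmatization. The sketch below illustrates that lifecycle with a toy plugin that returns every word unchanged. It is not one of the known implementing classes. The Language and IndexException types shown here are simplified stand-ins declared only so that the example compiles on its own; the real gCube classes may have different constructors and fields. The interface is redeclared from the signatures on this page, and rejecting unregistered languages is an assumption.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.Vector;

// Stand-in for the real Language type; the String code field is hypothetical.
class Language {
    final String code;
    Language(String code) { this.code = code; }
}

// Stand-in for the real IndexException type.
class IndexException extends Exception {
    IndexException(String message) { super(message); }
}

// Redeclared from the method signatures documented on this page.
interface LemmatizerPlugin {
    void init(String configFile, Vector<Language> languages) throws IndexException;
    String lemmatize_word(String wordToLemmatize, Language language) throws IndexException;
    String lemmatize_string(String stringToLemmatize, Language language) throws IndexException;
    void add_language(Language language) throws IndexException;
}

// Toy plugin: every word is returned unchanged as its only form.
class IdentityLemmatizer implements LemmatizerPlugin {
    private final Set<String> supported = new HashSet<>();

    @Override
    public void init(String configFile, Vector<Language> languages) throws IndexException {
        // The configuration file is optional, so this sketch ignores it.
        for (Language language : languages) {
            add_language(language);
        }
    }

    @Override
    public void add_language(Language language) throws IndexException {
        if (language == null) {
            throw new IndexException("language must not be null");
        }
        supported.add(language.code);
    }

    @Override
    public String lemmatize_word(String wordToLemmatize, Language language) throws IndexException {
        requireSupported(language);
        return wordToLemmatize;
    }

    @Override
    public String lemmatize_string(String stringToLemmatize, Language language) throws IndexException {
        requireSupported(language);
        StringBuilder out = new StringBuilder();
        for (String word : stringToLemmatize.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Each word's forms are followed by '#', as in the documented format.
            out.append(lemmatize_word(word, language)).append('#');
        }
        return out.toString();
    }

    // Assumption: languages that were never registered are rejected.
    private void requireSupported(Language language) throws IndexException {
        if (language == null || !supported.contains(language.code)) {
            throw new IndexException("unsupported language");
        }
    }
}
```

With this toy plugin, initialising with "en" and then calling lemmatize_string("house knee", ...) for that language returns house#knee#, since each word has only itself as a form.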


Copyright © 2013. All Rights Reserved.