org.gcube.indexmanagement.common.linguistics.lemmatizerplugin
Class SnowballStemmingPlugin

java.lang.Object
  extended by org.gcube.indexmanagement.common.linguistics.lemmatizerplugin.SnowballStemmingPlugin
All Implemented Interfaces:
LemmatizerPlugin

public class SnowballStemmingPlugin
extends Object
implements LemmatizerPlugin

A LemmatizerPlugin implementation that lemmatizes words and documents with per-language stemmers. Call the init method once to initialize the lemmatizers for the supported languages. To add a language after init has been called, use the add_language method.


Constructor Summary
SnowballStemmingPlugin()
          Constructor, creates the lemmatizerMap.
 
Method Summary
 void add_language(Language language)
          Add a lemmatizer for the language
 void init(String configFile, Vector<Language> languages)
          Initialises the lemmatizers for the given languages
 String lemmatize_string(String document, Language language)
          Lemmatizes the document in the given language
 String lemmatize_word(String word, Language language)
          Lemmatizes a single word in the given language
static void main(String[] args)
          Test main method that exercises the loading of the lemmatizer plugin and the functions in the class
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SnowballStemmingPlugin

public SnowballStemmingPlugin()
Constructor, creates the lemmatizerMap.

Method Detail

init

public void init(String configFile,
                 Vector<Language> languages)
          throws IndexException
Initialises the lemmatizers for the given languages.

Specified by:
init in interface LemmatizerPlugin
Parameters:
configFile - The configuration file needed by the lemmatizer
languages - The languages that shall be supported by the lemmatizer
Throws:
IndexException - when one or more lemmatizers cannot be created. If init is called with a vector of languages and one or several of the lemmatizers cannot be created, init throws an IndexException listing the languages that failed, like this: IndexException("en","fr","it")
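
The failure reporting described for init can be pictured with a small stand-in. This is only a sketch of the "try every language, then report all failures in one exception" pattern. The class InitAggregationSketch, the exception SketchIndexException, the create factory and the use of plain language codes are all hypothetical and are not the gCube classes. Whether the real plugin keeps the lemmatizers that were created successfully is also an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stand-in sketch; none of these types are the real gCube classes.
class InitAggregationSketch {

    // Hypothetical stand-in for IndexException.
    static class SketchIndexException extends Exception {
        SketchIndexException(String message) {
            super(message);
        }
    }

    // Language code -> lemmatizer, loosely modelled on the "lemmatizerMap" the constructor creates.
    private final Map<String, Function<String, String>> lemmatizerMap = new HashMap<>();

    // Hypothetical factory: pretends that only "en" and "fr" can be created.
    private Function<String, String> create(String code) {
        if (code.equals("en") || code.equals("fr")) {
            return word -> word.toLowerCase(); // placeholder, not a real stemmer
        }
        throw new IllegalArgumentException("unsupported language: " + code);
    }

    // Tries every language, remembers the ones that fail, and reports them together.
    void init(List<String> languageCodes) throws SketchIndexException {
        List<String> failed = new ArrayList<>();
        for (String code : languageCodes) {
            try {
                lemmatizerMap.put(code, create(code));
            } catch (IllegalArgumentException e) {
                failed.add("\"" + code + "\"");
            }
        }
        if (!failed.isEmpty()) {
            // Assumption: successfully created lemmatizers stay registered.
            throw new SketchIndexException(String.join(",", failed));
        }
    }

    boolean supports(String code) {
        return lemmatizerMap.containsKey(code);
    }

    public static void main(String[] args) {
        InitAggregationSketch plugin = new InitAggregationSketch();
        try {
            plugin.init(Arrays.asList("en", "xx", "yy"));
        } catch (SketchIndexException e) {
            System.out.println("failed: " + e.getMessage()); // failed: "xx","yy"
        }
    }
}
```

A caller of the real plugin would presumably catch IndexException in the same place and decide whether to continue with the languages that did load.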

lemmatize_word

public String lemmatize_word(String word,
                             Language language)
                      throws IndexException
Lemmatizes a single word in the given language.

Specified by:
lemmatize_word in interface LemmatizerPlugin
Parameters:
word - The word to lemmatize
language - The language for the lemmatizer
Returns:
The lemmatized word, with each form separated by !, for example a!b!c
Throws:
IndexException - in case of a failure

lemmatize_string

public String lemmatize_string(String document,
                               Language language)
                        throws IndexException
Lemmatizes the document in the given language.

Specified by:
lemmatize_string in interface LemmatizerPlugin
Parameters:
document - The document to lemmatize
language - The language for the lemmatizer
Returns:
The lemmatized document, for example a!b!c#d!e!f#g
Throws:
IndexException - in case of a failure
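
The return format shown above can be split back into words and forms with plain string handling. The following is a minimal sketch and not part of the plugin. The helper class LemmaOutputParser is hypothetical, and it assumes that # separates words and ! separates the forms of one word, which is inferred from the examples a!b!c and a!b!c#d!e!f#g rather than stated explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the gCube API.
class LemmaOutputParser {

    // Assumption: '#' separates words and '!' separates the forms of one word.
    static List<List<String>> parse(String lemmatized) {
        List<List<String>> words = new ArrayList<>();
        if (lemmatized == null || lemmatized.isEmpty()) {
            return words;
        }
        for (String word : lemmatized.split("#")) {
            words.add(Arrays.asList(word.split("!")));
        }
        return words;
    }

    public static void main(String[] args) {
        // Uses the example string from the Returns description.
        System.out.println(parse("a!b!c#d!e!f#g")); // [[a, b, c], [d, e, f], [g]]
    }
}
```

If a word or form could itself contain ! or #, this simple split would not be enough; the documentation above does not say how such characters are handled.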

add_language

public void add_language(Language language)
                  throws IndexException
Add a lemmatizer for the language

Specified by:
add_language in interface LemmatizerPlugin
Parameters:
language - The language for the lemmatizer
Throws:
IndexException - in case of a failure

main

public static void main(String[] args)
Test main method that exercises the loading of the lemmatizer plugin and the functions in the class

Parameters:
args - The main method input arguments


Copyright © 2013. All Rights Reserved.