org.gcube.indexmanagement.common.linguistics.jtextcat
Class JTextCatPlugin

java.lang.Object
  extended by org.gcube.indexmanagement.common.linguistics.jtextcat.JTextCatPlugin
All Implemented Interfaces:
LanguageIdPlugin

public class JTextCatPlugin
extends Object
implements LanguageIdPlugin

The class that provides methods for language identification.


Field Summary
 
Fields inherited from interface org.gcube.indexmanagement.common.linguistics.languageidplugin.LanguageIdPlugin
LANG_UNKNOWN
 
Constructor Summary
JTextCatPlugin()
          Empty constructor.
 
Method Summary
 String detectLanguage(String document)
          Detects the language of the document.
 void init(String configFile)
          Initialises the language identification implementation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

JTextCatPlugin

public JTextCatPlugin()
Empty constructor.

Method Detail

init

public void init(String configFile)
          throws IndexException
Initialises the language identification implementation.

Specified by:
init in interface LanguageIdPlugin
Parameters:
configFile - The configuration file needed by the language identifier
Throws:
IndexException - when the language_identifier cannot be created

detectLanguage

public String detectLanguage(String document)
                      throws IndexException
Detects the language of the document.

Specified by:
detectLanguage in interface LanguageIdPlugin
Parameters:
document - The document text whose language is to be detected
Returns:
The ISO string of the language. The ISO string can be converted by the language class to the "ISO enum". The string "nolang" is returned if no language can be identified. The string "not_enough_data" is returned if the document string is too short.
Throws:
IndexException - in case of a failure, in case of an unknown language, or if the input document is too short to classify the language.
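
The following is a minimal sketch of how a caller might initialise a language-id plugin and interpret the result of detectLanguage, treating the two documented strings "nolang" and "not_enough_data" separately from a real language code. It does not use the real gCube classes: the nested IndexException, LanguageIdPlugin and StubPlugin types are stand-ins written for this example. The sketch assumes IndexException is a checked exception. The file name "textcat.conf" and the helper methods describe and detect are hypothetical.

```java
import java.util.Locale;

/** Sketch: calling a language-id plugin and interpreting its result. */
final class LanguageDetectionSketch {

    /** Stand-in for the real IndexException (assumed here to be checked). */
    static class IndexException extends Exception {
        IndexException(String message) {
            super(message);
        }
    }

    /** Stand-in mirroring the two methods documented for LanguageIdPlugin. */
    interface LanguageIdPlugin {
        void init(String configFile) throws IndexException;

        String detectLanguage(String document) throws IndexException;
    }

    /** Hypothetical stub so the sketch runs without the real library. */
    static class StubPlugin implements LanguageIdPlugin {
        private boolean initialised;

        @Override
        public void init(String configFile) throws IndexException {
            if (configFile == null || configFile.isEmpty()) {
                throw new IndexException("no config file given");
            }
            initialised = true;
        }

        @Override
        public String detectLanguage(String document) throws IndexException {
            if (!initialised) {
                throw new IndexException("init was not called");
            }
            if (document.length() < 10) {
                return "not_enough_data";
            }
            // Toy rule for illustration only; not how JTextCat classifies text.
            boolean looksEnglish = document.toLowerCase(Locale.ROOT).contains(" the ");
            return looksEnglish ? "en" : "nolang";
        }
    }

    /** Maps a detectLanguage result to a message, handling both sentinel strings. */
    static String describe(String result) {
        switch (result) {
            case "nolang":
                return "no language identified";
            case "not_enough_data":
                return "document too short";
            default:
                return "language: " + result;
        }
    }

    /** Runs detection and folds an IndexException into the returned message. */
    static String detect(LanguageIdPlugin plugin, String document) {
        try {
            return describe(plugin.detectLanguage(document));
        } catch (IndexException e) {
            return "detection failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IndexException {
        LanguageIdPlugin plugin = new StubPlugin();
        plugin.init("textcat.conf"); // hypothetical config file name
        System.out.println(detect(plugin, "Where is the nearest library?"));
        System.out.println(detect(plugin, "Hi"));
    }
}
```

Because the Returns and Throws descriptions above both mention the unknown-language and too-short cases, the sketch checks the returned strings and also catches IndexException rather than relying on only one of the two signals.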


Copyright © 2012. All Rights Reserved.