gr.uoa.di.madgik.searchlibrary.operatorlibrary.duplicateeliminatoroperator
Class DistinctOp

java.lang.Object
  extended by java.lang.Thread
      extended by gr.uoa.di.madgik.searchlibrary.operatorlibrary.duplicateeliminatoroperator.DistinctOp
All Implemented Interfaces:
Runnable

public class DistinctOp
extends Thread

This thread performs the actual duplicate elimination.

Author:
UoA

Nested Class Summary
 
Nested classes/interfaces inherited from class java.lang.Thread
Thread.State, Thread.UncaughtExceptionHandler
 
Field Summary
static int BufferCapacityDef
          The default buffer capacity for the IRecordWriter and, where applicable, for the IRecordReader.
static int EliminationRatioComputationStepDef
          A default value for the step of the elimination ratio recomputation.
static boolean KeepMaximumRankDef
          The default rank processing policy.
static int SafeNumberOfResultsDef
          A default value for the number of results which is considered safe for a reliable duplicate elimination ratio.
static long TimeoutDef
          The default timeout used by the IRecordWriter and all IRecordReaders.
static TimeUnit TimeUnitDef
          The default timeout unit used by the IRecordWriter and all IRecordReaders.
 
Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
 
Method Summary
static URI dispatchNewWorker(URI loc, String objectIdFieldName, StatsContainer stats)
          Static constructor
static URI dispatchNewWorker(URI loc, String objectIdFieldName, String objectRankFieldName, boolean keepMaximumRank, long timeout, TimeUnit timeUnit, int bufferCapacity, StatsContainer stats)
          Static constructor
static URI dispatchNewWorker(URI loc, String objectIdFieldName, String objectRankFieldName, boolean keepMaximumRank, long timeout, TimeUnit timeUnit, StatsContainer stats)
          Static constructor
static URI dispatchNewWorker(URI loc, String objectIdFieldName, String objectRankFieldName, boolean keepMaximumRank, StatsContainer stats)
          Static constructor
 void run()
          Main working cycle:
          1. Get a hashtable of distinct DocIDs (plus their ranks).
          2. Re-iterate the RS, keeping only the records whose DocID and rank match the stored hashtable.
 
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, stop, suspend, toString, yield
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

TimeoutDef

public static final long TimeoutDef
The default timeout used by the IRecordWriter and all IRecordReaders. Currently set to 60.

See Also:
Constant Field Values

TimeUnitDef

public static final TimeUnit TimeUnitDef
The default timeout unit used by the IRecordWriter and all IRecordReaders. The current default unit is seconds.


BufferCapacityDef

public static final int BufferCapacityDef
The default buffer capacity for the IRecordWriter and, where applicable, for the IRecordReader.

See Also:
Constant Field Values

KeepMaximumRankDef

public static final boolean KeepMaximumRankDef
The default rank processing policy.

See Also:
Constant Field Values

SafeNumberOfResultsDef

public static final int SafeNumberOfResultsDef
A default value for the number of results which is considered safe for a reliable duplicate elimination ratio.

See Also:
Constant Field Values

EliminationRatioComputationStepDef

public static final int EliminationRatioComputationStepDef
A default value for the step of the elimination ratio recomputation.

See Also:
Constant Field Values
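
This page does not define the elimination ratio or say how SafeNumberOfResultsDef and EliminationRatioComputationStepDef are applied. The following is only a rough sketch under stated assumptions: the ratio is taken to be the fraction of records dropped as duplicates, it is recomputed once every "step" records, and it is treated as reliable once a "safe" number of results has been seen. The class EliminationRatioSketch and all of its members are hypothetical and are not part of the library.

```java
// Hypothetical illustration; not the library's implementation.
final class EliminationRatioSketch {
    private final int safeNumberOfResults; // assumed role of SafeNumberOfResultsDef
    private final int step;                // assumed role of EliminationRatioComputationStepDef
    private long seen;        // records processed so far
    private long eliminated;  // records dropped as duplicates so far
    private double ratio;     // last computed elimination ratio

    EliminationRatioSketch(int safeNumberOfResults, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        this.safeNumberOfResults = safeNumberOfResults;
        this.step = step;
    }

    // Registers one processed record and refreshes the ratio every `step` records.
    void record(boolean wasDuplicate) {
        seen++;
        if (wasDuplicate) {
            eliminated++;
        }
        if (seen % step == 0) {
            ratio = (double) eliminated / seen;
        }
    }

    // Returns the most recently computed ratio (0.0 before the first recomputation).
    double ratio() {
        return ratio;
    }

    // The ratio is treated as reliable only after enough records have been seen.
    boolean isReliable() {
        return seen >= safeNumberOfResults;
    }
}
```

Recomputing only every few records keeps the bookkeeping cheap, at the cost of a ratio that can lag slightly behind the true value between recomputations.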
Method Detail

dispatchNewWorker

public static URI dispatchNewWorker(URI loc,
                                    String objectIdFieldName,
                                    String objectRankFieldName,
                                    boolean keepMaximumRank,
                                    long timeout,
                                    TimeUnit timeUnit,
                                    int bufferCapacity,
                                    StatsContainer stats)
                             throws Exception
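Static constructor

Parameters:
loc - Locator of the incoming RS
objectIdFieldName - Name of the field holding the object ID
objectRankFieldName - Name of the field holding the object rank
keepMaximumRank - The rank processing policy (see KeepMaximumRankDef)
timeout - The timeout used by the IRecordWriter and all IRecordReaders
timeUnit - The unit of the timeout
bufferCapacity - The buffer capacity for the IRecordWriter and, where applicable, for the IRecordReader
stats - Statistics container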
Returns:
A gRS2 locator
Throws:
Exception - in case of error

dispatchNewWorker

public static URI dispatchNewWorker(URI loc,
                                    String objectIdFieldName,
                                    String objectRankFieldName,
                                    boolean keepMaximumRank,
                                    long timeout,
                                    TimeUnit timeUnit,
                                    StatsContainer stats)
                             throws Exception
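Static constructor

Parameters:
loc - Locator of the incoming RS
objectIdFieldName - Name of the field holding the object ID
objectRankFieldName - Name of the field holding the object rank
keepMaximumRank - The rank processing policy (see KeepMaximumRankDef)
timeout - The timeout used by the IRecordWriter and all IRecordReaders
timeUnit - The unit of the timeout
stats - Statistics container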
Returns:
A gRS2 locator
Throws:
Exception - in case of error

dispatchNewWorker

public static URI dispatchNewWorker(URI loc,
                                    String objectIdFieldName,
                                    StatsContainer stats)
                             throws Exception
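Static constructor

Parameters:
loc - Locator of the incoming RS
objectIdFieldName - Name of the field holding the object ID
stats - Statistics container
Returns:
A gRS2 locator
Throws:
Exception - in case of error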

dispatchNewWorker

public static URI dispatchNewWorker(URI loc,
                                    String objectIdFieldName,
                                    String objectRankFieldName,
                                    boolean keepMaximumRank,
                                    StatsContainer stats)
                             throws Exception
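Static constructor

Parameters:
loc - Locator of the incoming RS
objectIdFieldName - Name of the field holding the object ID
objectRankFieldName - Name of the field holding the object rank
keepMaximumRank - The rank processing policy (see KeepMaximumRankDef)
stats - Statistics container
Returns:
A gRS2 locator
Throws:
Exception - in case of error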
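
The four overloads differ only in which optional parameters they accept, which suggests the shorter ones fall back to the *Def constants. This page does not confirm that, so the sketch below is an assumption about the pattern, not the library's code. DispatchDefaultsSketch and its dispatch methods are hypothetical, they return a descriptive string instead of a locator, and the StatsContainer argument is omitted. Only the timeout of 60 seconds comes from the field documentation; the other default values, and the null rank field passed by the shortest overload, are placeholders.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of overloads delegating to the fullest signature.
final class DispatchDefaultsSketch {
    // Documented on this page: timeout 60, unit seconds.
    static final long TIMEOUT_DEF = 60;
    static final TimeUnit TIME_UNIT_DEF = TimeUnit.SECONDS;
    // Placeholders: the actual values are not stated on this page.
    static final boolean KEEP_MAXIMUM_RANK_DEF = true;
    static final int BUFFER_CAPACITY_DEF = 100;

    // Fullest form: every setting is supplied by the caller.
    static String dispatch(String loc, String idField, String rankField,
                           boolean keepMax, long timeout, TimeUnit unit, int capacity) {
        return String.format("%s id=%s rank=%s keepMax=%b timeout=%d %s capacity=%d",
                loc, idField, rankField, keepMax, timeout, unit, capacity);
    }

    // Omits the buffer capacity.
    static String dispatch(String loc, String idField, String rankField,
                           boolean keepMax, long timeout, TimeUnit unit) {
        return dispatch(loc, idField, rankField, keepMax, timeout, unit, BUFFER_CAPACITY_DEF);
    }

    // Omits the timeout settings and the buffer capacity.
    static String dispatch(String loc, String idField, String rankField, boolean keepMax) {
        return dispatch(loc, idField, rankField, keepMax, TIMEOUT_DEF, TIME_UNIT_DEF);
    }

    // Shortest form: no rank field (assumed null) and the default rank policy.
    static String dispatch(String loc, String idField) {
        return dispatch(loc, idField, null, KEEP_MAXIMUM_RANK_DEF);
    }
}
```

Under these assumptions, a call such as DispatchDefaultsSketch.dispatch("rs://in", "ObjectID") ends up in the fullest overload with a timeout of 60 SECONDS.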

run

public void run()
Main working cycle:
  1. Get a hashtable of distinct DocIDs (plus their ranks).
  2. Re-iterate the RS, keeping only the records whose DocID and rank match the stored hashtable.

Specified by:
run in interface Runnable
Overrides:
run in class Thread
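
The two-pass cycle described for run() can be sketched on an in-memory list. This is a minimal illustration and not the library's code: RankedRecord and TwoPassDistinctSketch are hypothetical names, the real operator reads and writes result sets through IRecordReader and IRecordWriter, and the sketch assumes that keepMaximumRank = false means the lowest rank is kept and that only the first of several identical (DocID, rank) records is emitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record of the incoming RS.
final class RankedRecord {
    final String docId;
    final double rank;

    RankedRecord(String docId, double rank) {
        this.docId = docId;
        this.rank = rank;
    }
}

final class TwoPassDistinctSketch {
    static List<RankedRecord> distinct(List<RankedRecord> input, boolean keepMaximumRank) {
        // Pass 1: for each DocID, remember the rank that should survive.
        Map<String, Double> chosen = new HashMap<>();
        for (RankedRecord r : input) {
            Double seen = chosen.get(r.docId);
            boolean better = seen == null
                    || (keepMaximumRank ? r.rank > seen : r.rank < seen);
            if (better) {
                chosen.put(r.docId, r.rank);
            }
        }

        // Pass 2: re-iterate and keep the records matching a stored (DocID, rank) pair.
        List<RankedRecord> output = new ArrayList<>();
        for (RankedRecord r : input) {
            Double kept = chosen.get(r.docId);
            if (kept != null && kept.doubleValue() == r.rank) {
                output.add(r);
                // Drop the entry so an identical later record is not emitted again.
                chosen.remove(r.docId);
            }
        }
        return output;
    }
}
```

The output preserves the input order of the surviving records, because the second pass walks the input rather than the hashtable.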


Copyright © 2013. All Rights Reserved.