com.raritantechnologies.concept
Class RelatedDocumentProcessor

java.lang.Object
  extended by com.raritantechnologies.concept.RelatedDocumentProcessor
All Implemented Interfaces:
IConfigurable, IGatewayOutputProcessor, IResultSetProcessor

public class RelatedDocumentProcessor
extends java.lang.Object
implements IGatewayOutputProcessor, IResultSetProcessor

Computes related documents by clustering them on sets of common keywords, producing a list of similar documents for each input document. A nested IDocumentKeywordProcessor handles keyword extraction from each document.

Creates an output IResultSet with one result per document from the input result sets. Each output result contains a nested result set with the set of related documents.
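
The following is a minimal sketch of the general idea of relating documents through shared keywords. It is not this class's clustering algorithm, which this page does not describe; it substitutes a plain shared-keyword count over an inverted index. The class name RelatedDocsSketch, the method names, and the minShared threshold are all hypothetical and use only java.util types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of com.raritantechnologies.concept.
class RelatedDocsSketch {

    // Builds an inverted index: keyword -> keys of the documents containing it.
    static Map<String, Set<String>> invert(Map<String, Set<String>> docKeywords) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : docKeywords.entrySet()) {
            for (String keyword : e.getValue()) {
                index.computeIfAbsent(keyword, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        return index;
    }

    // For each document, collects the other documents that share at least
    // minShared keywords with it (an assumed, simplified relatedness rule).
    static Map<String, Set<String>> related(Map<String, Set<String>> docKeywords, int minShared) {
        Map<String, Set<String>> index = invert(docKeywords);
        Map<String, Set<String>> result = new TreeMap<>();
        for (String doc : docKeywords.keySet()) {
            Map<String, Integer> sharedCounts = new HashMap<>();
            for (String keyword : docKeywords.get(doc)) {
                for (String other : index.get(keyword)) {
                    if (!other.equals(doc)) {
                        sharedCounts.merge(other, 1, Integer::sum);
                    }
                }
            }
            Set<String> relatedDocs = new TreeSet<>();
            for (Map.Entry<String, Integer> e : sharedCounts.entrySet()) {
                if (e.getValue() >= minShared) {
                    relatedDocs.add(e.getKey());
                }
            }
            result.put(doc, relatedDocs);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> docs = new HashMap<>();
        docs.put("d1", new HashSet<>(Arrays.asList("java", "search", "xml")));
        docs.put("d2", new HashSet<>(Arrays.asList("java", "xml")));
        docs.put("d3", new HashSet<>(Arrays.asList("cooking")));
        // One entry per input document, each holding its set of related documents.
        System.out.println(related(docs, 2));
    }
}
```

The result mirrors the shape described above: one entry per input document, each carrying a nested set of related documents, which may be empty.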

XML Configuration Template:
  <OutputProcessor class="com.raritantechnologies.concept.RelatedDocumentProcessor"
                      resultKeyField="[ field id for document key value ]"
                      relatedDocKeyField="[ nested result field name for related docs ]"
                      relatedDocFields="[ list of fields that should be copied to related Document ]" >

    <DocumentProcessor class="[ class of com.raritantechnologies.concept.IDocumentKeywordProcessor ]" >

    </DocumentProcessor>

    <!-- Gateway Output Processor to handle related document results -->
    <OutputProcessor class="[ class of com.raritantechnologies.searchApp.dataCollection.IGatewayOutputProcessor ]" >

    </OutputProcessor>

  </OutputProcessor>
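
As a sketch of how a filled-in template might be turned into the org.w3c.dom.Element that initialize(org.w3c.dom.Element) accepts, the example below parses a configuration string with the standard JAXP parser. Every attribute value and the com.example.* class names are hypothetical placeholders, and the call into RelatedDocumentProcessor is left as a comment because the library is assumed not to be on the classpath here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration only; not part of com.raritantechnologies.concept.
class ConfigElementSketch {

    // A filled-in copy of the template. All values are invented for illustration.
    static final String CONFIG =
        "<OutputProcessor class='com.raritantechnologies.concept.RelatedDocumentProcessor'"
      + "                 resultKeyField='docId'"
      + "                 relatedDocKeyField='relatedDocs'"
      + "                 relatedDocFields='title'>"
      + "  <DocumentProcessor class='com.example.MyKeywordProcessor'>"
      + "  </DocumentProcessor>"
      + "  <OutputProcessor class='com.example.MyOutputProcessor'>"
      + "  </OutputProcessor>"
      + "</OutputProcessor>";

    // Parses the XML text and returns its root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parse(CONFIG);
        System.out.println("resultKeyField     = " + root.getAttribute("resultKeyField"));
        System.out.println("relatedDocKeyField = " + root.getAttribute("relatedDocKeyField"));
        System.out.println("relatedDocFields   = " + root.getAttribute("relatedDocFields"));

        // With the library available, the element would presumably be passed on:
        // RelatedDocumentProcessor processor = new RelatedDocumentProcessor();
        // processor.initialize(root);
    }
}
```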
 

Developed by Raritan Technologies.

Author:
Ted Sullivan

Constructor Summary
RelatedDocumentProcessor()
           
 
Method Summary
 void dataComplete()
          Data feed is complete.
 java.lang.String getConfigurationXML()
           
 java.lang.String getRelatedDocumentSetKey()
           
 java.lang.String getResultKeyField()
           
 OrderedMap getWordDocumentMap()
          Returns a map from each keyword to the Documents that contain it.
 OrderedMap getWordDocumentMap(WordCountComparator sortBy)
          Returns a map from each keyword to the Documents that contain it, ordered by sortBy.
 void initialize(org.w3c.dom.Element outputProcElem)
          Initialize the GatewayOutputProcessor from XML Element.
 void initialize(org.w3c.dom.Element outputProcElem, ISearchFieldMap sfMap)
          Initialize the GatewayOutputProcessor from XML Configuration Element.
 void initialize(java.util.Map initParams)
          Used for dynamic initialization (connection, collection name, file name, etc.)
 java.lang.String processData(IResultSet data)
          Returns the name of the XML file created or appended to.
 void processResultSet(java.lang.String sessionID, IResultSet data)
          Processes the IResultSet for the given session.
 void setOutputProcessor(IGatewayOutputProcessor outputProc)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RelatedDocumentProcessor

public RelatedDocumentProcessor()

Method Detail

processData

public java.lang.String processData(IResultSet data)
Description copied from interface: IGatewayOutputProcessor
Returns the name of the XML file created or appended to.

Specified by:
processData in interface IGatewayOutputProcessor

processResultSet

public void processResultSet(java.lang.String sessionID,
                             IResultSet data)
Description copied from interface: IResultSetProcessor
Processes the IResultSet for the given session.

Specified by:
processResultSet in interface IResultSetProcessor

dataComplete

public void dataComplete()
Data feed is complete.

Specified by:
dataComplete in interface IGatewayOutputProcessor

getWordDocumentMap

public OrderedMap getWordDocumentMap()
Returns a map from each keyword to the Documents that contain it.


getWordDocumentMap

public OrderedMap getWordDocumentMap(WordCountComparator sortBy)
Returns a map from each keyword to the Documents that contain it. sortBy can be WordCountComparator.ALPHABETICAL or WordCountComparator.NUMBER_DOCS.
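
To illustrate what the two orderings could look like, here is a sketch built on java.util only. OrderedMap and WordCountComparator are library types whose behavior is not shown on this page, so the example stands in a LinkedHashMap and a hypothetical SortBy enum for them. Sorting NUMBER_DOCS from most to fewest documents, with alphabetical tie-breaking, is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of com.raritantechnologies.concept.
class WordDocumentMapSketch {

    // Stand-in for the WordCountComparator constants named above.
    enum SortBy { ALPHABETICAL, NUMBER_DOCS }

    // Returns the keyword -> documents entries in the requested iteration order.
    static LinkedHashMap<String, List<String>> ordered(
            Map<String, List<String>> wordDocs, SortBy sortBy) {
        List<Map.Entry<String, List<String>>> entries = new ArrayList<>(wordDocs.entrySet());
        Comparator<Map.Entry<String, List<String>>> byWord = Map.Entry.comparingByKey();
        if (sortBy == SortBy.NUMBER_DOCS) {
            // Assumption: keywords found in the most documents come first.
            Comparator<Map.Entry<String, List<String>>> byCount =
                Comparator.comparingInt(e -> e.getValue().size());
            entries.sort(byCount.reversed().thenComparing(byWord));
        } else {
            entries.sort(byWord);
        }
        LinkedHashMap<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : entries) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> wordDocs = new HashMap<>();
        wordDocs.put("xml", Arrays.asList("d1", "d2"));
        wordDocs.put("java", Arrays.asList("d1", "d2", "d3"));
        wordDocs.put("apple", Arrays.asList("d1"));
        System.out.println(ordered(wordDocs, SortBy.ALPHABETICAL).keySet());
        System.out.println(ordered(wordDocs, SortBy.NUMBER_DOCS).keySet());
    }
}
```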


initialize

public void initialize(java.util.Map initParams)
Description copied from interface: IGatewayOutputProcessor
Used for dynamic initialization (connection, collection name, file name, etc.)

Specified by:
initialize in interface IGatewayOutputProcessor

initialize

public void initialize(org.w3c.dom.Element outputProcElem)
Initialize the GatewayOutputProcessor from XML Element.

Specified by:
initialize in interface IResultSetProcessor

initialize

public void initialize(org.w3c.dom.Element outputProcElem,
                       ISearchFieldMap sfMap)
Description copied from interface: IGatewayOutputProcessor
Initialize the GatewayOutputProcessor from XML Configuration Element.

Specified by:
initialize in interface IGatewayOutputProcessor

getConfigurationXML

public java.lang.String getConfigurationXML()
Specified by:
getConfigurationXML in interface IGatewayOutputProcessor

setOutputProcessor

public void setOutputProcessor(IGatewayOutputProcessor outputProc)

getRelatedDocumentSetKey

public java.lang.String getRelatedDocumentSetKey()

getResultKeyField

public java.lang.String getResultKeyField()