com.raritantechnologies.concept
Interface IDocumentKeywordProcessor

All Superinterfaces:
IConfigurable, IGatewayOutputProcessor, IResultSetProcessor
All Known Implementing Classes:
AbstractDocumentKeywordProcessor, KeywordFieldsDocumentKeywordProcessor, TermExtractorDocumentKeywordProcessor, WordCountDocKeywordProcessor

public interface IDocumentKeywordProcessor
extends IResultSetProcessor, IGatewayOutputProcessor

Extracts a set of keywords from documents and creates a set of Keyword -> Document relationships. These relationships can be used to generate a set of related documents based on keyword similarities, as is done with the RelatedDocumentProcessor.
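
The idea of a Keyword -> Document relationship can be illustrated with a minimal sketch that uses only java.util collections. It is not the Raritan implementation and does not use any class from this package: the class KeywordDocumentSketch, its methods, the whitespace/punctuation tokenizer, and the "count of shared keywords" similarity measure are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of com.raritantechnologies.concept.
public class KeywordDocumentSketch {

    // Builds a map of keyword -> list of keys of the documents containing it.
    // Assumption: every lower-cased token counts as a keyword.
    public static Map<String, List<String>> buildWordDocumentMap(Map<String, String> documents) {
        Map<String, List<String>> wordToDocs = new TreeMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (String token : doc.getValue().toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() can yield a leading empty token
                }
                List<String> docs = wordToDocs.computeIfAbsent(token, k -> new ArrayList<>());
                if (!docs.contains(doc.getKey())) {
                    docs.add(doc.getKey()); // record each document once per keyword
                }
            }
        }
        return wordToDocs;
    }

    // For one document, counts how many keywords it shares with each other document.
    // Assumption: "keyword similarity" is approximated by this shared-keyword count.
    public static Map<String, Integer> relatedDocuments(String docKey,
                                                        Map<String, List<String>> wordToDocs) {
        Map<String, Integer> shared = new TreeMap<>();
        for (List<String> docs : wordToDocs.values()) {
            if (!docs.contains(docKey)) {
                continue;
            }
            for (String other : docs) {
                if (!other.equals(docKey)) {
                    shared.merge(other, 1, Integer::sum);
                }
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("doc1", "search engine keyword extraction");
        documents.put("doc2", "keyword extraction from documents");
        documents.put("doc3", "gateway output");

        Map<String, List<String>> wordToDocs = buildWordDocumentMap(documents);
        System.out.println(wordToDocs.get("keyword"));           // prints [doc1, doc2]
        System.out.println(relatedDocuments("doc1", wordToDocs)); // prints {doc2=2}
    }
}
```

An actual implementation of this interface may select keywords and rank related documents quite differently; the sketch only shows the shape of the data that the accessor methods below expose.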


Developed by Raritan Technologies.

Author:
Ted Sullivan

Method Summary
 void dataComplete()
          Data feed is complete.
 void dataComplete(boolean computeRelatedDocs)
           
 java.util.Map getDocuments()
          Returns a Map of document key -> Document.
 java.util.Map getKeywordAssociations(int minAssociationDistance, boolean returnRanked)
           
 java.util.Map getKeywords()
          Returns a map of keyword text -> Keyword instance.
 OrderedMap getWordDocumentMap()
          Returns a map of keyword --> List of documents containing the keyword.
 OrderedMap getWordDocumentMap(WordCountComparator sortBy)
          Returns a map of keyword --> List of documents containing the keyword.
 void processResultSet(java.lang.String sessionID, IResultSet data)
          Processes the IResultSet; how it is processed is left to the implementing class.
 void reset()
           
 
Methods inherited from interface com.raritantechnologies.searchApp.IResultSetProcessor
initialize, initialize
 
Methods inherited from interface com.raritantechnologies.searchApp.dataCollection.IGatewayOutputProcessor
getConfigurationXML, initialize, initialize, processData
 

Method Detail

reset

public void reset()

processResultSet

public void processResultSet(java.lang.String sessionID,
                             IResultSet data)
Description copied from interface: IResultSetProcessor
Processes the IResultSet; how it is processed is left to the implementing class.

Specified by:
processResultSet in interface IResultSetProcessor

dataComplete

public void dataComplete()
Description copied from interface: IResultSetProcessor
Data feed is complete.

Specified by:
dataComplete in interface IResultSetProcessor

dataComplete

public void dataComplete(boolean computeRelatedDocs)

getWordDocumentMap

public OrderedMap getWordDocumentMap()
Returns a map of keyword --> List of documents containing the keyword.


getWordDocumentMap

public OrderedMap getWordDocumentMap(WordCountComparator sortBy)
Returns a map of keyword --> List of documents containing the keyword. The OrderedMap can be ordered alphabetically (WordMapComparator.ALPHABETICAL) or by word count, i.e. word frequency (WordMapComparator.NUMBER_DOCS).
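
The two orderings described above can be sketched with plain java.util types. This is an illustration under stated assumptions, not the library's OrderedMap or comparator classes: WordMapOrderingSketch, its ALPHABETICAL and NUMBER_DOCS constants, and the sorted helper are hypothetical stand-ins, and the choice of descending order with an alphabetical tie-break for NUMBER_DOCS is an assumption the documentation does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of com.raritantechnologies.concept.
public class WordMapOrderingSketch {

    // Orders entries by keyword text.
    public static final Comparator<Map.Entry<String, List<String>>> ALPHABETICAL =
            Map.Entry.<String, List<String>>comparingByKey();

    // Orders entries by number of documents, largest first (assumed direction),
    // breaking ties alphabetically so the result is deterministic.
    public static final Comparator<Map.Entry<String, List<String>>> NUMBER_DOCS =
            Comparator.comparingInt((Map.Entry<String, List<String>> e) -> e.getValue().size())
                    .reversed()
                    .thenComparing(Map.Entry.<String, List<String>>comparingByKey());

    // Returns a copy of the map whose iteration order follows the given comparator.
    public static Map<String, List<String>> sorted(Map<String, List<String>> wordToDocs,
            Comparator<Map.Entry<String, List<String>>> sortBy) {
        List<Map.Entry<String, List<String>>> entries = new ArrayList<>(wordToDocs.entrySet());
        entries.sort(sortBy);
        Map<String, List<String>> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, List<String>> wordToDocs = new LinkedHashMap<>();
        wordToDocs.put("search", Arrays.asList("doc1"));
        wordToDocs.put("keyword", Arrays.asList("doc1", "doc2", "doc3"));
        wordToDocs.put("gateway", Arrays.asList("doc2", "doc3"));

        System.out.println(sorted(wordToDocs, ALPHABETICAL).keySet()); // prints [gateway, keyword, search]
        System.out.println(sorted(wordToDocs, NUMBER_DOCS).keySet());  // prints [keyword, gateway, search]
    }
}
```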


getDocuments

public java.util.Map getDocuments()
Returns a Map of document key -> Document.


getKeywords

public java.util.Map getKeywords()
Returns a map of keyword text -> Keyword instance.


getKeywordAssociations

public java.util.Map getKeywordAssociations(int minAssociationDistance,
                                            boolean returnRanked)