java.lang.Object
  com.raritantechnologies.concept.AbstractDocumentKeywordProcessor
Base class for document-keyword processors. It implements all basic functionality except keyword detection, which subclasses supply by implementing the getWords() and isKeyword() methods. The getWords() implementation should detect keywords in the text and add each one to the processor by calling the superclass method addWord(word, docKey, result).
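The subclassing contract described above can be sketched as follows. This is only an illustration of the pattern, not the library's implementation: SketchProcessor, WordCount, RepeatedWordProcessor, process() and keywordDocuments() are hypothetical stand-ins defined here so the example runs on its own, and the IResult argument of the real getWords() and addWord() methods is omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeywordProcessorSketch {

    /** Hypothetical stand-in for the library's WordCount; holds only what this sketch needs. */
    static final class WordCount {
        final String word;
        int count;
        WordCount(String word) { this.word = word; }
    }

    /** Hypothetical, simplified stand-in for AbstractDocumentKeywordProcessor. */
    static abstract class SketchProcessor {
        private final Map<String, WordCount> counts = new LinkedHashMap<>();
        private final Map<String, List<String>> wordDocs = new LinkedHashMap<>();

        /** Subclasses detect candidate words in the text and report each one via addWord. */
        protected abstract void getWords(String text, String docKey);

        /** Subclasses decide which counted words qualify as keywords. */
        public abstract boolean isKeyword(WordCount wordCount);

        /** Records one occurrence of a word in the given document. */
        protected void addWord(String word, String docKey) {
            counts.computeIfAbsent(word, WordCount::new).count++;
            List<String> docs = wordDocs.computeIfAbsent(word, k -> new ArrayList<>());
            if (!docs.contains(docKey)) {
                docs.add(docKey);
            }
        }

        /** Feeds one document's text through the subclass hook. */
        public void process(String text, String docKey) {
            getWords(text, docKey);
        }

        /** Returns keyword -> documents containing it, keeping only accepted keywords. */
        public Map<String, List<String>> keywordDocuments() {
            Map<String, List<String>> out = new LinkedHashMap<>();
            for (WordCount wc : counts.values()) {
                if (isKeyword(wc)) {
                    out.put(wc.word, wordDocs.get(wc.word));
                }
            }
            return out;
        }
    }

    /** Example subclass: splits on non-word characters; a keyword is any word seen at least twice. */
    static final class RepeatedWordProcessor extends SketchProcessor {
        @Override
        protected void getWords(String text, String docKey) {
            for (String token : text.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    addWord(token, docKey);
                }
            }
        }

        @Override
        public boolean isKeyword(WordCount wordCount) {
            return wordCount.count >= 2;
        }
    }

    static Map<String, List<String>> demo() {
        RepeatedWordProcessor p = new RepeatedWordProcessor();
        p.process("Search engines index documents", "doc1");
        p.process("Documents contain keywords", "doc2");
        return p.keywordDocuments();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {documents=[doc1, doc2]}
    }
}
```

The base class owns the bookkeeping while the subclass only answers two questions: which words occur in a text, and which of the counted words count as keywords.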
XML Configuration Template:
<DocumentProcessor class="[ some subclass of com.raritantechnologies.concept.AbstractDocumentKeywordProcessor ]"
resultKeyField="[ field where keywords will be stored ]"
textFields="[ comma separated field list ]"
documentFields="[ list of document url fields ]" >
</DocumentProcessor>
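As a rough illustration of how the attributes in this template could be read from an org.w3c.dom.Element (the type accepted by initialize), the sketch below parses a filled-in template with the JDK's DOM parser. The class name com.example.MyKeywordProcessor, the field names, and the helpers parse() and splitList() are assumptions made for the example; how the library itself reads the element is not documented here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class ConfigTemplateSketch {

    // Hypothetical filled-in template; the class name and field names are made up.
    static final String CONFIG =
        "<DocumentProcessor class=\"com.example.MyKeywordProcessor\"\n"
      + "    resultKeyField=\"keywords\"\n"
      + "    textFields=\"title, abstract, body\"\n"
      + "    documentFields=\"url\" >\n"
      + "</DocumentProcessor>";

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    /** Splits a comma separated attribute value into trimmed, non-empty entries. */
    static List<String> splitList(String value) {
        List<String> out = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Element elem = parse(CONFIG);
        System.out.println(elem.getAttribute("resultKeyField"));        // prints keywords
        System.out.println(splitList(elem.getAttribute("textFields"))); // prints [title, abstract, body]
    }
}
```

Only textFields is split here, because the template states that it is comma separated; the separator used by documentFields is not specified, so that attribute is left untouched.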
| Field Summary |
protected java.util.HashMap | documents
protected java.lang.String | resKeyField
| Constructor Summary |
AbstractDocumentKeywordProcessor()
| Method Summary |
protected void | addWord(java.lang.String word, java.lang.String docKey, IResult result)
protected void | addWord(java.lang.String word, java.lang.String docKey, IResult result, int[] positions)
void | dataComplete() | Data feed is complete.
void | dataComplete(boolean computeRelatedDocs)
java.util.Map | getDocuments() | Returns a Map of document key -> Document.
java.util.List | getDocuments(WordCount wc)
java.util.Map | getKeywordAssociations(int minAssociationDistance, boolean returnRanked) | Returns a sorted map of keyword --> java.util.HashMap of associated keywords. The associated-keywords map contains Keyword.AssociatedKeywordData objects, each defining the strength of the association and a set of sentences containing both the keyword and the associated keyword.
java.util.Map | getKeywords() | Returns a Map of keyword text -> Keyword object.
java.util.List | getWordCounts(WordCountComparator sortBy)
OrderedMap | getWordDocumentMap() | Returns an OrderedMap of keyword --> List of documents containing the keyword.
OrderedMap | getWordDocumentMap(WordCountComparator sortBy) | Returns a map of keyword --> List of documents containing the keyword.
protected abstract void | getWords(IResult result, java.lang.String text, java.lang.String docKey) | Subclasses must implement this method: extract keywords from the text for the document given by docKey.
void | initialize(org.w3c.dom.Element elem) | Initialize the processor from an XML Element.
void | initialize(org.w3c.dom.Element outputProcElem, ISearchFieldMap sfMap) | Initialize the GatewayOutputProcessor from an XML configuration Element.
void | initialize(java.util.Map initParams) | Dynamic initialization.
abstract boolean | isKeyword(WordCount wordCount)
java.lang.String | processData(IResultSet data) | Returns the name of the XML file created or appended.
protected void | processResult(IResult result)
void | processResultSet(java.lang.String sessionID, IResultSet data) | Processes the IResultSet for the given session.
void | reset()
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Methods inherited from interface com.raritantechnologies.searchApp.dataCollection.IGatewayOutputProcessor |
getConfigurationXML |
| Field Detail |
protected java.util.HashMap documents
protected java.lang.String resKeyField
| Constructor Detail |
public AbstractDocumentKeywordProcessor()
| Method Detail |
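public void reset()
Specified by: reset in interface IDocumentKeywordProcessor
public java.lang.String processData(IResultSet data)
Description copied from interface: IGatewayOutputProcessor
Returns the name of the XML file created or appended.
Specified by: processData in interface IGatewayOutputProcessor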
public void processResultSet(java.lang.String sessionID,
IResultSet data)
Description copied from interface: IResultSetProcessor
Processes the IResultSet for the given session.
Specified by: processResultSet in interface IDocumentKeywordProcessor
protected void processResult(IResult result)
protected abstract void getWords(IResult result,
java.lang.String text,
java.lang.String docKey)
protected void addWord(java.lang.String word,
java.lang.String docKey,
IResult result)
protected void addWord(java.lang.String word,
java.lang.String docKey,
IResult result,
int[] positions)
public OrderedMap getWordDocumentMap()
Specified by: getWordDocumentMap in interface IDocumentKeywordProcessor
public OrderedMap getWordDocumentMap(WordCountComparator sortBy)
Specified by: getWordDocumentMap in interface IDocumentKeywordProcessor
public java.util.List getWordCounts(WordCountComparator sortBy)
public java.util.List getDocuments(WordCount wc)
public void dataComplete()
Description copied from interface: IResultSetProcessor
Data feed is complete.
Specified by: dataComplete in interface IDocumentKeywordProcessor
public void dataComplete(boolean computeRelatedDocs)
Specified by: dataComplete in interface IDocumentKeywordProcessor
public abstract boolean isKeyword(WordCount wordCount)
public java.util.Map getDocuments()
Returns a Map of document key -> Document.
Specified by: getDocuments in interface IDocumentKeywordProcessor
public java.util.Map getKeywords()
Returns a Map of keyword text -> Keyword object.
Specified by: getKeywords in interface IDocumentKeywordProcessor
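public java.util.Map getKeywordAssociations(int minAssociationDistance,
boolean returnRanked)
Returns a sorted map of keyword --> java.util.HashMap of associated keywords. The associated-keywords map contains Keyword.AssociatedKeywordData objects, each defining the strength of the association and a set of sentences containing both the keyword and the associated keyword.
Specified by: getKeywordAssociations in interface IDocumentKeywordProcessor
public void initialize(java.util.Map initParams)
Description copied from interface: IResultSetProcessor
Dynamic initialization.
Specified by: initialize in interface IResultSetProcessor
public void initialize(org.w3c.dom.Element outputProcElem,
ISearchFieldMap sfMap)
Description copied from interface: IGatewayOutputProcessor
Initialize the GatewayOutputProcessor from an XML configuration Element.
Specified by: initialize in interface IGatewayOutputProcessor
public void initialize(org.w3c.dom.Element elem)
Description copied from interface: IResultSetProcessor
Initialize the processor from an XML Element.
Specified by: initialize in interface IResultSetProcessor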
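The nested map that getKeywordAssociations is documented to return (a sorted map of keyword --> HashMap of associated keywords, each with a strength and a set of shared sentences) might be pictured with the sketch below. It assumes, purely for illustration, that two keywords are associated when they occur in the same sentence and that strength is the number of such sentences. Association is a hypothetical stand-in for Keyword.AssociatedKeywordData, and the minAssociationDistance and returnRanked parameters are not modelled because their semantics are not described on this page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class AssociationSketch {

    /** Hypothetical stand-in for Keyword.AssociatedKeywordData. */
    static final class Association {
        int strength; // assumption: number of sentences shared by the two keywords
        final Set<String> sentences = new LinkedHashSet<>();
    }

    /** Builds keyword -> (associated keyword -> Association) from sentence co-occurrence. */
    static Map<String, Map<String, Association>> associate(List<String> sentences,
                                                           Set<String> keywords) {
        Map<String, Map<String, Association>> out = new TreeMap<>(); // sorted by keyword
        for (String sentence : sentences) {
            // Collect the known keywords that appear in this sentence.
            Set<String> present = new LinkedHashSet<>();
            for (String token : sentence.toLowerCase().split("\\W+")) {
                if (keywords.contains(token)) {
                    present.add(token);
                }
            }
            // Link every ordered pair of distinct keywords found together.
            for (String a : present) {
                for (String b : present) {
                    if (a.equals(b)) {
                        continue;
                    }
                    Association assoc = out
                        .computeIfAbsent(a, k -> new HashMap<>())
                        .computeIfAbsent(b, k -> new Association());
                    if (assoc.sentences.add(sentence)) {
                        assoc.strength++;
                    }
                }
            }
        }
        return out;
    }

    static Map<String, Map<String, Association>> demo() {
        List<String> sentences = Arrays.asList(
            "Keyword search ranks documents.",
            "Documents match a keyword.",
            "Search is fast.");
        Set<String> keywords = new HashSet<>(Arrays.asList("keyword", "documents", "search"));
        return associate(sentences, keywords);
    }

    public static void main(String[] args) {
        Association a = demo().get("keyword").get("documents");
        System.out.println(a.strength + " shared sentences: " + a.sentences);
    }
}
```

In the demo data, "keyword" and "documents" share two sentences, so that association carries a strength of 2 and both sentences in its set.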