com.raritantechnologies.concept
Class TermExtractorDocumentKeywordProcessor
java.lang.Object
com.raritantechnologies.concept.AbstractDocumentKeywordProcessor
com.raritantechnologies.concept.TermExtractorDocumentKeywordProcessor
- All Implemented Interfaces:
- IConfigurable, IDocumentKeywordProcessor, IGatewayOutputProcessor, IResultSetProcessor
- public class TermExtractorDocumentKeywordProcessor
- extends AbstractDocumentKeywordProcessor
- implements IDocumentKeywordProcessor
Uses an ITermExtractor to extract keywords from documents. Works with RelatedDocumentProcessor.
Term Maps can be obtained from local term extractors or from the System cache.
XML Configuration Template:
<DocumentProcessor class="com.raritantechnologies.concept.TermExtractorDocumentKeywordProcessor">
  <!-- One or more TermExtractor elements: -->
  <TermExtractor class="[ class of com.raritantechnologies.utils.tagging.ITermExtractor ]"
                 resultKeyField="[ field where keywords will be stored ]"
                 textFields="[ comma separated list of fields to process for keywords ]"
                 documentFields="[ list of document url fields ]">
  </TermExtractor>
  <!-- etc . . . -->
</DocumentProcessor>
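The template above can be illustrated with a minimal sketch that parses such a configuration using only the standard javax.xml and org.w3c.dom APIs and reads the TermExtractor attributes. It does not call the Raritan classes and makes no claim about how initialize(org.w3c.dom.Element) actually consumes the element. The class name TermExtractorConfigSketch, the extractor class com.example.MyTermExtractor, and the field names in the sample are hypothetical values chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the Raritan API.
public class TermExtractorConfigSketch {

    /** Parses an XML string and returns its root element. */
    public static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    /** Splits a comma separated attribute value into trimmed, non-empty names. */
    public static List<String> splitFields(String value) {
        List<String> fields = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                fields.add(trimmed);
            }
        }
        return fields;
    }

    /** Returns one summary line per TermExtractor element under the processor. */
    public static List<String> describe(Element processor) {
        List<String> lines = new ArrayList<>();
        NodeList extractors = processor.getElementsByTagName("TermExtractor");
        for (int i = 0; i < extractors.getLength(); i++) {
            Element e = (Element) extractors.item(i);
            lines.add(e.getAttribute("class") + " -> " + e.getAttribute("resultKeyField")
                    + " from " + splitFields(e.getAttribute("textFields")));
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Sample values are assumptions made for this sketch.
        String xml =
              "<DocumentProcessor class=\"com.raritantechnologies.concept.TermExtractorDocumentKeywordProcessor\">"
            + "<TermExtractor class=\"com.example.MyTermExtractor\""
            + " resultKeyField=\"keywords\" textFields=\"title, body\" documentFields=\"url\">"
            + "</TermExtractor>"
            + "</DocumentProcessor>";
        for (String line : describe(parse(xml))) {
            System.out.println(line);
        }
    }
}
```

In a real deployment the parsed root element would presumably be what is handed to initialize(org.w3c.dom.Element); this page does not describe that hand-off.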
Developed by Raritan Technologies.
- Author:
- Ted Sullivan
Method Summary
java.lang.String getConfigurationXML()
protected void getWords(IResult result, java.lang.String text, java.lang.String resultKey)
    Subclasses must implement this method: extract keywords from the text for the document given by resultKey.
void initialize(org.w3c.dom.Element elem)
    Initialize the processor from an XML Element.
boolean isKeyword(WordCount wordCount)
Methods inherited from class com.raritantechnologies.concept.AbstractDocumentKeywordProcessor
addWord, addWord, dataComplete, dataComplete, getDocuments, getDocuments, getKeywordAssociations, getKeywords, getWordCounts, getWordDocumentMap, getWordDocumentMap, initialize, initialize, processData, processResult, processResultSet, reset
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TermExtractorDocumentKeywordProcessor
public TermExtractorDocumentKeywordProcessor()
getWords
protected void getWords(IResult result,
java.lang.String text,
java.lang.String resultKey)
- Description copied from class:
AbstractDocumentKeywordProcessor
- Subclasses must implement this method: extract keywords from the text for the document
given by resultKey. The implementation should call the addWord() method for each keyword
or word it finds.
- Specified by:
getWords in class AbstractDocumentKeywordProcessor
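The getWords contract described above (extract terms from the text, then call addWord for each one) can be sketched with a self-contained stand-in. Everything in this block is an assumption made for illustration: KeywordProcessorStandIn is not the real AbstractDocumentKeywordProcessor, the addWord(String, String) signature is guessed because this page does not list the parameters of the two addWord overloads, and the IResult argument is omitted. The tokenizer is a plain non-word-character split, not an ITermExtractor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the getWords/addWord pattern; not the Raritan API.
public class GetWordsContractSketch {

    /** Simplified stand-in for the abstract base class. */
    public abstract static class KeywordProcessorStandIn {
        // resultKey -> (word -> number of times addWord was called for it)
        private final Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();

        /** Assumed signature: records one occurrence of word for the given document key. */
        protected void addWord(String word, String resultKey) {
            counts.computeIfAbsent(resultKey, k -> new LinkedHashMap<>())
                  .merge(word, 1, Integer::sum);
        }

        /** Implementations extract words from text and call addWord for each. */
        protected abstract void getWords(String text, String resultKey);

        public Map<String, Integer> wordCounts(String resultKey) {
            return counts.getOrDefault(resultKey, new LinkedHashMap<>());
        }
    }

    /** Example implementation: lowercases the text and splits on non-word characters. */
    public static class SimpleSplitProcessor extends KeywordProcessorStandIn {
        @Override
        protected void getWords(String text, String resultKey) {
            for (String token : text.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    addWord(token, resultKey);
                }
            }
        }
    }

    /** Runs the example processor over one text and returns the per-word counts. */
    public static Map<String, Integer> count(String text, String resultKey) {
        SimpleSplitProcessor processor = new SimpleSplitProcessor();
        processor.getWords(text, resultKey);
        return processor.wordCounts(resultKey);
    }

    public static void main(String[] args) {
        System.out.println(count("Term extraction extracts a term", "doc-1"));
    }
}
```

A real subclass would presumably delegate the tokenizing step to its configured ITermExtractor and leave the bookkeeping to the inherited addWord methods.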
isKeyword
public boolean isKeyword(WordCount wordCount)
- Specified by:
isKeyword in class AbstractDocumentKeywordProcessor
initialize
public void initialize(org.w3c.dom.Element elem)
- Description copied from interface:
IResultSetProcessor
- Initialize the processor from an XML Element.
- Specified by:
initialize in interface IResultSetProcessor
- Overrides:
initialize in class AbstractDocumentKeywordProcessor
getConfigurationXML
public java.lang.String getConfigurationXML()
- Specified by:
getConfigurationXML in interface IGatewayOutputProcessor