com.raritantechnologies.concept
Class WordCountDocKeywordProcessor
java.lang.Object
com.raritantechnologies.concept.AbstractDocumentKeywordProcessor
com.raritantechnologies.concept.WordCountDocKeywordProcessor
- All Implemented Interfaces:
- IConfigurable, IDocumentKeywordProcessor, IGatewayOutputProcessor, IResultSetProcessor
- public class WordCountDocKeywordProcessor
- extends AbstractDocumentKeywordProcessor
- implements IDocumentKeywordProcessor
This processor calculates relative word frequencies. A keyword is defined as a word that occurs
in a percentage of documents falling within a configured range (see minWordFrequency and
maxWordFrequency in the configuration template).
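As a rough illustration of the range test described above, the following standalone sketch computes the percentage of documents that contain each word and keeps the words whose percentage falls inside a minimum/maximum range. The class FrequencyRangeSketch, its method names, the tokenization rule, and the inclusive bounds are all assumptions made for this example; none of them belong to the com.raritantechnologies.concept API, and the actual processor may count frequencies differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the Raritan API.
class FrequencyRangeSketch {

    // Percentage (0-100) of documents that contain each word at least once.
    static Map<String, Double> documentPercentages(List<String> documents) {
        Map<String, Integer> docCounts = new TreeMap<>();
        for (String doc : documents) {
            Set<String> seen = new HashSet<>();
            for (String token : doc.toLowerCase().split("\\W+")) {
                // Count each word once per document.
                if (!token.isEmpty() && seen.add(token)) {
                    docCounts.merge(token, 1, Integer::sum);
                }
            }
        }
        Map<String, Double> percentages = new TreeMap<>();
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            percentages.put(e.getKey(), 100.0 * e.getValue() / documents.size());
        }
        return percentages;
    }

    // Words whose document percentage lies in [minPercent, maxPercent].
    // Inclusive bounds are an assumption of this sketch.
    static Set<String> keywords(List<String> documents, double minPercent, double maxPercent) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Double> e : documentPercentages(documents).entrySet()) {
            if (e.getValue() >= minPercent && e.getValue() <= maxPercent) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "the cat sat", "the dog sat", "the bird flew", "the cat flew");
        // "the" appears in 100% of documents and "dog"/"bird" in 25%,
        // so only the 50% words survive a 40-60 range.
        System.out.println(keywords(docs, 40.0, 60.0)); // prints [cat, flew, sat]
    }
}
```

The upper bound is what filters out very common words such as "the", while the lower bound drops words that appear in too few documents to be representative.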
XML Configuration Template:
<DocumentProcessor class="com.raritantechnologies.concept.WordCountDocKeywordProcessor"
resultKeyField="[ field where keywords will be stored ]"
textFields="[ comma separated field list ]"
documentFields="[ list of document url fields ]"
minWordFrequency="[ percentage value for minimum word frequency considered as keyword ]"
maxWordFrequency="[ percentage value for maximum word frequency considered as keyword ]" >
</DocumentProcessor>
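Because initialize accepts an org.w3c.dom.Element, a filled-in template can be parsed with the standard JAXP DOM API and its attributes read back. The sketch below shows one way to do that. The attribute values ("keywords", "title,body", "url", 5 and 40), the class ConfigAttributeSketch, its helper methods, and the fallback defaults are hypothetical; how WordCountDocKeywordProcessor itself interprets these attributes is not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical illustration only; not part of the Raritan API.
class ConfigAttributeSketch {

    // Parse an XML string and return its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    // Read a numeric attribute; Element.getAttribute returns "" when it is absent.
    static double percent(Element elem, String name, double fallback) {
        String value = elem.getAttribute(name);
        return value.isEmpty() ? fallback : Double.parseDouble(value.trim());
    }

    public static void main(String[] args) {
        // Sample values are placeholders chosen for this example.
        String xml =
            "<DocumentProcessor class=\"com.raritantechnologies.concept.WordCountDocKeywordProcessor\""
            + " resultKeyField=\"keywords\""
            + " textFields=\"title,body\""
            + " documentFields=\"url\""
            + " minWordFrequency=\"5\""
            + " maxWordFrequency=\"40\">"
            + "</DocumentProcessor>";

        Element elem = parse(xml);
        // The fallbacks 0 and 100 are assumed defaults, not documented ones.
        double min = percent(elem, "minWordFrequency", 0.0);
        double max = percent(elem, "maxWordFrequency", 100.0);
        System.out.println(elem.getAttribute("textFields") + " " + min + " " + max);
    }
}
```

With the real library on the classpath, the resulting Element would presumably be the argument passed to initialize(org.w3c.dom.Element).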
Developed by Raritan Technologies.
- Author:
- Ted Sullivan
Methods inherited from class com.raritantechnologies.concept.AbstractDocumentKeywordProcessor:
addWord, addWord, dataComplete, dataComplete, getDocuments, getDocuments, getKeywordAssociations, getKeywords, getWordCounts, getWordDocumentMap, getWordDocumentMap, initialize, initialize, processData, processResult, processResultSet, reset
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WordCountDocKeywordProcessor
public WordCountDocKeywordProcessor()
getWords
protected void getWords(IResult result,
java.lang.String text,
java.lang.String resultKey)
- Description copied from class:
AbstractDocumentKeywordProcessor
- Subclasses must implement this method: extract keywords from the text for the document
given by resultKey. The implemented method should call the addWord( ) method with each keyword
or word.
- Specified by:
getWords in class AbstractDocumentKeywordProcessor
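The contract described for getWords, where a subclass extracts words from the text and reports each one through addWord, can be sketched with a simplified stand-in for the abstract base class. Everything below is hypothetical: SketchKeywordCollector and WhitespaceWordCollector are invented for illustration, the IResult parameter is omitted, and the real addWord overloads are not shown on this page, so their signatures may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Raritan API.
class GetWordsSketch {
    public static void main(String[] args) {
        Map<String, Integer> counts =
            new WhitespaceWordCollector().collect("to be or not to be", "doc-1");
        System.out.println(counts); // prints {to=2, be=2, or=1, not=1}
    }
}

// Simplified stand-in for the abstract base class.
abstract class SketchKeywordCollector {
    // Word -> number of times addWord was called for it.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Record one occurrence of a word. The resultKey identifies the source
    // document; this sketch accepts it but does not track words per document.
    protected void addWord(String word, String resultKey) {
        counts.merge(word, 1, Integer::sum);
    }

    // Subclasses extract words from the text and call addWord for each one.
    protected abstract void getWords(String text, String resultKey);

    // Drive the extraction and return the accumulated counts.
    Map<String, Integer> collect(String text, String resultKey) {
        getWords(text, resultKey);
        return counts;
    }
}

// One possible extraction strategy: lower-case and split on whitespace.
class WhitespaceWordCollector extends SketchKeywordCollector {
    @Override
    protected void getWords(String text, String resultKey) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                addWord(token, resultKey);
            }
        }
    }
}
```

Keeping the counting in the base class and the tokenization in the subclass means different extraction strategies can share the same bookkeeping.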
isKeyword
public boolean isKeyword(WordCount wordCount)
- Specified by:
isKeyword in class AbstractDocumentKeywordProcessor
initialize
public void initialize(org.w3c.dom.Element elem)
- Description copied from interface:
IResultSetProcessor
- Initialize the processor from an XML Element.
- Specified by:
initialize in interface IResultSetProcessor
- Overrides:
initialize in class AbstractDocumentKeywordProcessor
setLowRange
public void setLowRange(double lowRange)
getLowRange
public double getLowRange()
setHighRange
public void setHighRange(double highRange)
getHighRange
public double getHighRange()
getConfigurationXML
public java.lang.String getConfigurationXML()
- Specified by:
getConfigurationXML in interface IGatewayOutputProcessor