com.raritantechnologies.concept.classifier
Class TermDocumentMatcher

java.lang.Object
  extended by com.raritantechnologies.concept.classifier.BasicDocumentMatcher
      extended by com.raritantechnologies.concept.classifier.TermDocumentMatcher
All Implemented Interfaces:
IConfigurable, IDocumentMatcher, ITermExtractor

public class TermDocumentMatcher
extends BasicDocumentMatcher
implements IDocumentMatcher

Document matcher that determines if a particular term is present in the document.

XML Configuration Template:
  <DocumentMatcher class="com.raritantechnologies.concept.classifier.TermDocumentMatcher"
                      term="[ the term to match ]"
                      caseSensitive="[ true|false(default) ]"
                      allCapsAcronymFilter="[ true|false(default) ]"
                      stemming="[ true|false(default) ]"  />
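
As an illustration of how the template's attributes could be read, here is a minimal sketch that uses only the standard JDK DOM API. It is not the library's initialize(Element) implementation. The class MatcherConfigSketch, its helpers parse and flag, and the sample term "solar" are hypothetical. The sketch also assumes that a missing boolean attribute means false, as the "false(default)" notes suggest.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the Raritan library.
class MatcherConfigSketch {

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XML: " + xml, e);
        }
    }

    // Reads a boolean attribute. Element.getAttribute returns "" when the
    // attribute is absent, so absent flags evaluate to false (assumed default).
    static boolean flag(Element elem, String name) {
        return Boolean.parseBoolean(elem.getAttribute(name));
    }

    public static void main(String[] args) {
        Element elem = parse(
            "<DocumentMatcher"
            + " class=\"com.raritantechnologies.concept.classifier.TermDocumentMatcher\""
            + " term=\"solar\" caseSensitive=\"true\" />");
        System.out.println("term                 = " + elem.getAttribute("term"));
        System.out.println("caseSensitive        = " + flag(elem, "caseSensitive"));
        System.out.println("allCapsAcronymFilter = " + flag(elem, "allCapsAcronymFilter"));
        System.out.println("stemming             = " + flag(elem, "stemming"));
    }
}
```

An Element obtained this way has the same type that initialize(org.w3c.dom.Element) accepts. How the real class interprets each attribute is not documented on this page.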
 

Developed by Raritan Technologies Inc.

Author:
Ted Sullivan

Constructor Summary
TermDocumentMatcher()
           
TermDocumentMatcher(java.lang.String term)
           
TermDocumentMatcher(java.lang.String term, boolean caseSensitive)
           
TermDocumentMatcher(java.lang.String term, boolean caseSensitive, boolean useStemming)
           
TermDocumentMatcher(java.lang.String term, boolean caseSensitive, java.lang.Double maximumDocumentFrequency)
           
 
Method Summary
protected  void collectPhraseSet(java.util.HashSet phraseSet)
           
protected  void collectTermSet(java.util.HashSet termSet)
           
 void extractTerms(IndexedDocument fromDocument, java.util.HashMap termsMap)
          Extracts the matching terms contained in the document.
 void extractTerms(IndexedDocument fromDocument, java.util.Set termsSet)
           
 DocumentMatchBean getMatchCriteria(IndexedDocument document, java.util.Map termsMap)
          Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match).
 void initialize(org.w3c.dom.Element elem)
          Initializes the object from an XML tag or element.
 boolean isStopWord(IndexedDocument document)
          Adds stop word support.
 boolean matches(IndexedDocument document)
          Returns true if the matcher matches the IndexedDocument, false otherwise.
 java.lang.String render()
          Renders a human-readable version of the matcher's logic.
 void setCaseSensitive(boolean caseSensitive)
           
 void setSubstringMatch(boolean substringMatch)
           
 void setTerm(java.lang.String term)
           
 void setUseStemming(boolean useStemming)
           
 
Methods inherited from class com.raritantechnologies.concept.classifier.BasicDocumentMatcher
addAttribute, addTerms, addTermsAsAttributes, extractTerms, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.raritantechnologies.concept.classifier.IDocumentMatcher
addAttribute, addTermsAsAttributes, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName
 
Methods inherited from interface com.raritantechnologies.utils.tagging.ITermExtractor
extractTerms
 

Constructor Detail

TermDocumentMatcher

public TermDocumentMatcher()

TermDocumentMatcher

public TermDocumentMatcher(java.lang.String term)

TermDocumentMatcher

public TermDocumentMatcher(java.lang.String term,
                           boolean caseSensitive)

TermDocumentMatcher

public TermDocumentMatcher(java.lang.String term,
                           boolean caseSensitive,
                           boolean useStemming)

TermDocumentMatcher

public TermDocumentMatcher(java.lang.String term,
                           boolean caseSensitive,
                           java.lang.Double maximumDocumentFrequency)
Method Detail

getMatchCriteria

public DocumentMatchBean getMatchCriteria(IndexedDocument document,
                                          java.util.Map termsMap)
Description copied from interface: IDocumentMatcher
Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match). Adds any contained terms or phrases to the termsMap.

Specified by:
getMatchCriteria in interface IDocumentMatcher
Overrides:
getMatchCriteria in class BasicDocumentMatcher

matches

public boolean matches(IndexedDocument document)
Description copied from interface: IDocumentMatcher
Returns true if the matcher matches the IndexedDocument, false otherwise.

Specified by:
matches in interface IDocumentMatcher
Specified by:
matches in class BasicDocumentMatcher
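
To make the idea of a term being "present" concrete, the following self-contained sketch checks for a whole-token match with an optional case-sensitivity flag. It is an assumption-laden stand-in, not the library's algorithm. The real method operates on an IndexedDocument, whose API is not shown on this page. The class TermPresenceSketch, the method containsTerm, and the tokenization rule (split on non-alphanumeric characters) are all hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration, not part of the Raritan library.
class TermPresenceSketch {

    // Returns true if 'term' occurs as a whole token in 'text'.
    // Tokens are split on runs of non-alphanumeric characters (an assumption).
    static boolean containsTerm(String text, String term, boolean caseSensitive) {
        String needle = caseSensitive ? term : term.toLowerCase(Locale.ROOT);
        String haystack = caseSensitive ? text : text.toLowerCase(Locale.ROOT);
        for (String token : haystack.split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "NASA launched a new probe.";
        System.out.println(containsTerm(text, "nasa", false)); // prints "true"
        System.out.println(containsTerm(text, "nasa", true));  // prints "false"
        System.out.println(containsTerm(text, "prob", false)); // prints "false"
    }
}
```

The sketch leaves out stemming and substring matching, which the class exposes through setUseStemming and setSubstringMatch but does not describe here.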

isStopWord

public boolean isStopWord(IndexedDocument document)
Description copied from interface: IDocumentMatcher
Adds stop word support. This is typically done by checking whether the matcher's terms are stop words by calling the IndexedDocument method isStopWord(String). See TermDocumentMatcher.

Specified by:
isStopWord in interface IDocumentMatcher
Overrides:
isStopWord in class BasicDocumentMatcher
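
The delegation described above might look roughly like the sketch below. Because the IndexedDocument API is not reproduced on this page, the interface StopWordSource is a hypothetical stand-in for its isStopWord(String) method. The class StopWordSketch and the method termIsStopWord are likewise illustrative names, not library code.

```java
import java.util.Set;

// Hypothetical illustration, not part of the Raritan library.
class StopWordSketch {

    // Stand-in for the one IndexedDocument method the description mentions.
    interface StopWordSource {
        boolean isStopWord(String word);
    }

    // A single-term matcher counts as a stop word when the document
    // reports its term as one. A null term is never a stop word.
    static boolean termIsStopWord(String term, StopWordSource document) {
        return term != null && document.isStopWord(term);
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("the", "and", "of");
        StopWordSource document = stopWords::contains;
        System.out.println(termIsStopWord("the", document));   // prints "true"
        System.out.println(termIsStopWord("solar", document)); // prints "false"
    }
}
```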

extractTerms

public void extractTerms(IndexedDocument fromDocument,
                         java.util.HashMap termsMap)
Description copied from interface: IDocumentMatcher
Extracts the matching terms contained in the document.

Specified by:
extractTerms in interface IDocumentMatcher
Specified by:
extractTerms in class BasicDocumentMatcher

extractTerms

public void extractTerms(IndexedDocument fromDocument,
                         java.util.Set termsSet)
Specified by:
extractTerms in interface IDocumentMatcher

initialize

public void initialize(org.w3c.dom.Element elem)
Description copied from interface: IConfigurable
Initializes the object from an XML tag or element. This method is called by the Framework as part of the application initialization. See ConfigurationManager, XMLConfigurationManager, XMLSearchFieldMapFactory, XMLSearchSourceFactory. Configurable objects that are owned or contained by other configurable objects will be initialized by the parent object.

Specified by:
initialize in interface IConfigurable

setTerm

public void setTerm(java.lang.String term)

setCaseSensitive

public void setCaseSensitive(boolean caseSensitive)

setSubstringMatch

public void setSubstringMatch(boolean substringMatch)

collectTermSet

protected void collectTermSet(java.util.HashSet termSet)
Specified by:
collectTermSet in class BasicDocumentMatcher

collectPhraseSet

protected void collectPhraseSet(java.util.HashSet phraseSet)
Specified by:
collectPhraseSet in class BasicDocumentMatcher

render

public java.lang.String render()
Description copied from interface: IDocumentMatcher
Renders a human-readable version of the matcher's logic.

Specified by:
render in interface IDocumentMatcher

setUseStemming

public void setUseStemming(boolean useStemming)