com.raritantechnologies.concept.classifier
Class PhraseDocumentMatcher

java.lang.Object
  extended by com.raritantechnologies.concept.classifier.BasicDocumentMatcher
      extended by com.raritantechnologies.concept.classifier.PhraseDocumentMatcher
All Implemented Interfaces:
IConfigurable, IDocumentMatcher, ITermExtractor

public class PhraseDocumentMatcher
extends BasicDocumentMatcher
implements IDocumentMatcher

Determines if a document contains a phrase of one or more tokens.

XML Configuration Template:
  <DocumentMatcher class="com.raritantechnologies.concept.classifier.PhraseDocumentMatcher"
                      phrase="[ the phrase to match ]"
                      caseSensitive="[ true|false(default) ]"
                      allCapsAcronymFilter="[ true|false(default) ]" />
 

Developed by Raritan Technologies Inc.

Author:
Ted Sullivan

Constructor Summary
PhraseDocumentMatcher()
           
PhraseDocumentMatcher(java.lang.String phrase)
           
PhraseDocumentMatcher(java.lang.String phrase, boolean caseSensitive)
           
 
Method Summary
protected  void collectPhraseSet(java.util.HashSet phraseSet)
           
protected  void collectTermSet(java.util.HashSet termSet)
           
 void extractTerms(IndexedDocument fromDocument, java.util.HashMap termsMap)
          Extracts the matching terms contained in the document.
 void extractTerms(IndexedDocument fromDocument, java.util.Set termsSet)
           
 DocumentMatchBean getMatchCriteria(IndexedDocument document, java.util.Map termsMap)
          Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match).
 java.lang.String getPhrase()
           
 void initialize(org.w3c.dom.Element elem)
          Initializes the object from an XML tag or element.
 boolean isStopWord(IndexedDocument document)
          Adds stop word support.
 boolean matches(IndexedDocument document)
          Returns true if the matcher matches the IndexedDocument, false otherwise.
 java.lang.String render()
          Renders a human-readable version of the matcher's logic.
 void setCaseSensitive(boolean caseSensitive)
           
 void setTerms(java.lang.String phrase)
           
 
Methods inherited from class com.raritantechnologies.concept.classifier.BasicDocumentMatcher
addAttribute, addTerms, addTermsAsAttributes, extractTerms, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.raritantechnologies.concept.classifier.IDocumentMatcher
addAttribute, addTermsAsAttributes, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName
 
Methods inherited from interface com.raritantechnologies.utils.tagging.ITermExtractor
extractTerms
 

Constructor Detail

PhraseDocumentMatcher

public PhraseDocumentMatcher()

PhraseDocumentMatcher

public PhraseDocumentMatcher(java.lang.String phrase)

PhraseDocumentMatcher

public PhraseDocumentMatcher(java.lang.String phrase,
                             boolean caseSensitive)
Method Detail

matches

public boolean matches(IndexedDocument document)
Description copied from interface: IDocumentMatcher
Returns true if the matcher matches the IndexedDocument, false otherwise.

Specified by:
matches in interface IDocumentMatcher
Specified by:
matches in class BasicDocumentMatcher
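
The class description says the matcher determines whether a document contains a phrase of one or more tokens. This page does not document the tokenizer or the IndexedDocument API, so the following is only a minimal sketch of token-level phrase matching under stated assumptions: text is split on whitespace, and a case-insensitive match lowercases both sides. The class name PhraseMatchSketch and its methods are hypothetical and are not part of this library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Assumption: tokens are whitespace-delimited. The library's real
    // tokenizer is not described on this page.
    static List<String> tokenize(String text, boolean caseSensitive) {
        String normalized = caseSensitive ? text : text.toLowerCase(Locale.ROOT);
        String trimmed = normalized.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // True when the phrase tokens occur as a contiguous run in the document tokens.
    static boolean containsPhrase(String documentText, String phrase, boolean caseSensitive) {
        List<String> docTokens = tokenize(documentText, caseSensitive);
        List<String> phraseTokens = tokenize(phrase, caseSensitive);
        if (phraseTokens.isEmpty()) {
            return false; // an empty phrase matches nothing in this sketch
        }
        return Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(containsPhrase("The New York Times", "new york", false)); // true
        System.out.println(containsPhrase("The New York Times", "new york", true));  // false
    }
}
```

Matching on tokens rather than on raw substrings keeps "new york" from matching inside "knew yorkshire", which is the usual reason to treat a phrase as a token sequence.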

isStopWord

public boolean isStopWord(IndexedDocument document)
Description copied from interface: IDocumentMatcher
Adds stop word support. This is typically done by checking whether the matcher's terms are stop words by calling the IndexedDocument method isStopWord(String). See TermDocumentMatcher.

Specified by:
isStopWord in interface IDocumentMatcher
Overrides:
isStopWord in class BasicDocumentMatcher
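
How this class applies the stop word check to a multi-token phrase is not stated here. One plausible reading, shown purely as an assumption, is that a phrase counts as a stop word only when every one of its tokens is a stop word. In the sketch below, a java.util.function.Predicate stands in for the document's per-word stop word lookup; StopWordSketch and isStopPhrase are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

public class StopWordSketch {

    // Assumption: the phrase is a stop word only if all of its
    // whitespace-delimited tokens are stop words.
    static boolean isStopPhrase(String phrase, Predicate<String> isStopWord) {
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        for (String token : trimmed.split("\\s+")) {
            if (!isStopWord.test(token)) {
                return false; // one content word is enough to keep the phrase
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in stop word list; a real document would supply its own lookup.
        Set<String> stopWords = new HashSet<>(Arrays.asList("of", "the", "and"));
        System.out.println(isStopPhrase("of the", stopWords::contains));           // true
        System.out.println(isStopPhrase("state of the art", stopWords::contains)); // false
    }
}
```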

getMatchCriteria

public DocumentMatchBean getMatchCriteria(IndexedDocument document,
                                          java.util.Map termsMap)
Description copied from interface: IDocumentMatcher
Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match). Adds any contained terms or phrases to the termsMap.

Specified by:
getMatchCriteria in interface IDocumentMatcher
Overrides:
getMatchCriteria in class BasicDocumentMatcher

extractTerms

public void extractTerms(IndexedDocument fromDocument,
                         java.util.HashMap termsMap)
Description copied from interface: IDocumentMatcher
Extracts the matching terms contained in the document.

Specified by:
extractTerms in interface IDocumentMatcher
Specified by:
extractTerms in class BasicDocumentMatcher

extractTerms

public void extractTerms(IndexedDocument fromDocument,
                         java.util.Set termsSet)
Specified by:
extractTerms in interface IDocumentMatcher

initialize

public void initialize(org.w3c.dom.Element elem)
Description copied from interface: IConfigurable
Initializes the object from an XML tag or element. This method is called by the Framework as part of the application initialization. See ConfigurationManager, XMLConfigurationManager, XMLSearchFieldMapFactory, XMLSearchSourceFactory. Configurable objects that are owned or contained by other configurable objects will be initialized by the parent object.

Specified by:
initialize in interface IConfigurable
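
To illustrate how the attributes in the XML Configuration Template could be read from an org.w3c.dom.Element, here is a minimal sketch using only the standard JAXP and DOM APIs. It is not the library's implementation: PhraseConfigSketch, fromElement, and parse are hypothetical names, and the only behavior taken from this page is the attribute names and the documented default of false for caseSensitive and allCapsAcronymFilter.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class PhraseConfigSketch {
    String phrase;
    boolean caseSensitive;
    boolean allCapsAcronymFilter;

    // Element.getAttribute returns "" for a missing attribute, and
    // Boolean.parseBoolean("") is false, which gives the documented default.
    static boolean readFlag(Element elem, String name) {
        return Boolean.parseBoolean(elem.getAttribute(name));
    }

    static PhraseConfigSketch fromElement(Element elem) {
        PhraseConfigSketch config = new PhraseConfigSketch();
        config.phrase = elem.getAttribute("phrase");
        config.caseSensitive = readFlag(elem, "caseSensitive");
        config.allCapsAcronymFilter = readFlag(elem, "allCapsAcronymFilter");
        return config;
    }

    // Helper for the example only: parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element elem = parse("<DocumentMatcher phrase=\"New York\" caseSensitive=\"true\" />");
        PhraseConfigSketch config = fromElement(elem);
        // prints: New York true false
        System.out.println(config.phrase + " " + config.caseSensitive + " " + config.allCapsAcronymFilter);
    }
}
```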

setTerms

public void setTerms(java.lang.String phrase)

setCaseSensitive

public void setCaseSensitive(boolean caseSensitive)

collectTermSet

protected void collectTermSet(java.util.HashSet termSet)
Specified by:
collectTermSet in class BasicDocumentMatcher

collectPhraseSet

protected void collectPhraseSet(java.util.HashSet phraseSet)
Specified by:
collectPhraseSet in class BasicDocumentMatcher

getPhrase

public java.lang.String getPhrase()

render

public java.lang.String render()
Description copied from interface: IDocumentMatcher
Renders a human-readable version of the matcher's logic.

Specified by:
render in interface IDocumentMatcher