java.lang.Object
  com.raritantechnologies.concept.classifier.BasicDocumentMatcher
    com.raritantechnologies.concept.classifier.NearDocumentMatcher
Document Matcher that performs a proximity analysis on two or more child matchers.
XML Configuration Template:
<DocumentMatcher class="com.raritantechnologies.concept.classifier.NearDocumentMatcher"
                 terms="[ can specify terms as comma-separated list ]"
                 minimumDistance="[ minimum number of terms separating first and last terms (default=10) ]"
                 caseSensitive="[ true|false(default) ]"
                 isOrdered="[ true|false(default) ]" >
    <!-- Alternatively, can specify two or more child IDocumentMatchers that will contribute to the proximity match -->
    <DocumentMatcher class="[ class of com.raritantechnologies.concept.classifier.IDocumentMatcher ]" >
    </DocumentMatcher>
    <!-- etc. . . -->
</DocumentMatcher>
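The proximity analysis configured by the template above can be illustrated with a small standalone sketch. This is not the library's implementation: the class `ProximitySketch`, its whitespace tokenizer, and the reading of the distance value as the largest allowed gap in token positions between the two terms are all assumptions made for illustration. The `caseSensitive` and `ordered` parameters mirror the template attributes of similar names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of com.raritantechnologies.
class ProximitySketch {

    /** Splits text on whitespace (an assumed stand-in for the real document indexing). */
    public static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Returns every token index at which the term occurs. */
    public static List<Integer> positions(String[] tokens, String term, boolean caseSensitive) {
        List<Integer> found = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            boolean same = caseSensitive
                    ? tokens[i].equals(term)
                    : tokens[i].equalsIgnoreCase(term);
            if (same) {
                found.add(i);
            }
        }
        return found;
    }

    /**
     * Returns true if some occurrence of termOne and some occurrence of termTwo
     * are at most 'distance' token positions apart (assumed interpretation).
     * When 'ordered' is true, termOne must appear before termTwo.
     */
    public static boolean near(String text, String termOne, String termTwo,
                               boolean caseSensitive, int distance, boolean ordered) {
        String[] tokens = tokenize(text);
        List<Integer> first = positions(tokens, termOne, caseSensitive);
        List<Integer> second = positions(tokens, termTwo, caseSensitive);
        for (int p : first) {
            for (int q : second) {
                if (p == q) {
                    continue; // one token cannot satisfy both terms
                }
                int gap = q - p;
                if (ordered && gap < 0) {
                    continue; // termTwo occurs before termOne
                }
                if (Math.abs(gap) <= distance) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        System.out.println(near(text, "quick", "fox", false, 3, true));  // true
        System.out.println(near(text, "fox", "quick", false, 3, true));  // false (wrong order)
        System.out.println(near(text, "FOX", "quick", true, 3, false));  // false (case mismatch)
    }
}
```

Checking every pair of positions is quadratic in the number of occurrences. That is acceptable for a sketch, though a real matcher working on an indexed document would likely merge sorted position lists instead.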
Constructor Summary

NearDocumentMatcher()

NearDocumentMatcher(IDocumentMatcher[] matchers,
                    boolean casesensitive,
                    int minDistance)

NearDocumentMatcher(IDocumentMatcher termOne,
                    IDocumentMatcher termTwo,
                    boolean caseSensitive,
                    int minDistance)

NearDocumentMatcher(java.lang.String termOne,
                    java.lang.String termTwo,
                    boolean caseSensitive,
                    int minDistance)
Method Summary

protected void collectPhraseSet(java.util.HashSet phraseSet)

protected void collectTermSet(java.util.HashSet termSet)

void extractTerms(IndexedDocument fromDocument,
                  java.util.HashMap termsMap)
    Extracts the matching terms contained in the document.

void extractTerms(IndexedDocument fromDocument,
                  java.util.Set termsSet)

DocumentMatchBean getMatchCriteria(IndexedDocument document,
                                   java.util.Map termsMap)
    Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match).

void initialize(org.w3c.dom.Element elem)
    Initializes the object from an XML tag or element.

boolean isStopWord(IndexedDocument document)
    Adds stop word support.

boolean matches(IndexedDocument document)
    Returns true if the matcher matches the IndexedDocument, false otherwise.

java.lang.String render()
    Renders a human-readable version of the matcher's logic.

void setIsOrdered(boolean isOrdered)
Methods inherited from class com.raritantechnologies.concept.classifier.BasicDocumentMatcher
    addAttribute, addTerms, addTermsAsAttributes, extractTerms, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName

Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface com.raritantechnologies.concept.classifier.IDocumentMatcher
    addAttribute, addTermsAsAttributes, getAttribute, getAttributeNames, getMatchCriteria, getName, getPhraseSet, getTermSet, setName

Methods inherited from interface com.raritantechnologies.utils.tagging.ITermExtractor
    extractTerms
Constructor Detail

public NearDocumentMatcher()

public NearDocumentMatcher(java.lang.String termOne,
                           java.lang.String termTwo,
                           boolean caseSensitive,
                           int minDistance)

public NearDocumentMatcher(IDocumentMatcher termOne,
                           IDocumentMatcher termTwo,
                           boolean caseSensitive,
                           int minDistance)

public NearDocumentMatcher(IDocumentMatcher[] matchers,
                           boolean casesensitive,
                           int minDistance)
| Method Detail |
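public boolean matches(IndexedDocument document)
    Description copied from interface: IDocumentMatcher
    Returns true if the matcher matches the IndexedDocument, false otherwise.
    Specified by: matches in interface IDocumentMatcher
    Overrides: matches in class BasicDocumentMatcher

public boolean isStopWord(IndexedDocument document)
    Description copied from interface: IDocumentMatcher
    Adds stop word support. Related class: TermDocumentMatcher.
    Specified by: isStopWord in interface IDocumentMatcher
    Overrides: isStopWord in class BasicDocumentMatcher

public DocumentMatchBean getMatchCriteria(IndexedDocument document,
                                          java.util.Map termsMap)
    Description copied from interface: IDocumentMatcher
    Returns a DocumentMatchBean containing the match criteria (the category or categories that specify the 'reason' or context of the match).
    Specified by: getMatchCriteria in interface IDocumentMatcher
    Overrides: getMatchCriteria in class BasicDocumentMatcher

public void extractTerms(IndexedDocument fromDocument,
                         java.util.HashMap termsMap)
    Description copied from interface: IDocumentMatcher
    Extracts the matching terms contained in the document.
    Specified by: extractTerms in interface IDocumentMatcher
    Overrides: extractTerms in class BasicDocumentMatcher

public void extractTerms(IndexedDocument fromDocument,
                         java.util.Set termsSet)
    Specified by: extractTerms in interface IDocumentMatcher

public void initialize(org.w3c.dom.Element elem)
    Description copied from interface: IConfigurable
    Initializes the object from an XML tag or element.
    Specified by: initialize in interface IConfigurable

protected void collectTermSet(java.util.HashSet termSet)
    Overrides: collectTermSet in class BasicDocumentMatcher

protected void collectPhraseSet(java.util.HashSet phraseSet)
    Overrides: collectPhraseSet in class BasicDocumentMatcher

public void setIsOrdered(boolean isOrdered)

public java.lang.String render()
    Description copied from interface: IDocumentMatcher
    Renders a human-readable version of the matcher's logic.
    Specified by: render in interface IDocumentMatcher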
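
The initialize(org.w3c.dom.Element) method is documented as initializing the object from an XML tag or element. The sketch below shows one way the attributes named in the XML Configuration Template could be read with the standard JDK DOM API. The class `NearConfigSketch` and its fields are hypothetical and are not part of the library. Only the attribute names and the documented defaults (minimumDistance=10, caseSensitive=false, isOrdered=false) come from the template, and the real initialize method may behave differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical illustration only; not part of com.raritantechnologies.
class NearConfigSketch {
    final String[] terms;
    final int minimumDistance;
    final boolean caseSensitive;
    final boolean isOrdered;

    NearConfigSketch(Element elem) {
        // Element.getAttribute returns "" when the attribute is absent.
        String termList = elem.getAttribute("terms").trim();
        terms = termList.isEmpty() ? new String[0] : termList.split("\\s*,\\s*");

        String distance = elem.getAttribute("minimumDistance").trim();
        minimumDistance = distance.isEmpty() ? 10 : Integer.parseInt(distance);

        // Boolean.parseBoolean("") is false, which matches the documented defaults.
        caseSensitive = Boolean.parseBoolean(elem.getAttribute("caseSensitive"));
        isOrdered = Boolean.parseBoolean(elem.getAttribute("isOrdered"));
    }

    /** Parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        Element elem = parse("<DocumentMatcher"
                + " class=\"com.raritantechnologies.concept.classifier.NearDocumentMatcher\""
                + " terms=\"merger, acquisition\" isOrdered=\"true\"/>");
        NearConfigSketch config = new NearConfigSketch(elem);
        System.out.println(String.join("|", config.terms)); // merger|acquisition
        System.out.println(config.minimumDistance);          // 10
        System.out.println(config.caseSensitive);            // false
        System.out.println(config.isOrdered);                // true
    }
}
```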