java.lang.Object
  com.raritantechnologies.searchApp.formatters.KeywordExtractor
Extracts the words or phrases in one or more result fields that appear in a set of "match" values, and adds
the extracted terms to an IResult output field.
Can also be used to extract the set of keywords matched by one or more IDocumentMatchers
during a prior DocumentClassifier formatting operation.
For example, it can create a "keywords" field by extracting keywords from other fields in the result.
The list of valid phrases can be defined statically in the configuration file,
obtained from an ITermExtractor (an entity extractor) or a data file,
or looked up from a SearchSource.
An optional IStringFilter can be added to modify each
extracted term when terms need to be modified before insertion into the keyword result field.
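The matching step described above can be pictured with a small sketch. This is not the KeywordExtractor implementation: the class `KeywordMatchSketch`, the method `extract`, and the case-insensitive whole-word comparison are all assumptions made for illustration, and the sample phrases are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration only; not part of the searchApp library. */
class KeywordMatchSketch {

    /**
     * Returns the phrases from matchValues that occur in any of the field
     * values. Comparison is case-insensitive and on whole words (an assumption).
     */
    static List<String> extract(List<String> fieldValues, Set<String> matchValues) {
        Set<String> found = new LinkedHashSet<>();
        for (String value : fieldValues) {
            // Lower-case, replace punctuation with spaces, and pad with spaces
            // so that only whole words or phrases can match.
            String padded = " "
                    + value.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]+", " ").trim()
                    + " ";
            for (String phrase : matchValues) {
                String needle = " " + phrase.toLowerCase(Locale.ROOT).trim() + " ";
                if (padded.contains(needle)) {
                    found.add(phrase); // the set drops duplicates across fields
                }
            }
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        // Invented sample data: two input field values and three match values.
        List<String> fields = Arrays.asList(
                "Solar panels and wind turbines power the grid.",
                "Battery storage is growing.");
        Set<String> phrases = new LinkedHashSet<>(
                Arrays.asList("wind turbines", "battery storage", "nuclear"));
        System.out.println(extract(fields, phrases));
    }
}
```

In this sketch the returned list plays the role of the new "keywords" output field; the real formatter writes its terms into the IResult instead.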
<Formatter
    formatterClass="com.raritantechnologies.searchApp.formatters.KeywordExtractor"
    outputField="[ name of new keyword field in result ]"
    matchCase="[ UPPER | LOWER ]"
    matcherNameField="[ name of field with IDocumentMatcher names (added by previous DocumentClassifier) ]"
    documentClassifier="[ name of Document Classifier ]"
    tokenizerString="[ optional string to use for tokenization ]"
    numDuplicates="[ optional number to boost the number of keyword duplicates ]" >
  <!-- result fields to search for keywords -->
  <InputFields>
    <Field ID="[result field name]" />
    <Field ID="[another field name]" />
    <!-- etc. -->
  </InputFields>
  <!-- can use a Term Extractor to get keywords -->
  <TermExtractor class="[ class of com.raritantechnologies.utils.tagging.ITermExtractor ]" >
    <!-- configuration parameters for TermExtractor -->
    <!-- If the Term Extractor has the ability to extract different entity types: set the mapping -->
    <!-- between extracted entity type and output result field -->
    <EntityTypeFieldMap startsWith="true">
      <!-- One or more Field tags: -->
      <Field ID="[ result field ID ]" entityType="[ type of entity ]" />
    </EntityTypeFieldMap>
  </TermExtractor>
  <!-- Alternatively, can use keyword or thesaurus files, which should have one keyword/phrase per line. -->
  <KeywordFile fileName="[ name of keyword file ]" charSet="[ optional char set to use ]" >
    <StringFilter class="[ class of com.raritantechnologies.utils.filter.IStringFilter ]" >
      <!-- configuration parameters for String Filter -->
    </StringFilter>
  </KeywordFile>
  <!-- Can list more than one keyword file -->
  <KeywordFile fileName="[ name of second keyword file ]" >
    <StringFilter class="[ class of com.raritantechnologies.utils.filter.IStringFilter ]" >
      <!-- configuration parameters for String Filter -->
    </StringFilter>
  </KeywordFile>
  <!-- Can also add one or more RTI search sources that have keyword sets -->
  <SearchSource sourceName="[ name of search source ]" >
    <!-- query parameters to get keyword results -->
    <QueryParam param="[query param name]" value="[query param value]" />
    <QueryParam param="[another query param]" value="[another value]" />
    <!-- Lookups can also be DYNAMIC: query value derived from another result field -->
    <QueryParam param="[ name of source query param ]"
                queryField="[ name of result field to get value for query ]" />
    <!-- etc... -->
    <!-- output fields to extract keywords from -->
    <OutputField ID="[result field name]" />
    <OutputField ID="[another result field name]" />
    <!-- etc... -->
    <!-- filter to apply to result fields -->
    <StringFilter class="[ class of com.raritantechnologies.utils.filter.IStringFilter ]" >
      <!-- configuration parameters for String Filter -->
    </StringFilter>
  </SearchSource>
</Formatter>
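Because initialize takes an org.w3c.dom.Element, a configuration like the template above has to be parsed into a DOM element at some point. The sketch below shows one way that might look using only the standard JAXP API. The class `FormatterConfigSketch`, its helper methods, and the sample values ("keywords", "title", "abstract", "keywords.txt") are hypothetical, and the call to the formatter itself is left out because the library is not assumed to be on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical illustration only; not part of the searchApp library. */
class FormatterConfigSketch {

    // Invented sample values filled into a reduced version of the template.
    static final String SAMPLE_CONFIG =
            "<Formatter formatterClass=\"com.raritantechnologies.searchApp.formatters.KeywordExtractor\""
          + " outputField=\"keywords\" matchCase=\"LOWER\">"
          + "<InputFields><Field ID=\"title\"/><Field ID=\"abstract\"/></InputFields>"
          + "<KeywordFile fileName=\"keywords.txt\"/>"
          + "</Formatter>";

    /** Parses the XML text and returns its root Formatter element. */
    static Element parseFormatter(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid formatter configuration", e);
        }
    }

    /** Collects the ID attribute of every Field inside an InputFields element. */
    static List<String> inputFieldIds(Element formatter) {
        List<String> ids = new ArrayList<>();
        NodeList groups = formatter.getElementsByTagName("InputFields");
        for (int g = 0; g < groups.getLength(); g++) {
            NodeList fields = ((Element) groups.item(g)).getElementsByTagName("Field");
            for (int i = 0; i < fields.getLength(); i++) {
                ids.add(((Element) fields.item(i)).getAttribute("ID"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Element formatter = parseFormatter(SAMPLE_CONFIG);
        System.out.println(formatter.getAttribute("outputField"));
        System.out.println(inputFieldIds(formatter));
        // An Element like this one is the kind of argument that
        // initialize(org.w3c.dom.Element) accepts; that call is not shown here.
    }
}
```

Restricting the Field lookup to InputFields matters because EntityTypeFieldMap also contains Field tags, which carry a different meaning.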
| Field Summary |
| Fields inherited from interface com.raritantechnologies.searchApp.IFieldFormatter |
TEMPLATE
| Constructor Summary |
KeywordExtractor()
| Method Summary |
java.lang.String formatField(java.lang.String fieldVal)
    Reformats a field value.
java.lang.String formatField(java.lang.String sessionID, java.lang.String fieldVal)
    Reformats a field value.
void formatResultField(IResult res)
    Formats a result field "in place".
void formatResultField(java.lang.String sessionID, IResult res)
    Formats a result field "in place", incorporating session context.
java.lang.String getConfigurationXML()
java.lang.String getConfigurationXML(java.lang.String configurationTemplate)
java.lang.String getFieldName()
    Returns the name of the result field that this formatter can reformat.
void initialize(org.w3c.dom.Element elem)
    Initializes the formatter from configuration XML element.
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
| Constructor Detail |
public KeywordExtractor()
| Method Detail |
public void formatResultField(IResult res)
    Description copied from interface: IFieldFormatter
    Formats a result field "in place".
    Specified by: formatResultField in interface IFieldFormatter
    Parameters:
        res - The result object that is to be formatted.
public void formatResultField(java.lang.String sessionID, IResult res)
    Description copied from interface: IFieldFormatter
    Formats a result field "in place", incorporating session context.
    Specified by: formatResultField in interface IFieldFormatter
    Parameters:
        sessionID - The session key needed to look up any session content stored in the session data cache.
        res - The result object that is to be formatted.
public java.lang.String getFieldName()
    Description copied from interface: IFieldFormatter
    Returns the name of the result field that this formatter can reformat.
    Specified by: getFieldName in interface IFieldFormatter
public java.lang.String formatField(java.lang.String fieldVal)
    Description copied from interface: IFieldFormatter
    Reformats a field value.
    Specified by: formatField in interface IFieldFormatter
    Parameters:
        fieldVal - The field value to be reformatted.
public java.lang.String formatField(java.lang.String sessionID, java.lang.String fieldVal)
    Description copied from interface: IFieldFormatter
    Reformats a field value.
    Specified by: formatField in interface IFieldFormatter
    Parameters:
        sessionID - The session key needed to look up any session content stored in the session data cache.
        fieldVal - The field value to be reformatted.
public void initialize(org.w3c.dom.Element elem)
    Specified by: initialize in interface IFieldFormatter
public java.lang.String getConfigurationXML()
    Specified by: getConfigurationXML in interface IFieldFormatter
public java.lang.String getConfigurationXML(java.lang.String configurationTemplate)
    Specified by: getConfigurationXML in interface IFieldFormatter