Term Extractors

Term Extractors are classes that can extract a set of 'terms' (words, phrases, names, etc.) from a body of text. The basic interface for this is ITermExtractor. Term Extractors can be used for dynamic tagging, classification, clustering and vocabulary generation.

  • ClassifierTermExtractor: Uses a DocumentClassifier to classify the document text. All terms that match one or more of the classifier's document matchers are extracted.
  • RegExprTermExtractor: Uses regular expressions to extract terms from a string.
  • UIMATermExtractor: Uses a UIMA Analysis Engine to extract a set of terms from document text. The UIMATermExtractor configuration points to a UIMA Text Analysis Engine (TAE) configuration XML file.
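
The general idea behind a regular-expression-based extractor can be sketched as follows. This is only an illustration in Python, not the library's actual API: the `TermExtractor` protocol, the `RegexTermExtractor` class, the `extract_terms` method and the two sample patterns are all hypothetical names chosen for this example. The real ITermExtractor and RegExprTermExtractor signatures are not shown on this page and may differ.

```python
import re
from typing import Protocol


class TermExtractor(Protocol):
    """Hypothetical stand-in for an extractor interface (not the real ITermExtractor)."""

    def extract_terms(self, text: str) -> set[str]:
        ...


class RegexTermExtractor:
    """Collects every substring of the text that matches any configured pattern."""

    def __init__(self, patterns: list[str]) -> None:
        # Compile once so repeated calls do not re-parse the patterns.
        self._patterns = [re.compile(p) for p in patterns]

    def extract_terms(self, text: str) -> set[str]:
        terms: set[str] = set()
        for pattern in self._patterns:
            # group(0) is the whole match, regardless of any capture groups.
            terms.update(m.group(0) for m in pattern.finditer(text))
        return terms


# Assumed sample patterns: upper-case acronyms and ISO-style dates.
extractor: TermExtractor = RegexTermExtractor([
    r"\b[A-Z]{2,}\b",
    r"\b\d{4}-\d{2}-\d{2}\b",
])

terms = extractor.extract_terms("The UIMA TAE was configured on 2009-03-14 using XML.")
print(sorted(terms))  # ['2009-03-14', 'TAE', 'UIMA', 'XML']
```

Returning a set means each term is reported once, however often it occurs in the text. This suits uses such as tagging and vocabulary generation, where the presence of a term matters more than its frequency.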