Package com.raritantechnologies.HTML

Contains classes for HTML processing: web mining, HTML filtering, and related tasks.


Interface Summary
IHTMLScraperProcessor  PostProcessor for HTMLScraper.
IStepProcess           Configurable SearchProcess or LoginProcess Step element processor.
IURLValidator          Interface for objects that can validate a URL and/or a user's access to a URL.
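
To illustrate the kind of check a URL validator might perform, here is a minimal sketch. The method signature of IURLValidator is not shown in this summary, so the `UrlCheck` interface and the `HostAllowList` class below are hypothetical stand-ins, not part of this package. The sketch accepts only http/https URLs whose host appears in an allow list; it does not model per-user access checks.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical interface: an assumed shape, NOT the real IURLValidator.
interface UrlCheck {
    boolean isValid(String url);
}

// Hypothetical implementation: accepts http/https URLs on allow-listed hosts.
final class HostAllowList implements UrlCheck {
    private final Set<String> allowedHosts; // expected in lower case

    HostAllowList(Set<String> allowedHosts) {
        this.allowedHosts = allowedHosts;
    }

    @Override
    public boolean isValid(String url) {
        if (url == null) {
            return false;
        }
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            // Relative or opaque URIs (e.g. mailto:) have no host.
            if (scheme == null || host == null) {
                return false;
            }
            boolean http = scheme.equalsIgnoreCase("http")
                    || scheme.equalsIgnoreCase("https");
            return http && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Malformed input is treated as invalid rather than as an error.
            return false;
        }
    }
}
```

A caller could construct it as `new HostAllowList(Set.of("example.com"))` and pass each candidate URL to `isValid` before fetching it.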
 

Class Summary
AbstractPostProcessor          Base class for IHTMLScraperPostProcessors.
DynamicHttpServletRequest      Wrapper (proxy) for HttpServletRequest that allows parameters to be added as the request passes through a servlet.
HTMLScraper                    Converts an HTML file to an XML DOM object.
HTMLScraperFilter              Uses an HTMLScraper to filter a URL to an XML string scraped from the HTML page referenced by the URL.
HTMLScraperFormatter           Uses an HTMLScraper to extract metadata from HTML and adds these properties to the formatted IResult instance.
HTMLScraperGateway
HTMLScraperPageImportRenderer  Uses an HTMLScraper to get the final page.
HTMLStringFilter               Uses an HTMLFilter to manipulate an HTML string.
HTTPRestOutputProcessor        Uses an HTTP REST API to execute an action on one or more IResult metadata objects.
HTTPRestSearchSource           SearchSource that accesses an HTTP REST API (XML over HTTP).
HTTPRestSearchSourceFactory    SearchSource that accesses an HTTP REST API (XML over HTTP).
HttpServletResponseStub        Proxy for an HttpServletResponse with a substitute output stream that diverts HTTP response output.
PostProcessorStub
RemoveHTMLFilter               Removes HTML markup from an input string.
ServletOutputStreamStub
 

Package com.raritantechnologies.HTML Description

Contains classes for HTML processing: web mining, HTML filtering, and related tasks.
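
As an illustration of the HTML filtering task, the following is a minimal sketch of removing markup from a string. It is not the implementation of RemoveHTMLFilter, whose API is not shown here; the class name `MarkupStripper` and its `strip` method are hypothetical. The sketch uses a regular expression, which is adequate for simple input but is not a substitute for a real HTML parser: it does not decode entities such as `&amp;`, and it does not handle `>` inside attribute values.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of com.raritantechnologies.HTML.
final class MarkupStripper {
    // Comments are matched first so a '>' inside a comment does not end the match early;
    // otherwise any run from '<' to the next '>' is treated as a tag.
    private static final Pattern MARKUP =
            Pattern.compile("<!--.*?-->|<[^>]*>", Pattern.DOTALL);
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    private MarkupStripper() {
    }

    // Replaces each tag or comment with a space, then collapses whitespace runs.
    static String strip(String html) {
        if (html == null) {
            return "";
        }
        String withoutTags = MARKUP.matcher(html).replaceAll(" ");
        return WHITESPACE.matcher(withoutTags).replaceAll(" ").trim();
    }
}
```

Replacing each tag with a space, rather than with nothing, keeps words in adjacent block elements from running together, at the cost of occasionally inserting a space before punctuation that directly follows a closing tag.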