com.raritantechnologies.HTML
Class HTMLScraperPageImportRenderer

java.lang.Object
  extended by com.raritantechnologies.searchApp.taglibrary.PageImportRenderer
      extended by com.raritantechnologies.HTML.HTMLScraperPageImportRenderer
All Implemented Interfaces:
IConfigurable, IPageContextRenderer

public class HTMLScraperPageImportRenderer
extends PageImportRenderer
implements IPageContextRenderer

Uses an HTMLScraper to retrieve the final page. The scraper may need to perform a deep scrape to reach a specific page, as described by a SearchProcess and a set of Scraper XML configuration files.

A login process may also be used to get pages from secure sites.

Use the URLPageImportRenderer if only a simple GET or POST request is needed. The HTMLScraperPageImportRenderer is intended for "heavy-duty" page imports that may require several steps to reach the target page.

XML Configuration Template:
   <SystemObject type="PageImportRenderer" name="[The SysObject Name]"
                    configurableClass="com.raritantechnologies.HTML.HTMLScraperPageImportRenderer"
                    scraperConfig="[scraper config file name]"
                    resultIsHTML="false|true" >

    <Fields>
      <Field ID="[the field ID]" xPath="path into SearchProcess" />

      <Field ID="[the field ID]" xPath="path into SearchProcess" >
         <StringFilter class="[ an IStringFilter ]" >
         </StringFilter>
      </Field>

    </Fields>

    <LoginProcess>

    </LoginProcess>

    <SearchProcess>

    </SearchProcess>

    <OutputStringFilter class="[ class of com.raritantechnologies.utils.filter.IStringFilter ]" >

    </OutputStringFilter>

   </SystemObject>
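
The template above can be read with the standard Java DOM API. The following is a minimal sketch, not the renderer's actual implementation: the class name ScraperConfigSketch, the object name "ExampleImport", the file name "exampleScraper.xml", the field ID "query", and the xPath value "/SearchProcess/Query" are all hypothetical values chosen for illustration. It only shows how the attributes and Field entries of such a SystemObject element could be extracted.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ScraperConfigSketch {

    // Hypothetical filled-in configuration; every value is invented for illustration.
    static final String CONFIG =
        "<SystemObject type=\"PageImportRenderer\" name=\"ExampleImport\""
      + " configurableClass=\"com.raritantechnologies.HTML.HTMLScraperPageImportRenderer\""
      + " scraperConfig=\"exampleScraper.xml\" resultIsHTML=\"true\">"
      + "<Fields>"
      + "<Field ID=\"query\" xPath=\"/SearchProcess/Query\"/>"
      + "</Fields>"
      + "<LoginProcess/>"
      + "<SearchProcess/>"
      + "</SystemObject>";

    // Parse an XML string and return its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    // Collect each Field's ID and xPath attribute, in document order.
    static Map<String, String> readFields(Element root) {
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList nodes = root.getElementsByTagName("Field");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element field = (Element) nodes.item(i);
            fields.put(field.getAttribute("ID"), field.getAttribute("xPath"));
        }
        return fields;
    }

    public static void main(String[] args) {
        Element root = parse(CONFIG);
        System.out.println("scraperConfig = " + root.getAttribute("scraperConfig"));
        System.out.println("resultIsHTML = " + Boolean.parseBoolean(root.getAttribute("resultIsHTML")));
        readFields(root).forEach((id, path) -> System.out.println("field " + id + " -> " + path));
    }
}
```

How the renderer itself interprets each attribute is not described on this page; the sketch only illustrates the shape of the configuration element.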
 

Developed by Raritan Technologies Inc.

Author:
Ted Sullivan

Nested Class Summary
 
Nested classes inherited from class com.raritantechnologies.searchApp.taglibrary.PageImportRenderer
PageImportRenderer.PageElement
 
Field Summary
 
Fields inherited from class com.raritantechnologies.searchApp.taglibrary.PageImportRenderer
caching
 
Constructor Summary
HTMLScraperPageImportRenderer()
           
 
Method Summary
 java.lang.String getPage(RaritanPageContext pContext)
          Returns an HTML page or page fragment given a set of request parameters.
 void initialize(org.w3c.dom.Element elem)
          Initializes the object from an XML tag or element.
 java.lang.String render(RaritanPageContext pContext)
          Returns the tag body.
 
Methods inherited from class com.raritantechnologies.searchApp.taglibrary.PageImportRenderer
addPageElement, getAddPersistent, getConfigurationXML, getFragmentFile, getPageHeader, getPageName, getPageTrailer, setAddPersistent, setPageHeader, setPageName, setPageTrailer, setStringFilter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLScraperPageImportRenderer

public HTMLScraperPageImportRenderer()

Method Detail

render

public java.lang.String render(RaritanPageContext pContext)
Description copied from interface: IPageContextRenderer
Returns the tag body.

Specified by:
render in interface IPageContextRenderer
Overrides:
render in class PageImportRenderer

getPage

public java.lang.String getPage(RaritanPageContext pContext)
Description copied from class: PageImportRenderer
Returns an HTML page or page fragment given a set of request parameters.

Overrides:
getPage in class PageImportRenderer
Parameters:
pContext - contains request and session parameters needed to execute the page retrieval.
Returns:
A string containing the page data.

initialize

public void initialize(org.w3c.dom.Element elem)
Description copied from interface: IConfigurable
Initializes the object from an XML tag or element. This method is called by the Framework as part of the application initialization. See ConfigurationManager, XMLConfigurationManager, XMLSearchFieldMapFactory, XMLSearchSourceFactory. Configurable objects that are owned or contained by other configurable objects will be initialized by the parent object.

Specified by:
initialize in interface IConfigurable
Overrides:
initialize in class PageImportRenderer
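
The rule that contained configurable objects are initialized by their parent can be sketched as follows. This is an illustration under stated assumptions, not the framework's code: Configurable is a local stand-in for IConfigurable (only the initialize(Element) signature is taken from this page), and ParentRenderer, ChildFilter, and the attribute values used are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ParentInitSketch {

    // Local stand-in for IConfigurable; only the method signature comes from the documentation.
    interface Configurable {
        void initialize(Element elem);
    }

    // Hypothetical child object configured from an <OutputStringFilter> element.
    static final class ChildFilter implements Configurable {
        String className;

        @Override
        public void initialize(Element elem) {
            className = elem.getAttribute("class");
        }
    }

    // Hypothetical parent: reads its own attributes, then initializes the child it owns.
    static final class ParentRenderer implements Configurable {
        String scraperConfig;
        ChildFilter outputFilter;

        @Override
        public void initialize(Element elem) {
            scraperConfig = elem.getAttribute("scraperConfig");
            NodeList children = elem.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element && "OutputStringFilter".equals(child.getNodeName())) {
                    outputFilter = new ChildFilter();
                    // The parent, not the framework, passes the nested element to the child.
                    outputFilter.initialize((Element) child);
                }
            }
        }
    }

    // Parse the XML and run the parent's initialization, as a framework might do at startup.
    static ParentRenderer configure(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            ParentRenderer renderer = new ParentRenderer();
            renderer.initialize(root);
            return renderer;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        ParentRenderer renderer = configure(
            "<SystemObject scraperConfig=\"exampleScraper.xml\">"
          + "<OutputStringFilter class=\"example.UpperCaseFilter\"/>"
          + "</SystemObject>");
        System.out.println(renderer.scraperConfig + " / " + renderer.outputFilter.className);
    }
}
```

In this arrangement the framework only needs to know about the top-level object; nested elements stay private to the parent that owns them.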