com.raritantechnologies.HTML
Class HTMLScraperFilter

java.lang.Object
  extended by com.raritantechnologies.HTML.HTMLScraperFilter
All Implemented Interfaces:
IConfigurable, IStringFilter

public class HTMLScraperFilter
extends java.lang.Object
implements IStringFilter

Uses an HTMLScraper to filter a URL to an XML string scraped from the HTML page referenced by the URL.

XML Configuration Template:
   <StringFilter class="com.raritantechnologies.HTML.HTMLScraperFilter"
                    xmlScraperConfig="[Scraper Config XML file]"
                    xslTransform="[ optional XSL transform file]"
                    httpMethod="get(default) | post | none (if input string is HTML)"
                    dataXPath="[path into SearchProcess for input string]"
                    outputType="HTML|XML(default)"
                    startsWith="[ beginning of output page ]" >

     <!-- Alternative to httpMethod (which assumes that input string is a URL) -->
     <!-- Use SearchProcess and/or LoginProcess input map: input string is   -->
     <!-- a term --> 
     <LoginProcess>
         <Step URL="[URL of login page]" >
            <params>
              <param formName="UserName" />
            </params>
         </Step>
     </LoginProcess>

     <SearchProcess dataXPath="[ alternate way of specifying dataXPath ]" >

     </SearchProcess>
   </StringFilter>
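
The template above is handed to the filter as an org.w3c.dom.Element. As a minimal sketch of what that involves, the following parses a filled-in StringFilter tag with the standard JAXP parser and reads two attributes, falling back to the defaults the template documents (get for httpMethod, XML for outputType). The class ScraperFilterConfigSketch, its helper methods, and the file name scraper-config.xml are hypothetical and not part of the Raritan API; how HTMLScraperFilter actually reads its attributes is not shown in this page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the Raritan API.
class ScraperFilterConfigSketch {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid configuration XML", e);
        }
    }

    // Element.getAttribute returns "" for a missing attribute, so an empty
    // value is treated as "not set" and replaced by the fallback.
    static String attributeOrDefault(Element elem, String name, String fallback) {
        String value = elem.getAttribute(name);
        return value.isEmpty() ? fallback : value;
    }

    public static void main(String[] args) {
        String xml =
            "<StringFilter class=\"com.raritantechnologies.HTML.HTMLScraperFilter\""
          + " xmlScraperConfig=\"scraper-config.xml\""   // placeholder file name
          + " httpMethod=\"post\" />";
        Element elem = parseRoot(xml);
        // httpMethod is set explicitly; outputType falls back to the documented default.
        System.out.println(attributeOrDefault(elem, "httpMethod", "get"));
        System.out.println(attributeOrDefault(elem, "outputType", "XML"));
    }
}
```

An element obtained this way has the type that initialize(org.w3c.dom.Element) expects.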
 

Developed by Raritan Technologies Inc.

Author:
Ted Sullivan

Field Summary
 
Fields inherited from interface com.raritantechnologies.utils.filter.IStringFilter
TEMPLATE
 
Constructor Summary
HTMLScraperFilter()
           
HTMLScraperFilter(java.lang.String xmlScraperFile)
           
HTMLScraperFilter(java.lang.String xmlScraperFile, java.lang.String httpRequestMethod)
           
 
Method Summary
 java.lang.String filterString(java.util.Map parameters, java.lang.String inputString)
           
 java.lang.String filterString(java.lang.String inputStr)
           
 java.lang.String filterString(java.lang.String sessionID, java.lang.String inputStr)
           
 java.lang.String getConfigurationXML()
           
 java.lang.String getConfigurationXML(java.lang.String configurationTemplate)
           
 java.lang.String getSearchProcessPage(java.lang.String sessionID, java.lang.String inputString)
           
 void initialize(org.w3c.dom.Element elem)
          Initializes the object from an XML tag or element.
 void setDataXPath(java.lang.String dataXPath)
           
 void setHTMLScraperConfig(org.w3c.dom.Document htmlScraperConfig)
           
 void setHttpRequestMethod(java.lang.String httpRequestMethod)
           
 void setLoginProcess(org.w3c.dom.Element loginProcess)
           
 void setOutputType(java.lang.String outputType)
           
 void setPassword(java.lang.String password)
           
 void setSearchProcess(org.w3c.dom.Element searchProcess)
           
 void setUserName(java.lang.String userName)
           
 void setXMLScraperFile(java.lang.String xmlScraperFile)
           
 void setXSLTransformer(javax.xml.transform.Transformer xslTransformer)
           
 void setXSLTransformFile(java.lang.String xslTransformFile)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLScraperFilter

public HTMLScraperFilter()

HTMLScraperFilter

public HTMLScraperFilter(java.lang.String xmlScraperFile)

HTMLScraperFilter

public HTMLScraperFilter(java.lang.String xmlScraperFile,
                         java.lang.String httpRequestMethod)
Method Detail

initialize

public void initialize(org.w3c.dom.Element elem)
Description copied from interface: IConfigurable
Initializes the object from an XML tag or element. This method is called by the Framework as part of application initialization. See ConfigurationManager, XMLConfigurationManager, XMLSearchFieldMapFactory, XMLSearchSourceFactory. Configurable objects that are owned or contained by other configurable objects are initialized by the parent object.

Specified by:
initialize in interface IConfigurable

getConfigurationXML

public java.lang.String getConfigurationXML()
Specified by:
getConfigurationXML in interface IStringFilter

getConfigurationXML

public java.lang.String getConfigurationXML(java.lang.String configurationTemplate)
Specified by:
getConfigurationXML in interface IStringFilter

filterString

public java.lang.String filterString(java.lang.String inputStr)
Specified by:
filterString in interface IStringFilter

filterString

public java.lang.String filterString(java.util.Map parameters,
                                     java.lang.String inputString)
Specified by:
filterString in interface IStringFilter

filterString

public java.lang.String filterString(java.lang.String sessionID,
                                     java.lang.String inputStr)
Specified by:
filterString in interface IStringFilter
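
The three filterString overloads differ only in the context passed alongside the input string: none, a parameter map, or a session ID. The sketch below shows that shape with a local stand-in interface and a toy implementation that trims its input. StringFilterSketch and TrimmingFilterSketch are hypothetical; they are not the real IStringFilter, and the delegation between overloads is an assumption, since this page does not document how HTMLScraperFilter relates them.

```java
import java.util.Collections;
import java.util.Map;

// Local stand-in mirroring the three documented filterString signatures.
// NOT the real com.raritantechnologies.utils.filter.IStringFilter.
interface StringFilterSketch {
    String filterString(String inputStr);
    String filterString(Map<String, String> parameters, String inputString);
    String filterString(String sessionID, String inputStr);
}

// Toy filter: trims whitespace. The delegation pattern is an assumption.
class TrimmingFilterSketch implements StringFilterSketch {

    // No context: delegate to the map form with an empty parameter map.
    public String filterString(String inputStr) {
        return filterString(Collections.<String, String>emptyMap(), inputStr);
    }

    // Parameter-map form: this toy ignores the parameters.
    public String filterString(Map<String, String> parameters, String inputString) {
        return inputString == null ? null : inputString.trim();
    }

    // Session form: this toy ignores the session ID.
    public String filterString(String sessionID, String inputStr) {
        return filterString(inputStr);
    }

    public static void main(String[] args) {
        StringFilterSketch filter = new TrimmingFilterSketch();
        System.out.println("[" + filter.filterString("session-1", "  term  ") + "]");
    }
}
```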

getSearchProcessPage

public java.lang.String getSearchProcessPage(java.lang.String sessionID,
                                             java.lang.String inputString)

setSearchProcess

public void setSearchProcess(org.w3c.dom.Element searchProcess)

setLoginProcess

public void setLoginProcess(org.w3c.dom.Element loginProcess)
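
setLoginProcess takes an org.w3c.dom.Element, presumably shaped like the LoginProcess block in the configuration template. As a rough sketch, such an element can be built programmatically with the standard DOM API. The class LoginProcessElementSketch, its method, and the example URL are hypothetical; only the tag and attribute names (LoginProcess, Step, URL, params, param, formName) come from the template.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only; not part of the Raritan API.
class LoginProcessElementSketch {

    // Builds <LoginProcess><Step URL="..."><params><param formName="..."/></params></Step></LoginProcess>.
    static Element buildLoginProcess(String loginUrl, String formName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
        Element login = doc.createElement("LoginProcess");
        Element step = doc.createElement("Step");
        step.setAttribute("URL", loginUrl);
        Element params = doc.createElement("params");
        Element param = doc.createElement("param");
        param.setAttribute("formName", formName);
        params.appendChild(param);
        step.appendChild(params);
        login.appendChild(step);
        doc.appendChild(login);
        return login;
    }

    public static void main(String[] args) {
        // Placeholder URL; substitute the real login page.
        Element login = buildLoginProcess("http://example.com/login", "UserName");
        System.out.println(login.getElementsByTagName("param").getLength());
    }
}
```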

setXMLScraperFile

public void setXMLScraperFile(java.lang.String xmlScraperFile)

setHttpRequestMethod

public void setHttpRequestMethod(java.lang.String httpRequestMethod)

setXSLTransformFile

public void setXSLTransformFile(java.lang.String xslTransformFile)

setXSLTransformer

public void setXSLTransformer(javax.xml.transform.Transformer xslTransformer)
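
setXSLTransformer accepts a javax.xml.transform.Transformer. A minimal sketch of producing one from a stylesheet with the standard JAXP TransformerFactory follows. The class XslTransformerSketch, its methods, and the sample stylesheet and input are hypothetical; when and how HTMLScraperFilter applies the transformer is not described in this page.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration only; not part of the Raritan API.
class XslTransformerSketch {

    // Compiles an XSL stylesheet held in a string into a Transformer.
    static Transformer fromStylesheet(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalArgumentException("Invalid stylesheet", e);
        }
    }

    // Applies the transformer to an XML string and returns the result as text.
    static String apply(Transformer transformer, String xml) {
        StringWriter out = new StringWriter();
        try {
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transform failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample stylesheet: emits the text of /results/title.
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:value-of select=\"/results/title\"/>"
          + "</xsl:template></xsl:stylesheet>";
        Transformer transformer = fromStylesheet(xsl);
        System.out.println(apply(transformer, "<results><title>Hello</title></results>"));
    }
}
```

The resulting Transformer is the kind of object setXSLTransformer expects; setXSLTransformFile presumably covers the case where the stylesheet lives in a file.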

setHTMLScraperConfig

public void setHTMLScraperConfig(org.w3c.dom.Document htmlScraperConfig)

setUserName

public void setUserName(java.lang.String userName)

setPassword

public void setPassword(java.lang.String password)

setDataXPath

public void setDataXPath(java.lang.String dataXPath)

setOutputType

public void setOutputType(java.lang.String outputType)