com.raritantechnologies.HTML
Class HTMLScraperFilter
java.lang.Object
com.raritantechnologies.HTML.HTMLScraperFilter
- All Implemented Interfaces:
- IConfigurable, IStringFilter
- public class HTMLScraperFilter
- extends java.lang.Object
- implements IStringFilter
Uses an HTMLScraper to convert a URL into an XML string
scraped from the HTML page that the URL references.
XML Configuration Template:
<StringFilter class="com.raritantechnologies.HTML.HTMLScraperFilter"
xmlScraperConfig="[Scraper Config XML file]"
xslTransform="[optional XSL transform file]"
httpMethod="get(default) | post | none (if input string is HTML)"
dataXPath="[path into SearchProcess for input string]"
outputType="HTML|XML(default)"
startsWith="[ beginning of output page ]" >
<!-- Alternative to httpMethod (which assumes that input string is a URL) -->
<!-- Use SearchProcess and/or LoginProcess input map: input string is -->
<!-- a term -->
<LoginProcess>
<Step URL="[URL of login page]" >
<params>
<param formName="UserName" />
</params>
</Step>
</LoginProcess>
<SearchProcess dataXPath="[ alternate way of specifying dataXPath ]" >
</SearchProcess>
</StringFilter>
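The template above marks `get` and `XML` as the defaults for `httpMethod` and `outputType`. The following is a minimal sketch, using only the JDK's DOM parser, of how a filled-in `<StringFilter>` element could be parsed and how those documented defaults could be applied when an attribute is absent. The class name `ScraperFilterConfigSketch`, the helper `attributeOrDefault`, and the file name `scraper-config.xml` are hypothetical and chosen for illustration; this is not the filter's actual implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ScraperFilterConfigSketch {

    // Hypothetical filled-in template; the scraper config file name is a placeholder.
    static final String SAMPLE_CONFIG =
        "<StringFilter class=\"com.raritantechnologies.HTML.HTMLScraperFilter\"\n"
      + "              xmlScraperConfig=\"scraper-config.xml\"\n"
      + "              httpMethod=\"post\" />";

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    // Element.getAttribute returns an empty string when the attribute is absent,
    // so an empty value is treated as "use the documented default".
    static String attributeOrDefault(Element elem, String name, String fallback) {
        String value = elem.getAttribute(name);
        return value.isEmpty() ? fallback : value;
    }

    public static void main(String[] args) {
        Element elem = parse(SAMPLE_CONFIG);
        System.out.println("httpMethod = " + attributeOrDefault(elem, "httpMethod", "get"));
        System.out.println("outputType = " + attributeOrDefault(elem, "outputType", "XML"));
    }
}
```

An element parsed this way has the `org.w3c.dom.Element` type that `initialize` accepts.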
Developed by
Raritan Technologies Inc.
- Author:
- Ted Sullivan
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HTMLScraperFilter
public HTMLScraperFilter()
HTMLScraperFilter
public HTMLScraperFilter(java.lang.String xmlScraperFile)
HTMLScraperFilter
public HTMLScraperFilter(java.lang.String xmlScraperFile,
java.lang.String httpRequestMethod)
initialize
public void initialize(org.w3c.dom.Element elem)
- Description copied from interface:
IConfigurable
- Initializes the object from an XML tag or element.
This method is called by the Framework as part of the application initialization.
See ConfigurationManager, XMLConfigurationManager, XMLSearchFieldMapFactory, XMLSearchSourceFactory.
Configurable objects that are owned or contained by other configurable objects are initialized
by the parent object.
- Specified by:
initialize in interface IConfigurable
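The Framework normally supplies the element passed to `initialize`. Outside the Framework (for example in a unit test), the element could be assembled directly with the DOM API. The sketch below assumes only the attribute names from the configuration template. The class `InitializeElementSketch`, the method `buildFilterElement`, and the file name are hypothetical, and the call into `HTMLScraperFilter` is shown as a comment because the library is not assumed to be on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class InitializeElementSketch {

    // Builds a <StringFilter> element carrying the attributes named in the template.
    static Element buildFilterElement(String scraperConfigFile, String httpMethod) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element filter = doc.createElement("StringFilter");
            filter.setAttribute("class", "com.raritantechnologies.HTML.HTMLScraperFilter");
            filter.setAttribute("xmlScraperConfig", scraperConfigFile);
            filter.setAttribute("httpMethod", httpMethod);
            doc.appendChild(filter);
            return filter;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser available", e);
        }
    }

    public static void main(String[] args) {
        Element elem = buildFilterElement("scraper-config.xml", "get");
        // With the library available, the element would be handed over like this:
        //   HTMLScraperFilter filter = new HTMLScraperFilter();
        //   filter.initialize(elem);
        System.out.println(elem.getTagName() + " httpMethod=" + elem.getAttribute("httpMethod"));
    }
}
```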
getConfigurationXML
public java.lang.String getConfigurationXML()
- Specified by:
getConfigurationXML in interface IStringFilter
getConfigurationXML
public java.lang.String getConfigurationXML(java.lang.String configurationTemplate)
- Specified by:
getConfigurationXML in interface IStringFilter
filterString
public java.lang.String filterString(java.lang.String inputStr)
- Specified by:
filterString in interface IStringFilter
filterString
public java.lang.String filterString(java.util.Map parameters,
java.lang.String inputString)
- Specified by:
filterString in interface IStringFilter
filterString
public java.lang.String filterString(java.lang.String sessionID,
java.lang.String inputStr)
- Specified by:
filterString in interface IStringFilter
getSearchProcessPage
public java.lang.String getSearchProcessPage(java.lang.String sessionID,
java.lang.String inputString)
setSearchProcess
public void setSearchProcess(org.w3c.dom.Element searchProcess)
setLoginProcess
public void setLoginProcess(org.w3c.dom.Element loginProcess)
setXMLScraperFile
public void setXMLScraperFile(java.lang.String xmlScraperFile)
setHttpRequestMethod
public void setHttpRequestMethod(java.lang.String httpRequestMethod)
setXSLTransformFile
public void setXSLTransformFile(java.lang.String xslTransformFile)
setXSLTransformer
public void setXSLTransformer(javax.xml.transform.Transformer xslTransformer)
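`setXSLTransformer` takes a standard `javax.xml.transform.Transformer`, so one can be compiled with the JDK's `TransformerFactory` and tried against sample XML before it is handed to the filter. The sketch below is illustrative only: the stylesheet, the `<page>`/`<item>` sample structure, and the class `XslTransformerSketch` are hypothetical, and the shape of the XML that the scraper actually emits is not specified in this page.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslTransformerSketch {

    // Hypothetical stylesheet: copies each item's name attribute into a <title> element.
    static final String TITLES_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/page\">"
      + "<titles><xsl:for-each select=\"item\">"
      + "<title><xsl:value-of select=\"@name\"/></title>"
      + "</xsl:for-each></titles>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles a stylesheet held in a string into a Transformer.
    static Transformer newTransformer(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Invalid stylesheet", e);
        }
    }

    // Applies the transformer to an XML string and returns the result as a string.
    static String apply(Transformer transformer, String xml) {
        try {
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Transformer transformer = newTransformer(TITLES_XSL);
        String sample = "<page><item name=\"A\"/><item name=\"B\"/></page>";
        System.out.println(apply(transformer, sample));
        // With the library available: filter.setXSLTransformer(transformer);
    }
}
```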
setHTMLScraperConfig
public void setHTMLScraperConfig(org.w3c.dom.Document htmlScraperConfig)
setUserName
public void setUserName(java.lang.String userName)
setPassword
public void setPassword(java.lang.String password)
setDataXPath
public void setDataXPath(java.lang.String dataXPath)
setOutputType
public void setOutputType(java.lang.String outputType)