com.opensymphony.module.sitemesh.parser
Class HTMLPageParser

java.lang.Object
  extended by com.opensymphony.module.sitemesh.parser.HTMLPageParser
All Implemented Interfaces:
PageParser
Direct Known Subclasses:
DivExtractingPageParser

public class HTMLPageParser
extends java.lang.Object
implements PageParser

Builds an HTMLPage object from an HTML document. It behaves similarly to FastPageParser, but it is a complete rewrite that makes custom features, such as extraction and transformation of elements, simpler to add.

To customize the rules used, extend this class and override the addUserDefinedRules() method.
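
The extension point can be illustrated with a minimal, self-contained sketch. This page does not describe the State and PageBuilder types, so the sketch uses hypothetical stand-ins (StubState, StubPageBuilder, StubHtmlPageParser) rather than the real SiteMesh classes. Their methods, and the assumption that parse() invokes the hook after registering default rules, are illustrative only. What the sketch shares with the documented API is the hook's shape: a protected addUserDefinedRules(state, page) method that a subclass overrides.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class HookSketch {

    /** Hypothetical stand-in for State: records which tags have a rule. */
    static class StubState {
        final List<String> ruleTags = new ArrayList<>();

        void addRule(String tagName) {
            ruleTags.add(tagName);
        }
    }

    /** Hypothetical stand-in for PageBuilder: collects name=value properties. */
    static class StubPageBuilder {
        final List<String> properties = new ArrayList<>();

        void addProperty(String name, String value) {
            properties.add(name + "=" + value);
        }
    }

    /** Stand-in base parser: registers a default rule, then calls the hook. */
    static class StubHtmlPageParser {
        StubPageBuilder parse(char[] data) throws IOException {
            if (data == null) {
                throw new IOException("no data to parse");
            }
            StubState html = new StubState();
            StubPageBuilder page = new StubPageBuilder();
            html.addRule("title");            // assumed default rule
            addUserDefinedRules(html, page);  // extension hook
            page.addProperty("rules", String.join(",", html.ruleTags));
            return page;
        }

        /** Default hook: adds nothing in this sketch. */
        protected void addUserDefinedRules(StubState html, StubPageBuilder page) {
        }
    }

    /** Subclass that contributes one extra rule through the hook. */
    static class DivCollectingParser extends StubHtmlPageParser {
        @Override
        protected void addUserDefinedRules(StubState html, StubPageBuilder page) {
            // Keep whatever the base class registers (a no-op in this sketch).
            super.addUserDefinedRules(html, page);
            html.addRule("div");
        }
    }

    public static void main(String[] args) throws IOException {
        char[] data = "<html><body><div id='a'>x</div></body></html>".toCharArray();
        StubPageBuilder page = new DivCollectingParser().parse(data);
        System.out.println(page.properties);
    }
}
```

Calling super.addUserDefinedRules() first is a defensive choice in the sketch: it keeps any rules the base class contributes, so the subclass only adds to them.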

Author:
Joe Walnes
See Also:
HTMLProcessor

Constructor Summary
HTMLPageParser()
           
 
Method Summary
protected  void addUserDefinedRules(State html, PageBuilder page)
           
 Page parse(char[] data)
          This builds a Page.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLPageParser

public HTMLPageParser()
Method Detail

parse

public Page parse(char[] data)
           throws java.io.IOException
Description copied from interface: PageParser
This builds a Page.

Specified by:
parse in interface PageParser
Throws:
java.io.IOException

addUserDefinedRules

protected void addUserDefinedRules(State html,
                                   PageBuilder page)

www.opensymphony.com/sitemesh/