Using the DOM


The HTML 4 pilot parses incoming HTML documents, builds a DOM representation, and then lays out and renders the content. Once the DOM is built, you can access the information in the DOM and listen for DOM events using standard mechanisms specified by the W3C. The ice.pilots.html4 package implements the DOM Level 2 specification.

Note: To enable DOM2 API support, add the following code to your application.
    private static void setup_dom2(StormBase storm) {
        try {
            Class c = Class.forName("ice.dom.DOMImplementationImpl");
            DOM d = (DOM) c.newInstance();
            DOM.setInstance(storm, d);
        } catch (ClassNotFoundException cnfe) {
            if (Debug.trace) Debug.trace("dom2 " + cnfe);
        } catch (InstantiationException ie) {
            if (Debug.trace) Debug.trace("dom2 " + ie);
        } catch (IllegalAccessException iae) {
            if (Debug.trace) Debug.trace("dom2 " + iae);
        }
    }
 
DOM Fixer

If the content contains invalid HTML, the DOM may be expanded to contain additional nodes so that it conforms with the DTD for HTML 4.01.

For example, the DTD indicates that a <TABLE> element cannot be a child of a <P> element, which is the standard adopted by Netscape Communicator 4.6. However, both Mozilla and MSIE 5 allow a <TABLE> element to be a child of a <P> element. Due to the widespread use of such constructs, most ICEbrowser applications need to allow this as well.

Consider the following markup:

<P align=center> foo
<TABLE>
</TABLE>
foo2
 

In this example, foo, <TABLE>, </TABLE>, and foo2 are expected to be centered. However, the standard stipulates that only foo should be centered. As a result, the ICEbrowser DTD makes the same assumptions.

For more information on the HTML DOM fixer and the architecture of the HTML 4 pilot, see HTML Rendering Engine.

Accessing the DOM

To access the DOM:

  1. Set up a listener for the contentLoader, end property change event, which notifies you when the DOM has been fully constructed.
  2. After the DOM has been built, get a reference to the pilot and cast it to an instance of ice.pilots.html4.ThePilot.
  3. Call the getDocument( ) method, which returns an instance of org.w3c.dom.Document.

For example:

// Get DOM Document object from HTML 4 pilot
org.w3c.dom.Document doc =
    ((ice.pilots.html4.ThePilot) p).getDocument();
 

Once you have a reference to the document, you can use the standard DOM 2 API to access information about the document or to create your own elements and add them to the document.
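As a rough illustration of the standard DOM calls involved, the sketch below looks up an element and appends a new one. ThePilot is not available outside ICEbrowser, so the sketch assumes a stand-in document created with the JDK parser API; in an application, the document would instead come from getDocument( ). The names DomAccessSketch, newStandInDocument, and appendParagraph are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomAccessSketch {

    // Stand-in for ((ThePilot) p).getDocument(): a document holding
    // <html><body/></html>, built with the JDK parser API. This is an
    // assumption made only so the sketch runs outside ICEbrowser.
    static Document newStandInDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            html.appendChild(doc.createElement("body"));
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates a <p> element with the given text and appends it to <body>.
    static Element appendParagraph(Document doc, String text) {
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        Element p = doc.createElement("p");
        p.appendChild(doc.createTextNode(text));
        body.appendChild(p);
        return p;
    }

    public static void main(String[] args) {
        Document doc = newStandInDocument();
        appendParagraph(doc, "Added through the DOM API");
        System.out.println("paragraphs: "
                + doc.getElementsByTagName("p").getLength());
    }
}
```

Only org.w3c.dom interfaces are used after the document is obtained, so the same calls should apply to a document returned by the pilot.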

You can also cast the document to an ice.pilots.html4.DDocument, which is the ICEbrowser implementation of the document interface. The DDocument class offers an API extended beyond org.w3c.dom.Document. For more information, refer to the API documentation.

Accessing the DOM with JavaScript

You can access the DOM using standard JavaScript calls. For example, this snippet of JavaScript shows how to retrieve an element by its ID and then change the value of an attribute:

function changeTableWidth(px) {
    var table = document.getElementById("mainTable");
    table.setAttribute("width", px);
}
 

For related information, see Accessing Java Objects from JavaScript.

Listening for DOM Events

You can access elements of the DOM directly and add your own event listener, or you can use the API of ice.pilots.html4.ThePilot to register event listeners that persist across DOM creation and destruction.

For example, the following code adds a listener for mouse clicks on paragraph elements you create:

Element myElement = document.createElement("p");
((EventTarget) myElement).addEventListener("click",
    new MouseClickListener(), false);
 

This example uses the ice.pilots.html4.ThePilot API:

pilot.addPersistentDOMEventListener("click", new MouseClickListener(),
    true);
 

The DOM events that you can listen for are as follows:

Event Type              Events
Mouse Events            click, dblclick, mousedown, mousemove, mouseout, mouseover, mouseup
Key Events              keydown, keypress, keyup
HTML Events             abort, blur, change, error, focus, load, reset, resize, scroll, select, submit, unload
Mutation Events         CSSModified, DOMAttrModified, DOMNodeInserted, DOMNodeRemoved
MSIE-specific Events    contextmenu, help

Example: Intercepting a Mouse Click

To intercept a mouse click on a link, you need to traverse the DOM tree and register your code as an org.w3c.dom.events.EventListener on the DOM nodes that correspond to the A tags. Your code will then be called before the new location is loaded.

Within your listener, you can also prevent the browser from following the link by calling event.preventDefault().
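A minimal sketch of the traversal described above, assuming only the standard org.w3c.dom interfaces: it collects the A elements and registers a click listener on each one. The class name LinkListenerSketch, its methods, and the stand-in document built with the JDK parser API are hypothetical; in an application, the document would come from the pilot.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.w3c.dom.events.EventListener;
import org.w3c.dom.events.EventTarget;

final class LinkListenerSketch {

    // Collects every A element, matching the tag name case-insensitively.
    static List<Element> findLinks(Document doc) {
        List<Element> links = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if ("a".equalsIgnoreCase(e.getTagName())) {
                links.add(e);
            }
        }
        return links;
    }

    // Registers the listener on each link whose node supports DOM events
    // and returns how many registrations were made.
    static int registerOnLinks(Document doc, EventListener listener) {
        int registered = 0;
        for (Element link : findLinks(doc)) {
            if (link instanceof EventTarget) {
                ((EventTarget) link).addEventListener("click", listener, false);
                registered++;
            }
        }
        return registered;
    }

    // Stand-in document with two links (assumption: used only so the
    // sketch runs outside ICEbrowser).
    static Document newStandInDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element body = doc.createElement("body");
            doc.appendChild(body);
            Element first = doc.createElement("a");
            first.setAttribute("href", "http://www.icesoft.com");
            body.appendChild(first);
            Element p = doc.createElement("p");
            body.appendChild(p);
            p.appendChild(doc.createElement("A"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newStandInDocument();
        // The listener cancels the default action (following the link).
        int n = registerOnLinks(doc, evt -> evt.preventDefault());
        System.out.println("listeners registered: " + n);
    }
}
```

Listeners registered this way belong to the current document's nodes, so they would need to be registered again each time a new document is loaded.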

Alternatively, you could register your event listener on the Document only, but then you would have to check the target node of each event and prevent the default action only if the target node is an A element.
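The document-level approach might be sketched as follows, using only the standard org.w3c.dom.events interfaces. The class name LinkClickFilter and its helper methods are hypothetical. Walking up the ancestors of the target is an added assumption, made because a click may land on a node nested inside the A element rather than on the A element itself.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.events.Event;
import org.w3c.dom.events.EventListener;
import org.w3c.dom.events.EventTarget;

final class LinkClickFilter implements EventListener {

    // Returns true if the node is an A element or is nested inside one.
    static boolean isInsideLink(Node node) {
        for (Node n = node; n != null; n = n.getParentNode()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && "a".equalsIgnoreCase(n.getNodeName())) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void handleEvent(Event evt) {
        EventTarget target = evt.getTarget();
        if (target instanceof Node && isInsideLink((Node) target)) {
            evt.preventDefault(); // keep the browser from following the link
        }
    }

    // Stand-in document: <body><a><b>text</b></a><p/></body> (assumption:
    // used only so the sketch runs outside ICEbrowser).
    static Document newStandInDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element body = doc.createElement("body");
            doc.appendChild(body);
            Element link = doc.createElement("a");
            Element bold = doc.createElement("b");
            bold.appendChild(doc.createTextNode("text"));
            link.appendChild(bold);
            body.appendChild(link);
            body.appendChild(doc.createElement("p"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newStandInDocument();
        Node bold = doc.getElementsByTagName("b").item(0);
        Node para = doc.getElementsByTagName("p").item(0);
        System.out.println("b inside link: " + isInsideLink(bold));
        System.out.println("p inside link: " + isInsideLink(para));
    }
}
```

Such a filter would be registered once on the document, for instance with ((EventTarget) doc).addEventListener("click", new LinkClickFilter(), false), so that clicks bubbling up from any link reach it.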

The following is a fragment of sample code for intercepting a mouse click on a link:

public class ClickListener implements EventListener {
    public void handleEvent(org.w3c.dom.events.Event e) {
        // concrete implementation of how to handle the event
        ...
    }
}

// Implement PropertyChangeListener in the top level container.

public void propertyChange(PropertyChangeEvent e) {
    Viewport v = (Viewport) e.getSource();
    String name = e.getPropertyName();
    String val = (String) e.getNewValue();

    // Register the listener once the pilot has finished loading.
    if (name.equals("pilotLoading") && val.equals("end")) {
        if (v.getPilot() instanceof ThePilot) {
            ThePilot pilot = (ThePilot) v.getPilot();
            pilot.addPersistentDOMEventListener("click",
                new ClickListener(), true);
        }
    }
}
 
Saving a DOM Tree Dump File

For debugging purposes, it may be useful to print or save to a file the contents of the DOM for a given Web page. The Parser.java example, described in Example Files, shows how to do this without using a GUI. You can modify and integrate that code to suit your specific purposes.

Saving a DOM tree to a file is also implemented in the Generic RI with the use of a system property and a command line switch:

java <classpath> ice.browser.Main -p http://www.yahoo.com
 

By default, the output is saved to the file ice_output.txt in the current working directory.

You can specify your own location and file name by setting the ice.browser.parseTreeOutput system property, as in this example:

java <classpath> -Dice.browser.parseTreeOutput=c:\temp\dump.txt ice.browser.Main -p http://www.yahoo.com
 

You can use the Generic RI in this manner to obtain a DOM tree dump of any Web page.
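If you need a dump from within your own application, a recursive walk over the standard org.w3c.dom.Node interface is one possible approach. The sketch below is not the Parser.java example or the Generic RI implementation; the class name DomDumpSketch, the output format, and the stand-in document are all assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

final class DomDumpSketch {

    // Returns an indented, one-node-per-line text dump of the subtree.
    static String dump(Node root) {
        StringBuilder out = new StringBuilder();
        dump(root, 0, out);
        return out.toString();
    }

    private static void dump(Node node, int depth, StringBuilder out) {
        String value = node.getNodeValue();
        boolean isText = node.getNodeType() == Node.TEXT_NODE;
        boolean blankText = isText && (value == null || value.trim().isEmpty());
        if (!blankText) { // whitespace-only text nodes are skipped
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes(); // null unless an element
            if (attrs != null) {
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node a = attrs.item(i);
                    out.append(' ').append(a.getNodeName())
                       .append("=\"").append(a.getNodeValue()).append('"');
                }
            }
            if (isText) {
                out.append(": ").append(value.trim());
            }
            out.append('\n');
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            dump(c, depth + 1, out);
        }
    }

    // Stand-in document (assumption: used only so the sketch runs
    // outside ICEbrowser).
    static Document newStandInDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setAttribute("align", "center");
            p.appendChild(doc.createTextNode("foo"));
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(dump(newStandInDocument()));
    }
}
```

The returned string can be printed, as in main above, or written to a file with the standard java.nio.file.Files class.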

Additional DOM Examples

For more examples of accessing and using the DOM, see Example Files.



Copyright 2005. ICEsoft Technologies, Inc.
http://www.icesoft.com
