In this section we parse the XML document created in the previous section with a DOM parser. DOM parsing builds an in-memory tree of the XML document, which can be navigated with the DOM API. We iterate over the parsed document and output the element and attribute node values.
Create a JXDocumentBuilderFactory object with the static method newInstance(). The factory object is used to obtain a parser that creates a DOM document tree from an XML document:
Set the ERROR_STREAM and SHOW_WARNINGS attributes on the factory object with the setAttribute() method. The ERROR_STREAM attribute specifies the error stream, and its value is an OutputStream object or a PrintWriter object. The SHOW_WARNINGS attribute specifies whether warnings are shown, and its value is Boolean.TRUE or Boolean.FALSE. Parsing errors, if any, are written to the OutputStream or PrintWriter specified in the ERROR_STREAM attribute. If an ErrorHandler is also set, ERROR_STREAM is not used. When SHOW_WARNINGS is set to Boolean.TRUE, warnings are written as well:
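As a rough point of comparison, the same step can be sketched with the standard JAXP classes that ship with the JDK. This is not the XDK listing: the class name FactorySketch and the helper newFactory() are hypothetical, and the section's own version calls newInstance() on JXDocumentBuilderFactory instead.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class FactorySketch {

    // Creates a factory through the standard JAXP entry point.
    static DocumentBuilderFactory newFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // The JDK factory is not namespace aware by default; namespace
        // awareness is needed for methods such as getElementsByTagNameNS.
        factory.setNamespaceAware(true);
        return factory;
    }

    public static void main(String[] args) {
        System.out.println("Namespace aware: " + newFactory().isNamespaceAware());
    }
}
```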
Creating a DOM document object
Create a JXDocumentBuilder object from the factory object by first creating a DocumentBuilder object with the newDocumentBuilder() method and then casting the DocumentBuilder object to JXDocumentBuilder. JXDocumentBuilder is the implementation class in Oracle XDK 11g for the abstract class DocumentBuilder:
The JXDocumentBuilder object is used to create a DOM document object from an XML document. A Document object may be obtained using the JXDocumentBuilder object with one of the parse() methods in the JXDocumentBuilder class. The input to the parser may be specified as InputSource, InputStream, File object, or a String URI. Create an InputStream for the example XML document and parse the document with the parse(InputStream) method:
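The parse step might look as follows with plain JAXP. This is a minimal sketch under stated assumptions: the inline SAMPLE_XML string is a made-up stand-in for catalog.xml (whose real content is not reproduced here), the class and method names are hypothetical, and checked exceptions are wrapped in an unchecked one only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class ParseSketch {

    // Assumed sample document; only the namespace URI comes from the text.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<catalog xmlns:journal=\"http://xdk.com/catalog/journal\">\n"
        + "  <journal:journal publisher=\"Example Press\">\n"
        + "    <journal:title>Sample Title</journal:title>\n"
        + "  </journal:journal>\n"
        + "</catalog>\n";

    // Parses the XML text through an InputStream, mirroring parse(InputStream).
    static Document parse(String xml) {
        try (InputStream in =
                 new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_XML);
        System.out.println("Root element: " + doc.getDocumentElement().getTagName());
    }
}
```

For a document on disk, a FileInputStream or the parse(File) overload would take the place of the in-memory stream.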
The parse() methods of the JXDocumentBuilder object return a Document object, which may be cast to an XMLDocument object, as the XMLDocument class implements the Document interface.
Outputting the XML document components' values
Output the encoding in the XML document using the getEncoding method, and output the version of the XML document using the getVersion method:
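The standard DOM Level 3 Document interface exposes similar information through getXmlEncoding() and getXmlVersion(). The sketch below uses those methods rather than the XDK getEncoding and getVersion methods named above. The class name and the one-line sample document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class DeclarationSketch {

    // Assumed sample document with an explicit XML declaration.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><catalog/>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Reads the version and encoding recorded from the XML declaration.
    static String describeDeclaration(Document doc) {
        return "version=" + doc.getXmlVersion() + ", encoding=" + doc.getXmlEncoding();
    }

    public static void main(String[] args) {
        System.out.println(describeDeclaration(parse(SAMPLE_XML)));
    }
}
```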
The XMLDocument class has various getter methods to retrieve elements in a document. Some of these methods are listed in the following table:
As an example, retrieve title elements in the namespace http://xdk.com/catalog/journal using the getElementsByTagNameNS method:
Iterate over the NodeList to output element namespace, element namespace prefix, element tag name, and element text. The getNamespaceURI() method returns the namespace URI of an element. The getPrefix() method returns the prefix of an element in a namespace. The getTagName() method returns the element tag name. Element text is obtained by first obtaining the text node within the element node using the getFirstChild() method and subsequently the value of the text node:
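The lookup and iteration described above can be sketched with the standard org.w3c.dom interfaces. The namespace URI is the one used in the text. The sample document, the class name, and the describeTitles helper are assumptions for illustration, and the factory must be namespace aware for getElementsByTagNameNS to find anything.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class TitleSketch {

    static final String JOURNAL_NS = "http://xdk.com/catalog/journal";

    // Assumed sample document; the real catalog.xml may differ.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<catalog xmlns:journal=\"http://xdk.com/catalog/journal\">\n"
        + "  <journal:journal publisher=\"Example Press\">\n"
        + "    <journal:title>Sample Title</journal:title>\n"
        + "  </journal:journal>\n"
        + "</catalog>\n";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects namespace URI, prefix, tag name, and text for each title element.
    static List<String> describeTitles(Document doc) {
        NodeList titles = doc.getElementsByTagNameNS(JOURNAL_NS, "title");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < titles.getLength(); i++) {
            Element title = (Element) titles.item(i);
            Node text = title.getFirstChild(); // text node, or null if the element is empty
            lines.add(title.getNamespaceURI() + " | " + title.getPrefix() + " | "
                + title.getTagName() + " | " + (text == null ? "" : text.getNodeValue()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeTitles(parse(SAMPLE_XML)).forEach(System.out::println);
    }
}
```

Guarding against a null first child avoids a NullPointerException on empty title elements, a case the prose does not cover.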
Obtain the root element of the XML document with the getDocumentElement() method, which returns an Element object. The cast to XMLElement is required only if methods defined solely in the XMLElement class are to be used. We cast here because XMLElement is Oracle XDK 11g's implementation class for the Element interface, and this section discusses Oracle XDK 11g:
Next, we shall iterate over all the subnodes of the root element. Obtain a NodeList of subnodes of the root element with the getChildNodes() method. Create a method iterateNodeList() to iterate over the subnodes of an Element. Iterate over the NodeList and recursively obtain the subelements of the elements in the NodeList. The method hasChildNodes() tests to see if a node has subnodes. Ignorable whitespace is also considered a node, but we are mainly interested in the subelements in a node. The NodeList interface method getLength() returns the length of a node list, and method item(int) returns the Node at a specified index. As class XMLNode is Oracle XDK 11g's implementation class for the Node interface, cast the Node object to XMLNode:
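If a node is of type element, the tag name of the element may be retrieved. Node type is obtained with the getNodeType() method, which returns a short value. The Node interface provides static fields for the different types of nodes in an XML document: ELEMENT_NODE, ATTRIBUTE_NODE, TEXT_NODE, CDATA_SECTION_NODE, ENTITY_REFERENCE_NODE, ENTITY_NODE, PROCESSING_INSTRUCTION_NODE, COMMENT_NODE, DOCUMENT_NODE, DOCUMENT_TYPE_NODE, DOCUMENT_FRAGMENT_NODE, and NOTATION_NODE.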
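A small helper can make the short values returned by getNodeType() readable while debugging. This is a sketch: the class and method names are hypothetical, while the constants are the standard ones defined on org.w3c.dom.Node.

```java
import org.w3c.dom.Node;

// Hypothetical helper class that maps node type codes to constant names.
final class NodeTypeSketch {

    static String typeName(short nodeType) {
        switch (nodeType) {
            case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
            case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
            case Node.TEXT_NODE:                   return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
            case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
            case Node.ENTITY_NODE:                 return "ENTITY_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            case Node.COMMENT_NODE:                return "COMMENT_NODE";
            case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
            case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
            case Node.NOTATION_NODE:               return "NOTATION_NODE";
            default:                               return "UNKNOWN(" + nodeType + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(typeName(Node.ELEMENT_NODE));
    }
}
```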
For an element node, cast the node to XMLElement and output the element tag name:
The attributes in an element node are retrieved with the getAttributes() method, which returns a NamedNodeMap of attributes. The getLength() method of NamedNodeMap returns the number of attributes in the map. The method item(int) returns the attribute node at the specified index. As class XMLAttr implements the Attr interface, cast the returned node to XMLAttr. Iterate over the NamedNodeMap to output the attribute name and value. The hasAttributes() method tests if an element node has attributes:
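The recursive traversal and the attribute loop can be sketched together using only the standard DOM interfaces. Several things here are assumptions rather than the chapter's listing: the sample document, the class name, and the simplified iterateNodeList signature, which takes a StringBuilder instead of an Element parameter so that the output can be inspected. With XDK, the nodes would be cast to XMLNode, XMLElement, and XMLAttr instead.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class WalkSketch {

    // Assumed sample document; the real catalog.xml may differ.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<catalog xmlns:journal=\"http://xdk.com/catalog/journal\">\n"
        + "  <journal:journal publisher=\"Example Press\">\n"
        + "    <journal:title>Sample Title</journal:title>\n"
        + "  </journal:journal>\n"
        + "</catalog>\n";

    // Recursively visits element nodes and records tag names and attributes.
    static void iterateNodeList(NodeList nodes, StringBuilder out) {
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and other non-element nodes
            }
            Element element = (Element) node;
            out.append("Element: ").append(element.getTagName()).append('\n');
            if (element.hasAttributes()) {
                NamedNodeMap attributes = element.getAttributes();
                for (int j = 0; j < attributes.getLength(); j++) {
                    Attr attr = (Attr) attributes.item(j);
                    out.append("  Attribute: ").append(attr.getName())
                       .append('=').append(attr.getValue()).append('\n');
                }
            }
            if (element.hasChildNodes()) {
                iterateNodeList(element.getChildNodes(), out);
            }
        }
    }

    // Parses the XML text and walks the subnodes of the root element.
    static String walk(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            iterateNodeList(doc.getDocumentElement().getChildNodes(), out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(walk(SAMPLE_XML));
    }
}
```

The walk starts from the subnodes of the root element, so the namespace declaration on the root, which the DOM also exposes as an attribute, does not appear in the output.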
Running the Java application
The complete DOMParserApp.java listing follows, with notes on the different sections of the Java class:
1. First, we add the package and import statements.
2. Next, we add Java class DOMParserApp.
3. Then, we add the parseXMLDocument method to parse an XML document.
4. Now, we create the XMLDocument object by parsing the XML document catalog.xml.
5. Here, we output the document character encoding, the XML version, and namespace node values from the parsed XML document.
6. Next, we obtain the subnodes of the root element and invoke the iterateNodeList method to iterate over the subnodes.
7. The iterateNodeList method has an Element parameter, which represents the element with subnodes. The second parameter is of the type NodeList, which is the NodeList of subnodes of the Element represented by the first parameter.
8. Iterate over the NodeList.
9. If a node is of type Element, output the Element tag name and element text.
10. If an Element has attributes, output the attributes.
11. If an Element has subnodes, obtain the NodeList of subnodes and iterate over the NodeList by invoking the iterateNodeList method again.
12. Finally, we add the main method. In the main method, we create an instance of the DOMParserApp class and invoke the parseXMLDocument method.
13. To run the DOMParserApp.java in JDeveloper, right-click on the DOMParserApp.java node in Application Navigator and select Run.
14. The element and attribute values from the XML document are output.
The complete output from the DOM parsing application is as follows:
To demonstrate error handling with the ERROR_STREAM attribute, introduce an error into the example XML document, for example by removing a </journal> tag. Run the DOMParserApp.java application in JDeveloper. An error message is written to the stream specified in the ERROR_STREAM attribute:
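Standard JAXP has no ERROR_STREAM attribute, but a comparable effect can be approximated by registering an org.xml.sax.ErrorHandler that writes to a PrintWriter. The sketch below takes that swapped-in approach. The class name, the helper method, and the message format are assumptions, and the exact wording of the parser message depends on the parser implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; uses only JDK classes, not Oracle XDK.
final class ErrorSketch {

    // Parses the XML text and returns everything the error handler reported.
    static String parseAndCollectErrors(String xml) {
        StringWriter buffer = new StringWriter();
        PrintWriter errors = new PrintWriter(buffer, true);
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    errors.println("warning, line " + e.getLineNumber() + ": " + e.getMessage());
                }
                @Override
                public void error(SAXParseException e) {
                    errors.println("error, line " + e.getLineNumber() + ": " + e.getMessage());
                }
                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    errors.println("fatal, line " + e.getLineNumber() + ": " + e.getMessage());
                    throw e; // a document that is not well formed cannot be parsed further
                }
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (SAXParseException e) {
            // Already reported through the error handler above.
        } catch (Exception e) {
            throw new IllegalStateException("Unexpected parser failure", e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // The closing </journal> tag is deliberately missing.
        System.out.print(parseAndCollectErrors("<catalog><journal></catalog>"));
    }
}
```

Wrapping a FileWriter in the PrintWriter instead of the StringWriter would send the messages to a file, which is closer to what the section describes.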