Converting XML to PDF

by Deepak Vohra | April 2009 | Oracle

In this article by Deepak Vohra, you will learn to convert an XML document to a PDF document. The process involves setting up the environment, parsing the XML document, transforming it to an XSL-FO document, and finally converting the XSL-FO document to a PDF document.

XML is well suited to data exchange, but not to data presentation. Adobe's PDF and Microsoft Excel spreadsheets are commonly used formats for data presentation. If you receive an XML file containing data that needs to be included in a PDF or Excel spreadsheet file, you need to convert the XML file to the relevant format. Some of the commonly used XML-to-PDF conversion tools/APIs are listed in the following table:

Tool/API | Description
iText | iText is a Java library for generating a PDF document. iText may be downloaded from http://www.lowagie.com/iText/.
Stylus Studio XML Publisher | XML Publisher is a report designer, which supports many data sources, including XML, to generate PDF reports.
Stylus Studio XML Editor | XML Editor supports XML conversion to PDF.
Apache FOP | The Apache project provides an open source FO processor called Apache FOP to render an XSL-FO document as a PDF document. We will discuss the Apache FOP processor in this chapter.
XMLMill for Java | XMLMill may be used to generate PDF documents from XML data combined with XSL and XSLT. XMLMill may be downloaded from http://www.xmlmill.com/.
RenderX XEP | RenderX provides an XSL-FO processor called XEP that may be used to generate a PDF document.

We can convert XML file data to a PDF document using any of these tools/APIs in JDeveloper 11g. Here, we will use the Apache FOP API.

The XSL specification consists of two components: a language for transforming XML documents (XSLT), and an XML syntax for specifying formatting objects (XSL-FO). XSL-FO specifies the layout, fonts, and representation of the data. Apache FOP (Formatting Objects Processor) is a print formatter for converting XSL formatting objects (XSL-FO) to an output format such as PDF, PCL, PS, SVG, XML, Print, AWT, MIF, or TXT. In this article, we will convert an XML document to PDF using XSL-FO and the FOP processor in Oracle JDeveloper 11g.

The procedure to create a PDF document from an XML file using the Apache FOP processor in JDeveloper is as follows:

  1. Create an XML document.
  2. Create an XSL stylesheet.
  3. Convert the XML document to an XSL-FO document.
  4. Convert the XSL-FO document to a PDF file.

Setting the environment

We need to download the FOP binary distribution fop-0.20.5-bin.zip from http://archive.apache.org/dist/xmlgraphics/fop/binaries/ and extract the ZIP file to a directory. The Driver class used in this article belongs to the FOP 0.20.x API; later FOP releases replaced it with a different API, so the listings here assume version 0.20.5.

To develop an XML-to-PDF conversion application, we need to create an application (ApacheFOP, for example) and a project (ApacheFOP, for example) in JDeveloper. In the project, add an XML document, catalog.xml, with File | New. In the New Gallery window, select Categories | General | XML and Items | XML Document. Click on OK. In the Create XML File window, specify a File Name, catalog.xml, and click on OK. A catalog.xml file gets added to the ApacheFOP project. Copy the following listing to catalog.xml:

<?xml version="1.0" encoding="UTF-8"?>
<catalog title="Oracle Magazine" publisher="Oracle Publishing">
<journal edition="September-October 2008">
<article>
<title>Share 2.0</title>
<author>Alan Joch</author>
</article>
<article>
<title>Restrictions Apply</title>
<author>Alan Joch</author>
</article>
</journal>
<journal edition="March-April 2008">
<article>
<title>Oracle Database 11g Redux</title>
<author>Tom Kyte</author>
</article>
<article>
<title>Declarative Data Filtering</title>
<author>Steve Muench</author>
</article>
</journal>
</catalog>

We also need to add an XSL stylesheet to convert the XML document to an XSL-FO document. Create an XSL stylesheet with File | New. In the New Gallery window, select Categories | General | XML and Items | XSL Stylesheet. Click on OK. In the Create XSL File window, specify a File Name (catalog.xsl) and click on OK. A catalog.xsl file gets added to the ApacheFOP project.

To convert the XML document to an XSL-FO document and subsequently create a PDF file from the XSL-FO file, we need a Java application. Add a Java class, XMLToPDF.java, with File | New. In the New Gallery window, select Categories | General and Items | Java Class. Click on OK. In the Create Java Class window, specify a class Name (XMLToPDF, for example) and click on OK. A Java class gets added to the ApacheFOP project. The directory structure of the FOP application is shown in the following illustration:

[Illustration: directory structure of the ApacheFOP project]

Next, add the FOP JAR files to the project. Select the project node (ApacheFOP) and then Tools | Project Properties. In the Project Properties window, select Libraries and Classpath. Add the Oracle XML Parser v2 library with the Add Library button. The JAR files required to develop an FOP application are listed in the following table:

JAR File | Description
<FOP>/fop-0.20.5/build/fop.jar | Apache FOP API
<FOP>/fop-0.20.5/lib/batik.jar | Graphics classes
<FOP>/fop-0.20.5/lib/avalon-framework-cvs-20020806.jar | Logger classes
<FOP>/fop-0.20.5/lib/xercesImpl-2.2.1.jar | The DOMParser and the SAXParser classes


The <FOP> variable is the directory in which Apache FOP is installed. Add the JAR files with the Add JAR/Directory button. Click on OK in the Project Properties window.


Converting XML to XSL-FO

In this section, we will convert the example XML document (catalog.xml) to an XSL-FO document. An XSL-FO document includes formatting information about the data to be presented: the layout, fonts, and tables in the document. Its elements belong to the XSL-FO namespace, conventionally bound to the fo prefix with the declaration xmlns:fo="http://www.w3.org/1999/XSL/Format". The root element of the XSL-FO document is fo:root. The XSL-FO document elements are based on the fo.dtd DTD, which may be downloaded from http://www.syntext.com/products/dtd2xs/doc/fo.dtd. Commonly used elements, all of which appear in the stylesheet listed in this section, include fo:root, fo:layout-master-set, fo:simple-page-master, fo:region-body, fo:page-sequence, fo:flow, fo:block, and the table elements fo:table, fo:table-column, fo:table-header, fo:table-body, fo:table-row, and fo:table-cell.
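To make the namespace and root-element requirements concrete, the following is a minimal sketch that holds a small XSL-FO document in a string and checks it with the JDK's namespace-aware DOM parser. The class name MinimalFo, the sample text, and the page dimensions are assumptions chosen for illustration; the check only confirms well-formedness and the root element's namespace, not that an FO processor would accept the document.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class MinimalFo {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Hypothetical minimal XSL-FO document: one A4 page master and one text block.
    static final String MINIMAL_FO =
        "<fo:root xmlns:fo=\"" + FO_NS + "\">"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name=\"simpleA4\""
      + " page-height=\"29.7cm\" page-width=\"21cm\">"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference=\"simpleA4\">"
      + "<fo:flow flow-name=\"xsl-region-body\">"
      + "<fo:block>Hello, XSL-FO</fo:block>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>";

    // Parses the string with namespace awareness and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI/getLocalName
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XSL-FO string", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(MINIMAL_FO);
        System.out.println("Namespace: " + root.getNamespaceURI());
        System.out.println("Root element: " + root.getLocalName());
    }
}
```

The master-reference on fo:page-sequence must name a page master defined in fo:layout-master-set, which is why both use simpleA4 here, as in the article's stylesheet.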


The example XML document to be converted to a PDF document consists of a journal catalog. The XSLT stylesheet catalog.xsl, with which the example XML document is converted to an XSL-FO document, is listed as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format" exclude-result-prefixes="fo">
<xsl:output method="xml" version="1.0" omit-xml-declaration="no"
indent="yes"/>
<!-- ========================= -->
<!-- root element: catalog -->
<!-- ========================= -->
<xsl:template match="/catalog">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="simpleA4" page-height="29.7cm"
page-width="21cm" margin-top="2cm" margin-bottom="2cm"
margin-left="2cm" margin-right="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simpleA4">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Catalog: <xsl:value-of select="@title"/>
</fo:block>
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Publisher: <xsl:value-of select="@publisher"/>
</fo:block>
<fo:block font-size="10pt">
<fo:table table-layout="fixed">
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="5cm"/>
<fo:table-header>
<fo:table-row font-weight="bold"><fo:table-cell>
<fo:block>
<xsl:text>Edition</xsl:text>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:text>Title</xsl:text>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:text>Author</xsl:text>
</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-header>
<fo:table-body>
<xsl:apply-templates select="journal"/>
</fo:table-body>
</fo:table>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
<xsl:template match="journal">
<xsl:for-each select="article">
<fo:table-row>
<fo:table-cell>
<fo:block>
<xsl:value-of select="../@edition"/>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:value-of select="title"/>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:value-of select="author"/>
</fo:block>
</fo:table-cell>
</fo:table-row>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Copy the catalog.xsl listing to the catalog.xsl file in the JDeveloper project in the Application Navigator. Next, we will convert the example XML document to an XSL-FO document in the Java application XMLToPDF.java.
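The journal template in catalog.xsl builds one table row per article and reads the edition from the parent journal element with the path ../@edition. The following sketch reproduces that lookup with the JDK's javax.xml.xpath API rather than XSLT, purely to show how the relative paths resolve. The class name EditionLookupSketch, the abbreviated inline catalog, and the " | " row separator are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EditionLookupSketch {
    // Abbreviated copy of catalog.xml, inlined so the sketch needs no file.
    static final String CATALOG_XML =
        "<catalog title=\"Oracle Magazine\" publisher=\"Oracle Publishing\">"
      + "<journal edition=\"September-October 2008\">"
      + "<article><title>Share 2.0</title><author>Alan Joch</author></article>"
      + "</journal>"
      + "<journal edition=\"March-April 2008\">"
      + "<article><title>Declarative Data Filtering</title><author>Steve Muench</author></article>"
      + "</journal>"
      + "</catalog>";

    // Returns one "edition | title | author" string per article element.
    static List<String> rows(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList articles = (NodeList) xpath.evaluate(
                "/catalog/journal/article",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < articles.getLength(); i++) {
                Node article = articles.item(i);
                // Relative paths are evaluated with the article as context node,
                // mirroring the select attributes in the journal template.
                String edition = xpath.evaluate("../@edition", article);
                String title = xpath.evaluate("title", article);
                String author = xpath.evaluate("author", article);
                rows.add(edition + " | " + title + " | " + author);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        for (String row : rows(CATALOG_XML)) {
            System.out.println(row);
        }
    }
}
```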

Parsing the XML document

Create a DocumentBuilderFactory object using the static method newInstance(). The factory class is used to create a DocumentBuilder parser.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

Create a DocumentBuilder parser from the DocumentBuilderFactory object using the newDocumentBuilder() method.

DocumentBuilder builder = factory.newDocumentBuilder();

Parse the example XML document using one of the overloaded parse() methods.

File xmlFile = new File("catalog.xml");
Document document = builder.parse(xmlFile);
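The three parsing steps above can be tried in isolation with a sketch like the following, which parses an abbreviated catalog from a string instead of a file and reads back a few values from the DOM. The class name CatalogParseSketch and the inline XML are assumptions for illustration; in the article's application, the input is the catalog.xml file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class CatalogParseSketch {
    // Abbreviated copy of catalog.xml, inlined so the sketch needs no file.
    static final String CATALOG_XML =
        "<catalog title=\"Oracle Magazine\" publisher=\"Oracle Publishing\">"
      + "<journal edition=\"September-October 2008\">"
      + "<article><title>Share 2.0</title><author>Alan Joch</author></article>"
      + "</journal>"
      + "<journal edition=\"March-April 2008\">"
      + "<article><title>Declarative Data Filtering</title><author>Steve Muench</author></article>"
      + "</journal>"
      + "</catalog>";

    // Factory -> builder -> parse, as in the steps above, but reading from a string.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML string", e);
        }
    }

    public static void main(String[] args) {
        Document document = parse(CATALOG_XML);
        Element catalog = document.getDocumentElement();
        System.out.println("Title: " + catalog.getAttribute("title"));
        System.out.println("Publisher: " + catalog.getAttribute("publisher"));
        System.out.println("Articles: "
            + document.getElementsByTagName("article").getLength());
    }
}
```

Parsing from an InputSource is one of the overloaded parse() variants; the File overload used in the article behaves the same way once the file is found.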

Generating the XSL-FO document

Create a TransformerFactory object using the static method newInstance(). The factory class is used to create a Transformer object.

TransformerFactory transformerFactory = TransformerFactory.newInstance();

Create a Transformer object from the TransformerFactory object using the newTransformer() method.

File stylesheet = new File("catalog.xsl");
Transformer transformer = transformerFactory.newTransformer(new StreamSource(stylesheet));

Transform the Document object obtained from the example XML document using the method transform(Source, Result). The input XML document may be specified as a DOMSource, SAXSource, or StreamSource object. The transformation output may be specified as DOMResult, SAXResult, or StreamResult.

DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new File("catalog.fo"));
transformer.transform(source, result);
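As a hedged illustration of the Source and Result choices mentioned above, the following sketch runs a transformation entirely in memory, using StreamSource for both inputs and a StreamResult that wraps a StringWriter. The class name InMemoryTransformSketch and the toy stylesheet are assumptions; the stylesheet is far smaller than catalog.xsl and omits the page layout, so its output is not a complete XSL-FO document that FOP could render.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class InMemoryTransformSketch {
    // Abbreviated input document (a cut-down catalog.xml).
    static final String XML =
        "<catalog title=\"Oracle Magazine\">"
      + "<journal edition=\"March-April 2008\">"
      + "<article><title>Declarative Data Filtering</title></article>"
      + "</journal>"
      + "</catalog>";

    // Toy stylesheet (hypothetical): one fo:block for the catalog title,
    // then one fo:block per article title. No layout-master-set is emitted.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
      + "<xsl:template match=\"/catalog\">"
      + "<fo:root>"
      + "<fo:block><xsl:value-of select=\"@title\"/></fo:block>"
      + "<xsl:for-each select=\"journal/article\">"
      + "<fo:block><xsl:value-of select=\"title\"/></fo:block>"
      + "</xsl:for-each>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Writing to a StringWriter is convenient for inspecting the generated markup before it is handed to an FO processor; the article's application writes to the catalog.fo file instead.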

The XSL-FO document, catalog.fo, generated from the example XML document is listed as follows:

<?xml version = '1.0' encoding = 'UTF-8'?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="simpleA4" page-height="29.7cm"
page-width="21cm" margin-top="2cm" margin-bottom="2cm"
margin-left="2cm" margin-right="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simpleA4">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Catalog: Oracle Magazine</fo:block>
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Publisher: Oracle Publishing</fo:block>
<fo:block font-size="10pt">
<fo:table table-layout="fixed">
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="5cm"/>
<fo:table-header>
<fo:table-row font-weight="bold">
<fo:table-cell>
<fo:block>Edition</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Title</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Author</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-header>
<fo:table-body>
<fo:table-row>
<fo:table-cell>
<fo:block>September-October 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Share 2.0</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Alan Joch</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>September-October 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Restrictions Apply</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Alan Joch</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>March-April 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Oracle Database 11g Redux</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Tom Kyte</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>March-April 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Declarative Data Filtering</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Steve Muench</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-body>
</fo:table>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>

Converting XSL-FO to PDF

In this section, we will convert the XSL-FO document generated in the previous section to a PDF document using the FOP driver. In the Java application XMLToPDF.java, import the FOP Driver class and the logger classes required for the conversion.

import org.apache.fop.apps.Driver;
import org.apache.avalon.framework.logger.Logger;
import org.apache.avalon.framework.logger.ConsoleLogger;

Creating the FOP driver

Create an FOP driver object using the constructor for the Driver class.

Driver driver=new Driver();

Create a logger with the level setting LEVEL_INFO using the ConsoleLogger constructor.

Logger logger=new ConsoleLogger(ConsoleLogger.LEVEL_INFO);

Set the logger on the FOP driver using the setLogger method, and set the logger on the MessageHandler using the setScreenLogger method.

driver.setLogger(logger);
org.apache.fop.messaging.MessageHandler.setScreenLogger(logger);

Set the renderer for the FOP driver using the setRenderer method. For conversion to a PDF file, specify Driver.RENDER_PDF as the renderer.

driver.setRenderer(Driver.RENDER_PDF);

Generating the PDF document

Specify the XSL-FO document that is to be converted to a PDF document; here it is the catalog.fo file generated from the XML document earlier in this article. The input is set on the Driver object using the setInputSource(InputSource) method.

File xslFOFile=new File("catalog.fo");
InputStream input=new FileInputStream(xslFOFile);
driver.setInputSource(new InputSource(input));

Specify an output PDF document using the setOutputStream(OutputStream) method.

File pdfFile=new File("catalog.pdf");
OutputStream output=new FileOutputStream(pdfFile);
driver.setOutputStream(output);

Run the FOP driver to generate a PDF document using the run method.

driver.run();

Running the Java application

The Java application, XMLToPDF.java, used for converting an XML document to a PDF document is listed here with explanations:

  1. First, we add the package and import statements.
      package apachefop;
      import org.apache.fop.apps.Driver;
      import java.io.*;
      import org.apache.avalon.framework.logger.ConsoleLogger;
      import org.apache.avalon.framework.logger.Logger;
      import org.apache.fop.apps.FOPException;
      import org.xml.sax.InputSource;
      import javax.xml.transform.*;
      import javax.xml.transform.stream.StreamSource;
      import javax.xml.transform.stream.StreamResult;
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.ParserConfigurationException;
      import javax.xml.transform.dom.DOMSource;
      import org.xml.sax.SAXException;
      import org.w3c.dom.Document;
  2. We define the Java class XMLToPDF.
      public class XMLToPDF
      {
      public XMLToPDF()
      {
      }
  3. Next, we add the Java method xmlToFO to convert an XML document to an XSL-FO document.
      public void xmlToFO(){
      try{
      DocumentBuilderFactory factory =
      DocumentBuilderFactory.newInstance();
      File stylesheet = new File("catalog.xsl");
      File xmlFile = new File("catalog.xml");
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document document = builder.parse(xmlFile);
      TransformerFactory transformerFactory = TransformerFactory.newInstance();
      Transformer transformer = transformerFactory.newTransformer(new StreamSource(stylesheet));
      DOMSource source = new DOMSource(document);
      StreamResult result = new StreamResult(new File("catalog.fo"));
      transformer.transform(source, result);
      }catch(TransformerConfigurationException e)
      {System.err.println("TransformerConfigurationException: "+e.getMessage());}
      catch(TransformerException e)
      {System.err.println("TransformerException: "+e.getMessage());}
      catch(ParserConfigurationException e)
      {System.err.println("ParserConfigurationException: "+e.getMessage());}
      catch(IOException e){System.err.println("IOException: "+e.getMessage());}
      catch(SAXException e){System.err.println("SAXException: "+e.getMessage());}
      }
  4. We add the Java method foToPDF to convert an XSL-FO document to a PDF document.
      public void foToPDF(){
      try{
      Driver driver=new Driver();
      Logger logger=new ConsoleLogger(ConsoleLogger.LEVEL_INFO);
      driver.setLogger(logger);
      org.apache.fop.messaging.MessageHandler
      .setScreenLogger(logger);
      driver.setRenderer(Driver.RENDER_PDF);
      File xslFOFile=new File("catalog.fo");
      File pdfFile=new File("catalog.pdf");
      InputStream input=new FileInputStream(xslFOFile);
      driver.setInputSource(new InputSource(input));
      OutputStream output=new FileOutputStream(pdfFile);
      driver.setOutputStream(output);
      driver.run();
      output.flush();
      output.close();
      }catch(IOException e){System.err.println("IOException: "+e.getMessage());}
      catch(FOPException e){System.err.println("FOPException: "+e.getMessage());}
      }
  5. Finally, we add the main method in which we create an instance of the XMLToPDF class and invoke the xmlToFO and foToPDF methods.
      public static void main(String[] argv){
      XMLToPDF fop=new XMLToPDF();
      fop.xmlToFO();
      fop.foToPDF();
      }
      }

Copy the XMLToPDF.java listing to the XMLToPDF.java class in the JDeveloper project ApacheFOP. To run the Java application, right-click on the Java application node in the Application Navigator, and select Run.


The output from the application run indicates that org.apache.xerces.parsers.SAXParser is used to parse the XML document. A formatting object gets built, the fonts get set, and a PDF document gets generated. Select View | Refresh to add the catalog.pdf document generated from the example XML document catalog.xml to the ApacheFOP project.
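Besides opening the file in a viewer, a quick programmatic sanity check is possible because a PDF file begins with the ASCII bytes %PDF-. The following is a minimal sketch of such a check; the class name PdfHeaderCheck and its method names are assumptions for illustration, and the check says nothing about whether the rest of the file is valid.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PdfHeaderCheck {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given bytes start with the PDF header marker.
    static boolean startsWithPdfHeader(byte[] bytes) {
        if (bytes.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (bytes[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads only the first few bytes of the file and checks them.
    static boolean looksLikePdf(Path file) {
        byte[] head = new byte[PDF_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int total = 0;
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    return false; // file is shorter than the header marker
                }
                total += n;
            }
            return startsWithPdfHeader(head);
        } catch (IOException e) {
            return false; // a missing or unreadable file is not a usable PDF
        }
    }

    public static void main(String[] args) {
        // Defaults to the file name used in the article; pass another path to override.
        Path pdf = Paths.get(args.length > 0 ? args[0] : "catalog.pdf");
        System.out.println(pdf + " looks like a PDF: " + looksLikePdf(pdf));
    }
}
```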


The catalog.pdf document generated is shown as follows:

[Illustration: the generated catalog.pdf document]

We did not include any graphics within the PDF generated, but Apache FOP supports the inclusion of graphics within PDF documents. Refer to http://xmlgraphics.apache.org/fop/0.94/graphics.html for graphics support in Apache FOP.

Summary

In this article, we successfully converted an example XML document, catalog.xml, to a PDF document, catalog.pdf, using the Apache FOP driver in JDeveloper 11g.

About the Author

Deepak Vohra

Deepak Vohra is a consultant and a principal member of the NuBean.com software company. Deepak is a Sun Certified Java Programmer and Web Component Developer, and has worked in the fields of XML and Java programming and J2EE for over five years. Deepak is the co-author of the Apress book Pro XML Development with Java Technology and was the technical reviewer for the O'Reilly book WebLogic: The Definitive Guide. Deepak was also the technical reviewer for the Course Technology PTR book Ruby Programming for the Absolute Beginner, and the technical editor for the Manning Publications book Prototype and Scriptaculous in Action. Deepak is also the author of the Packt Publishing books JDBC 4.0 and Oracle JDeveloper for J2EE Development; Processing XML documents with Oracle JDeveloper 11g; EJB 3.0 Database Persistence with Oracle Fusion Middleware 11g; and Java 7 JAX-WS Web Services.
