Converting XML to PDF


XML is well suited to data exchange, but not to data presentation. Adobe's PDF and Microsoft Excel spreadsheets are commonly used formats for data presentation. If you receive an XML file containing data that needs to be included in a PDF or Excel spreadsheet file, you need to convert the XML file to the relevant format. Some of the commonly used XML-to-PDF conversion tools/APIs are described in the following table:

Tool/API | Description
iText | iText is a Java library for generating a PDF document. iText may be downloaded from http://www.lowagie.com/iText/.
Stylus Studio XML Publisher | XML Publisher is a report designer, which supports many data sources, including XML, to generate PDF reports.
Stylus Studio XML Editor | XML Editor supports XML conversion to PDF.
Apache FOP | The Apache project provides an open source FO processor called Apache FOP to render an XSL-FO document as a PDF document. We will discuss the Apache FOP processor in this chapter.
XMLMill for Java | XMLMill may be used to generate PDF documents from XML data combined with XSL and XSLT. XMLMill may be downloaded from http://www.xmlmill.com/.
RenderX XEP | RenderX provides an XSL-FO processor called XEP that may be used to generate a PDF document.

We can convert the data in an XML file to a PDF document using any of these tools/APIs in JDeveloper 11g. Here, we will use the Apache FOP API.

The XSL specification consists of two components: a language for transforming XML documents (XSLT), and an XML syntax for specifying formatting objects (XSL-FO). XSL-FO specifies the layout, fonts, and representation of the data. Apache FOP (Formatting Objects Processor) is a print formatter for converting XSL formatting objects (XSL-FO) to an output format such as PDF, PCL, PS, SVG, XML, Print, AWT, MIF, or TXT. In this article, we will convert an XML document to PDF using XSL-FO and the FOP processor in Oracle JDeveloper 11g.

The procedure to create a PDF document from an XML file using the Apache FOP processor in JDeveloper is as follows:

  1. Create an XML document.
  2. Create an XSL stylesheet.
  3. Convert the XML document to an XSL-FO document.
  4. Convert the XSL-FO document to a PDF file.

Setting the environment

We need to download the FOP binary distribution fop-0.20.5-bin.zip from http://archive.apache.org/dist/xmlgraphics/fop/binaries/ and extract the ZIP file to a directory. The code in this article uses the Driver class of the 0.20.5 release; later FOP releases replaced that class with a different API, so the listings here target 0.20.5. To develop an XML-to-PDF conversion application, we need to create an application (ApacheFOP, for example) and a project (ApacheFOP, for example) in JDeveloper. In the project, add an XML document, catalog.xml, with File | New. In the New Gallery window, select Categories | General | XML and Items | XML Document. Click on OK. In the Create XML File window, specify a File Name, catalog.xml, and click on OK. A catalog.xml file gets added to the ApacheFOP project. Copy the following listing to catalog.xml:

<?xml version="1.0" encoding="UTF-8"?>
<catalog title="Oracle Magazine" publisher="Oracle Publishing">
<journal edition="September-October 2008">
<article>
<title>Share 2.0</title>
<author>Alan Joch</author>
</article>
<article>
<title>Restrictions Apply</title>
<author>Alan Joch</author>
</article>
</journal>
<journal edition="March-April 2008">
<article>
<title>Oracle Database 11g Redux</title>
<author>Tom Kyte</author>
</article>
<article>
<title>Declarative Data Filtering</title>
<author>Steve Muench</author>
</article>
</journal>
</catalog>

We also need to add an XSL stylesheet to convert the XML document to an XSL-FO document. Create an XSL stylesheet with File | New. In the New Gallery window, select Categories | General | XML and Items | XSL Stylesheet. Click on OK. In the Create XSL File window, specify a File Name (catalog.xsl) and click on OK. A catalog.xsl file gets added to the ApacheFOP project.

To convert the XML document to an XSL-FO document and subsequently create a PDF file from the XSL-FO file, we need a Java application. Add a Java class, XMLToPDF.java, with File | New. In the New Gallery window, select Categories | General and Items | Java Class. Click on OK. In the Create Java Class window, specify a class Name (XMLToPDF, for example) and click on OK. A Java class gets added to the ApacheFOP project. The directory structure of the FOP application is shown in the following illustration:

[Illustration: directory structure of the FOP application]

Next, add the FOP JAR files to the project. Select the project node (ApacheFOP node) and then Tools | Project Properties. In the Project Properties window, select Libraries and Classpath. Add the Oracle XML Parser v2 library with the Add Library button. The JAR files required to develop an FOP application are listed in the following table:

JAR File | Description
<FOP>/fop-0.20.5/build/fop.jar | Apache FOP API
<FOP>/fop-0.20.5/lib/batik.jar | Graphics classes
<FOP>/fop-0.20.5/lib/avalon-framework-cvs-20020806.jar | Logger classes
<FOP>/fop-0.20.5/lib/xercesImpl-2.2.1.jar | The DOMParser and the SAXParser classes

The variable <FOP> is the directory in which Apache FOP is installed. Add the JAR files with the Add JAR/Directory button. Click on OK in the Project Properties window.


Converting XML to XSL-FO

In this section, we will convert the example XML document (catalog.xml) to an XSL-FO document. An XSL-FO document includes formatting information about the data to be presented, such as the layout, fonts, and tables in the document. An XSL-FO document is created in the fo prefix namespace using the namespace declaration xmlns:fo="http://www.w3.org/1999/XSL/Format". The root element of the XSL-FO document is fo:root. A DTD for the XSL-FO elements, fo.dtd, may be downloaded from http://www.syntext.com/products/dtd2xs/doc/fo.dtd. Commonly used elements in an XSL-FO document include fo:root, fo:layout-master-set, fo:simple-page-master, fo:page-sequence, fo:flow, fo:block, and fo:table, all of which appear in the stylesheet in this section.

The example XML document to be converted to a PDF document consists of a journal catalog. The XSLT stylesheet catalog.xsl, which converts the example XML document to an XSL-FO document, is listed as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format" exclude-result-prefixes="fo">
<xsl:output method="xml" version="1.0" omit-xml-declaration="no"
indent="yes"/>
<!-- ========================= -->
<!-- root element: catalog -->
<!-- ========================= -->
<xsl:template match="/catalog">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="simpleA4" page-height="29.7cm"
page-width="21cm" margin-top="2cm" margin-bottom="2cm"
margin-left="2cm" margin-right="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simpleA4">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Catalog: <xsl:value-of select="@title"/>
</fo:block>
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Publisher: <xsl:value-of select="@publisher"/>
</fo:block>
<fo:block font-size="10pt">
<fo:table table-layout="fixed">
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="5cm"/>
<fo:table-header>
<fo:table-row font-weight="bold"><fo:table-cell>
<fo:block>
<xsl:text>Edition</xsl:text>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:text>Title</xsl:text>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:text>Author</xsl:text>
</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-header>
<fo:table-body>
<xsl:apply-templates select="journal"/>
</fo:table-body>
</fo:table>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
<xsl:template match="journal">
<xsl:for-each select="article">
<fo:table-row>
<fo:table-cell>
<fo:block>
<xsl:value-of select="../@edition"/>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:value-of select="title"/>
</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>
<xsl:value-of select="author"/>
</fo:block>
</fo:table-cell>
</fo:table-row>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

Copy the catalog.xsl listing to the catalog.xsl file in the JDeveloper project in the Application Navigator. Next, we will convert the example XML document to an XSL-FO document in the Java application XMLToPDF.java.
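The journal template in the stylesheet relies on two relative XPath expressions: title and author select children of the current article, while ../@edition steps up to the parent journal to read its edition attribute. The following is a minimal sketch of the same selections using the standard javax.xml.xpath API, so that they can be tried outside an XSLT processor. The class name EditionXPathSketch, the rows method, and the abbreviated inline copy of catalog.xml are assumptions made for illustration; they are not part of the ApacheFOP project.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class EditionXPathSketch {

    // Abbreviated copy of catalog.xml, inlined so the sketch is self-contained.
    static final String CATALOG_XML =
        "<catalog title=\"Oracle Magazine\" publisher=\"Oracle Publishing\">"
      + "<journal edition=\"September-October 2008\">"
      + "<article><title>Share 2.0</title><author>Alan Joch</author></article>"
      + "</journal>"
      + "<journal edition=\"March-April 2008\">"
      + "<article><title>Oracle Database 11g Redux</title><author>Tom Kyte</author></article>"
      + "</journal>"
      + "</catalog>";

    // Returns one "edition | title | author" string per article,
    // mirroring the three table cells the stylesheet emits per row.
    static List<String> rows(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList articles = (NodeList) xpath.evaluate(
                "/catalog/journal/article",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < articles.getLength(); i++) {
                Node article = articles.item(i);
                // Same relative expressions as in the journal template.
                String edition = xpath.evaluate("../@edition", article);
                String title = xpath.evaluate("title", article);
                String author = xpath.evaluate("author", article);
                rows.add(edition + " | " + title + " | " + author);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        for (String row : rows(CATALOG_XML)) {
            System.out.println(row);
        }
    }
}
```

Each returned string corresponds to one fo:table-row in the generated XSL-FO document.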

Parsing the XML document

Create a DocumentBuilderFactory object using the static method newInstance(). The factory class is used to create a DocumentBuilder parser.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

Create a DocumentBuilder parser from the DocumentBuilderFactory object using the newDocumentBuilder() method.

DocumentBuilder builder = factory.newDocumentBuilder();

Parse the example XML document using one of the overloaded parse() methods.

File xmlFile = new File("catalog.xml");
Document document = builder.parse(xmlFile);
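The three parsing steps can be tried in isolation before wiring them into XMLToPDF.java. The following is a minimal sketch, assuming an abbreviated catalog held in a string rather than the catalog.xml file; the class name CatalogParseSketch and the parse helper are hypothetical names chosen for illustration. It uses the parse(InputSource) overload instead of parse(File).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CatalogParseSketch {

    // Abbreviated copy of catalog.xml, inlined so the sketch is self-contained.
    static final String CATALOG_XML =
        "<catalog title=\"Oracle Magazine\" publisher=\"Oracle Publishing\">"
      + "<journal edition=\"September-October 2008\">"
      + "<article><title>Share 2.0</title><author>Alan Joch</author></article>"
      + "</journal>"
      + "<journal edition=\"March-April 2008\">"
      + "<article><title>Oracle Database 11g Redux</title><author>Tom Kyte</author></article>"
      + "</journal>"
      + "</catalog>";

    // Factory -> builder -> parse, the same sequence as in the text.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Covers ParserConfigurationException, SAXException, and IOException.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document document = parse(CATALOG_XML);
        Element catalog = document.getDocumentElement();
        System.out.println("Catalog: " + catalog.getAttribute("title"));
        NodeList journals = catalog.getElementsByTagName("journal");
        for (int i = 0; i < journals.getLength(); i++) {
            Element journal = (Element) journals.item(i);
            System.out.println("Edition: " + journal.getAttribute("edition"));
        }
    }
}
```

Inspecting the parsed Document in this way is a quick check that the input is well formed before it is handed to the transformer.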

Generating the XSL-FO document

Create a TransformerFactory object using the static method newInstance(). The factory class is used to create a Transformer object.

TransformerFactory transformerFactory = TransformerFactory.newInstance();

Create a Transformer object from the TransformerFactory object using the newTransformer() method.

File stylesheet = new File("catalog.xsl");
Transformer transformer = transformerFactory.newTransformer(new StreamSource(stylesheet));

Transform the Document object obtained from the example XML document using the method transform(Source, Result). The input XML document may be specified as a DOMSource, SAXSource, or StreamSource object. The transformation output may be specified as DOMResult, SAXResult, or StreamResult.

DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new File("catalog.fo"));
transformer.transform(source, result);
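Because the source and result types are interchangeable, the same transformation can also be run entirely in memory, which is convenient for experimenting with a stylesheet. The following is a minimal sketch under stated assumptions: the class name FoTransformSketch, the one-element input, and the one-template stylesheet are illustrative stand-ins for catalog.xml and catalog.xsl, and the output is a single fo:block fragment rather than a complete XSL-FO document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class FoTransformSketch {

    // Stand-in for catalog.xml: only the root element and its title attribute.
    static final String XML = "<catalog title=\"Oracle Magazine\"/>";

    // Stand-in for catalog.xsl: one template that emits a single fo:block.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
      + "<xsl:template match=\"/catalog\">"
      + "<fo:block>Catalog: <xsl:value-of select=\"@title\"/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // StreamSource in, StreamResult out: no DOM tree and no files involved.
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

The file-based version in the text is the one the rest of this article builds on; the in-memory variant only changes where the input comes from and where the output goes.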

The XSL-FO document, catalog.fo, generated from the example XML document is listed as follows:

<?xml version = '1.0' encoding = 'UTF-8'?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="simpleA4" page-height="29.7cm"
page-width="21cm" margin-top="2cm" margin-bottom="2cm"
margin-left="2cm" margin-right="2cm">
<fo:region-body/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simpleA4">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Catalog: Oracle Magazine</fo:block>
<fo:block font-size="16pt" font-weight="bold" space-after="5mm">
Publisher: Oracle Publishing</fo:block>
<fo:block font-size="10pt">
<fo:table table-layout="fixed">
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="4cm"/>
<fo:table-column column-width="5cm"/>
<fo:table-header>
<fo:table-row font-weight="bold">
<fo:table-cell>
<fo:block>Edition</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Title</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Author</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-header>
<fo:table-body>
<fo:table-row>
<fo:table-cell>
<fo:block>September-October 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Share 2.0</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Alan Joch</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>September-October 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Restrictions Apply</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Alan Joch</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>March-April 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Oracle Database 11g Redux</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Tom Kyte</fo:block>
</fo:table-cell>
</fo:table-row>
<fo:table-row>
<fo:table-cell>
<fo:block>March-April 2008</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Declarative Data Filtering</fo:block>
</fo:table-cell>
<fo:table-cell>
<fo:block>Steve Muench</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-body>
</fo:table>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
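Before passing a generated file to the FO processor, it can be useful to confirm that it is well formed and that its root element really is fo:root in the XSL-FO namespace. The following is a minimal sketch of such a check using the standard DOM API. The class name FoCheckSketch, its helper methods, and the abbreviated SAMPLE_FO string are assumptions made for illustration; the sample is well-formed XML but not a complete XSL-FO document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class FoCheckSketch {

    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Abbreviated stand-in for catalog.fo: a root element and two table rows.
    static final String SAMPLE_FO =
        "<fo:root xmlns:fo=\"" + FO_NS + "\">"
      + "<fo:table-body><fo:table-row/><fo:table-row/></fo:table-body>"
      + "</fo:root>";

    // Parses with namespace awareness switched on; the factory default is off,
    // and namespace-based lookups need it on.
    static Document parseNamespaceAware(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Not well-formed XML", e);
        }
    }

    // True when the document element is "root" in the XSL-FO namespace,
    // regardless of which prefix the document happens to use.
    static boolean isFoRoot(Document doc) {
        Element root = doc.getDocumentElement();
        return FO_NS.equals(root.getNamespaceURI())
            && "root".equals(root.getLocalName());
    }

    // Counts fo:table-row elements anywhere in the document.
    static int countRows(Document doc) {
        return doc.getElementsByTagNameNS(FO_NS, "table-row").getLength();
    }

    public static void main(String[] args) {
        Document doc = parseNamespaceAware(SAMPLE_FO);
        System.out.println("fo:root present: " + isFoRoot(doc));
        System.out.println("table rows: " + countRows(doc));
    }
}
```

For the full catalog.fo, the row count would include the header row as well as one row per article.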

Converting XSL-FO to PDF

In this section, we will convert the XSL-FO document generated in the previous section to a PDF document using the FOP driver. In the Java application XMLToPDF.java, import the FOP Driver class and the logger classes required to convert an XML document to a PDF document.

import org.apache.fop.apps.Driver;
import org.apache.avalon.framework.logger.Logger;
import org.apache.avalon.framework.logger.ConsoleLogger;

Creating the FOP driver

Create an FOP driver object using the constructor for the Driver class.

Driver driver=new Driver();

Create a logger with level setting LEVEL_INFO using the ConsoleLogger constructor.

Logger logger=new ConsoleLogger(ConsoleLogger.LEVEL_INFO);

Set the logger on the FOP driver using the setLogger method, and set the logger on the MessageHandler using the setScreenLogger method.

driver.setLogger(logger);
org.apache.fop.messaging.MessageHandler.setScreenLogger(logger);

Set the renderer for the FOP driver using the setRenderer method. For conversion to a PDF file, specify Driver.RENDER_PDF as the renderer.

driver.setRenderer(Driver.RENDER_PDF);

Generating the PDF document

Specify the XSL-FO document that is to be converted to a PDF document; here, it is the catalog.fo file generated from the XML document earlier in this article. The XSL-FO input is set on the Driver object using the setInputSource(InputSource) method.

File xslFOFile=new File("catalog.fo");
InputStream input=new FileInputStream(xslFOFile);
driver.setInputSource(new InputSource(input));

Specify an output PDF document using the setOutputStream(OutputStream) method.

File pdfFile=new File("catalog.pdf");
OutputStream output=new FileOutputStream(pdfFile);
driver.setOutputStream(output);

Run the FOP driver to generate a PDF document using the run method.

driver.run();
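The input and output streams opened for the driver should be closed whether or not the conversion succeeds. One way to arrange this is a try-with-resources block around the rendering call. The following is a minimal sketch of that pattern only: FoRenderer is a hypothetical stand-in interface, not part of the FOP API, and the renderer passed in the demo method merely copies bytes where a real one would configure and run the FOP driver as described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class StreamHandlingSketch {

    /** Hypothetical stand-in for the FOP driver calls; not part of the FOP API. */
    interface FoRenderer {
        void render(InputStream fo, OutputStream pdf) throws IOException;
    }

    /** Opens both files, runs the renderer, and closes both streams even on failure. */
    static void convert(Path foFile, Path pdfFile, FoRenderer renderer)
            throws IOException {
        try (InputStream input = Files.newInputStream(foFile);
             OutputStream output = Files.newOutputStream(pdfFile)) {
            renderer.render(input, output);
        }
    }

    /** Runs convert with a byte-copying renderer and returns the output size. */
    static long demo() {
        try {
            Path fo = Files.createTempFile("catalog", ".fo");
            Path pdf = Files.createTempFile("catalog", ".pdf");
            Files.write(fo, "<fo:root/>".getBytes(StandardCharsets.UTF_8));
            // Stand-in renderer: copies the input to the output unchanged.
            convert(fo, pdf, (in, out) -> {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            });
            long size = Files.size(pdf);
            Files.delete(fo);
            Files.delete(pdf);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Bytes written: " + demo());
    }
}
```

Keeping the stream handling in one method means the rendering code itself does not need explicit flush() or close() calls.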

Running the Java application

The Java application, XMLToPDF.java, used for converting an XML document to a PDF document is listed here with explanations:

  1. First, we add the package and import statements.
      package apachefop;
      import org.apache.fop.apps.Driver;
      import java.io.*;
      import org.apache.avalon.framework.logger.ConsoleLogger;
      import org.apache.avalon.framework.logger.Logger;
      import org.apache.fop.apps.FOPException;
      import org.xml.sax.InputSource;
      import javax.xml.transform.*;
      import javax.xml.transform.stream.StreamSource;
      import javax.xml.transform.stream.StreamResult;
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.ParserConfigurationException;
      import javax.xml.transform.dom.DOMSource;
      import org.xml.sax.SAXException;
      import org.w3c.dom.Document;
  2. We define the Java class XMLToPDF.
      public class XMLToPDF
      {
      public XMLToPDF()
      {
      }
  3. Next, we add the Java method xmlToFO to convert an XML document to an XSL-FO document.
      public void xmlToFO(){
      try{
      DocumentBuilderFactory factory =
      DocumentBuilderFactory.newInstance();
      File stylesheet = new File("catalog.xsl");
      File xmlFile = new File("catalog.xml");
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document document = builder.parse(xmlFile);
      TransformerFactory transformerFactory = TransformerFactory.newInstance();
      Transformer transformer = transformerFactory.newTransformer(new StreamSource(stylesheet));
      DOMSource source = new DOMSource(document);
      StreamResult result = new StreamResult(new File("catalog.fo"));
      transformer.transform(source, result);
      }catch(TransformerConfigurationException e)
      {System.err.println("TransformerConfigurationException: "+e.getMessage());}
      catch(TransformerException e)
      {System.err.println("TransformerException: "+e.getMessage());}
      catch(ParserConfigurationException e)
      {System.err.println("ParserConfigurationException: "+e.getMessage());}
      catch(IOException e){System.err.println("IOException: "+e.getMessage());}
      catch(SAXException e){System.err.println("SAXException: "+e.getMessage());}
      }
  4. We add the Java method foToPDF to convert an XSL-FO document to a PDF document.
      public void foToPDF(){
      try{
      Driver driver=new Driver();
      Logger logger=new ConsoleLogger(ConsoleLogger.LEVEL_INFO);
      driver.setLogger(logger);
      org.apache.fop.messaging.MessageHandler
      .setScreenLogger(logger);
      driver.setRenderer(Driver.RENDER_PDF);
      File xslFOFile=new File("catalog.fo");
      File pdfFile=new File("catalog.pdf");
      InputStream input=new FileInputStream(xslFOFile);
      driver.setInputSource(new InputSource(input));
      OutputStream output=new FileOutputStream(pdfFile);
      driver.setOutputStream(output);
      driver.run();
      output.flush();
      output.close();
      }catch(IOException e){System.err.println("IOException: "+e.getMessage());}
      catch(FOPException e){System.err.println("FOPException: "+e.getMessage());}
      }
  5. Finally, we add the main method, in which we create an instance of the XMLToPDF class and invoke the xmlToFO and foToPDF methods.
      public static void main(String[] argv){
      XMLToPDF fop=new XMLToPDF();
      fop.xmlToFO();
      fop.foToPDF();
      }
      }

Copy the XMLToPDF.java listing to the XMLToPDF.java class in the JDeveloper project ApacheFOP. To run the Java application, right-click on the Java application node in the Application Navigator, and select Run.


The output from the application run indicates that org.apache.xerces.parsers.SAXParser is used to parse the XML document. A formatting object gets built, the fonts get set, and a PDF document gets generated. Select View | Refresh to add the catalog.pdf document generated from the example XML document catalog.xml to the ApacheFOP project.


The catalog.pdf document generated is shown as follows:

[Illustration: the generated catalog.pdf document]

We did not include any graphics within the PDF generated, but Apache FOP supports the inclusion of graphics within PDF documents. Refer to http://xmlgraphics.apache.org/fop/0.94/graphics.html for graphics support in Apache FOP.

Summary

In this article, we successfully converted an example XML document, catalog.xml, to a PDF document, catalog.pdf, using the Apache FOP driver in JDeveloper 11g.
