Pentaho 5.0 Reporting by Example: Beginner's Guide

By Mariano García Mattío , Dario R. Bernabeu

About this book

Open source reporting tools such as PRD have become comparable in quality to their commercial counterparts; this is largely due to the market's marked tendency to choose open source solutions. PRD is a very powerful tool, and to take full advantage of it you need to pay attention to the important details.

Pentaho 5.0 Reporting by Example: Beginner’s Guide clearly explains the foundations and then puts those concepts into practice through step-by-step visual guides. Feeling confident with your newly acquired skill, you will be able to create your own professional reports, including graphics, formulas, sub-reports, and many other forms of data reporting.

Pentaho 5.0 Reporting By Example: Beginner’s Guide is a step-by-step guide to creating high-quality, professional reports. Starting with the basics, we will explore each feature thoroughly so that you can take full advantage of the power that Pentaho puts at your fingertips.

This book gives you the necessary resources to create a great variety of reports. You will be able to make reports that contain sub-reports, graphics, sparklines, and so on. You will also be able to parameterize your reports so that the final user can decide what information to visualize. You will be able to create your own stoplight-type indicators, drill down in your reports, and execute your reports from your own web application.

Pentaho 5.0 Reporting By Example: Beginner’s Guide lets you learn everything necessary to work seriously with one of the world’s most popular open source reporting tools. This book will guide you chapter by chapter through examples, graphics, and theoretical explanations so that you feel comfortable interacting with Pentaho Report Designer and creating your own reports.

Publication date:
August 2013


Chapter 1. What is Pentaho Report Designer?

In this chapter, we will explain what Pentaho Report Designer (PRD) is, and we will discuss its engine and its Graphical User Interface (GUI). We will also discuss the advantages that its open source license entails.

We will describe the two most common uses of PRD, which are embedding in Java projects and publishing to the Pentaho BA Server.

We will present the principal types of reports: Transactional Reporting, Tactical Reporting, Strategic Reporting, and Helper Reporting. As we will see throughout this book, PRD supports all of these types of reports.

Later, we will list the main features of PRD; this includes inserting charts (sparklines and JFreeCharts), a variety of export formats, parameterization, style expressions, crosstab reports, interactive reports, Java API, integration with the Pentaho suite, and abstraction layers.

We will make a brief review of the landmarks in the evolution of PRD and its different versions.

At the end of the chapter, we will display a series of PRD reports in order to show the scope of the potential capacities that PRD possesses.

Pentaho Reporting is a technology that allows you to design and build reports for the Pentaho BA platform and other application servers. Pentaho Report Designer (PRD) is a graphical tool that implements the report-editing function. The project from which PRD originated was called JFreeReport.

PRD is an open source tool licensed under the GNU Lesser General Public License (GNU LGPL). This license grants the four basic freedoms of free software, as does the GNU GPL. The L (Lesser) in LGPL indicates that this software can be used as part of or in combination with proprietary software, which provides greater flexibility for different licenses and software to coexist.


For more information about free software and the GNU project, visit

To read more about the GNU GPL and GNU LGPL licenses, visit

PRD contains a Java-based report engine that provides scalability, portability, and integration. Additionally, the editor's UI is implemented with Swing widgets, which give it a friendly, multiplatform look and feel. This UI is very intuitive, and it allows you to become familiar with the tools quickly.

PRD lets you create simple reports, wizard-based reports, advanced reports, reports with charts, subreports, parameterized reports, and others. Once a report has been created, PRD lets you export it in a variety of formats, such as PDF, Excel, HTML, and CSV, or preview it using Swing.

Since the beginning, PRD has benefitted from multiple contributions from the community. The community supporting this project is currently growing and becoming more stable, and it contributes regularly in the form of code, wikis, documentation, forums, bug reports, tutorials, and so on.

PRD has two typical uses as follows:

  • It can be embedded in Java projects in desktop and web applications. In this book, we will develop a good example of how to embed PRD reports in web applications.

  • It can, in a few steps, publish reports to the Pentaho BA Server to be used from there. It can, furthermore, be embedded in other application servers. These points will also be addressed in this book.


Types of reports

There are various categories that reports can be grouped into. From our perspective, the following categories are the most important:

  • Transactional reports: Data for these reports comes from transactions and their objective is to present data at a very detailed and granular level. This type of report is usually used in an organization's day-to-day business. Examples of this kind of report are sales receipts, purchase orders, and so on.

  • Tactical reports: Data for these reports comes from summaries of transactional data. The level of summary is low, usually not more than daily or weekly. This type of report contains information to support short-term decision making. For example, a stock inventory allows us to place orders to replace merchandise.

  • Strategic reports: These reports commonly use clean, reliable, and stable data sources, for example a data warehouse, and their goal is to create business information. This kind of report supports medium and long-term decision making and is usually highly summarized; it allows for parameterization and includes charts and subreports. For example, a seasonal analysis of sales lets us determine what marketing campaigns should be carried out at given periods of time.

  • Helper reports: Data for these reports comes from diverse origins and contains information that may not be normalized, including photos, images, and bar codes. This kind of report is not aimed at supporting decision making but serves a variety of interests. Examples of this kind of report are technical product descriptions, letterheads, ID cards, and so on.


Defining data

Data is an expression that describes some characteristic of an entity. For example, in saying that a box is black, we are specifying data (black) regarding a characteristic (color) of an entity (box).

Defining information

Information is obtained through data processing. Data can be processed through summary, classification, grouping, and ordering.
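
As a minimal sketch of this distinction, the following Java snippet turns raw sales rows (data) into per-line totals ordered from highest to lowest (information) by grouping, summarizing, and ordering. The Sale class, the totalsByLine method, and the sample values are hypothetical and exist only for illustration; they are not part of PRD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example: none of these names come from PRD.
class DataToInformation {

    /** One raw data row: a sale of some amount within a product line. */
    static final class Sale {
        final String productLine;
        final double amount;

        Sale(String productLine, double amount) {
            this.productLine = productLine;
            this.amount = amount;
        }
    }

    /** Groups sales by product line, sums each group, and orders the groups by total, highest first. */
    static Map<String, Double> totalsByLine(List<Sale> sales) {
        // Grouping and summary: accumulate one total per product line.
        Map<String, Double> totals = new TreeMap<>();
        for (Sale sale : sales) {
            totals.merge(sale.productLine, sale.amount, Double::sum);
        }
        // Ordering: sort the groups by their total in descending order.
        List<Map.Entry<String, Double>> entries = new ArrayList<>(totals.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()));
        Map<String, Double> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : entries) {
            ordered.put(entry.getKey(), entry.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
                new Sale("Trains", 100.0),
                new Sale("Ships", 250.0),
                new Sale("Trains", 50.0));
        System.out.println(totalsByLine(sales));
    }
}
```

Each Sale on its own is only data; the ordered totals are information because they answer a question, namely which product line sells the most.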


Main features of Pentaho Report Designer

The following are some of PRD's principal characteristics:

  • Reporting algorithm: PRD avoids compiling reports, a method that other reporting tools use, and instead combines the report layout with the data as it is acquired. Initially, the algorithm determines how to separate the data into groups, subgroups, and so on, and calculates the height, width, position, and style of the elements (text, images, and so on); later, the data is placed where it belongs in order to obtain the desired output.

  • Diverse data sources: This includes JDBC (this allows access to most databases), Pentaho Metadata, Pentaho Data Integration, OLAP, XML, in-line table, Sequence Generator, Query Scripting, Java Method Invocation, Hibernate, Open ERP, CDA, and so on.

  • Diverse output formats: These are PDF, Excel, Excel 2007, HTML, RTF, CSV, XML, and Text. PRD renders reports with high image quality.

  • Insertable objects: PRD lets you add text fields, labels, images, charts, subreports, shapes, lines, sparklines, hyperlinks, bar codes, and other objects to your reports. Objects are inserted in the UI by simply dragging and dropping.

  • Charts: There are two categories of charts that can be added to reports in PRD; they are as follows:

    • Sparklines: These are inline charts. PRD supports bar, line, and pie sparklines.

    • JFreeChart charts: These are traditional charts. PRD supports bar, line, area, pie, multipie, barline, ring, bubble, scatter-plot, XY-bar, XY-line, XY-area, extended XY-line, waterfall, radar, and XY-area-line charts.

  • Parameterization: PRD allows you to define parameters that can be used in different parts of the report; for example, as a filter for a SQL query, as the text of a label, as part of a formula, and as a style attribute, among others. Regarding the type of presentation, PRD provides the following widgets to supply parameter values: drop-down menus, simple-value lists, multivalue lists, radio buttons, checkboxes, single selection buttons, multiselection buttons, textboxes, text areas, and date pickers.

  • Formulas and style expressions: PRD allows you to assign a style property according to the value of a formula, expression, or fixed value. For example, if the value of the field "quantity" is greater than 50, you can make the background color of a given shape green. You can also use formulas to create new fields and calculate their values.

  • Crosstab report: PRD lets you create powerful reports based on cross tabs using a simple wizard.

  • Interactive reports: PRD lets you add interactivity to your reports, making it possible to expand/collapse groups, add hyperlinks to other reports, and so on.

  • Wizard: PRD includes a wizard that lets you create a report through simple and intuitive steps.

  • Publication: From the PRD UI, you can publish directly to the Pentaho BA Server.

  • Java API: PRD includes a very extensive API that lets you execute, create, and modify reports without using the UI.

  • Extendibility: PRD lets you add new functionality through the incorporation of plugins.

  • Stylesheet support: PRD supports internal storage or external access to CSS3 stylesheets.

  • Integration with the Pentaho suite: PRD is easily integrated with the other tools in the Pentaho suite, including Pentaho Data Integration (PDI) and C-Tools (CDF and CDA).

  • Pentaho Data Integration: PDI implements a transformation step that lets you execute and parameterize PRD reports; this allows you to implement mailing and report bursting, among many other possibilities.

  • Community Dashboard Framework (CDF): This implements a component for working with PRD reports. Community Data Access (CDA) can be added to PRD as a plugin, so that our reports can get their data from a CDA connection.

  • Abstraction layers: When an end user creates a report, they usually see the process as a whole. The process of creating a report in PRD can be separated into three parts: the selection of the data source to use, the design/layout of the report, and the final presentation format. PRD defines two layers of abstraction between these three parts, allowing each part to be independent of the others and the report to be adapted to different contexts and needs. For example, if a report is created based on a data source developed in MySQL and is later published to a Pentaho BA Server connected to a PostgreSQL production database, the report will be adapted to this context and can be executed without change.

The following illustration shows the report before it is published to the Pentaho BA Server, using MySQL and a PDF presentation:

The following diagram shows the report after it is published to the Pentaho BA Server, using PostgreSQL and an HTML presentation:
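
The separation into three parts can be sketched in plain Java. This is only an illustration of the idea under stated assumptions: the DataSource and OutputFormat interfaces, the layout and run methods, and the sample rows are hypothetical and are not PRD classes or PRD's actual design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the three parts of a report; these types are not PRD classes.
class AbstractionLayers {

    /** First part: where the rows come from (for example MySQL or PostgreSQL). */
    interface DataSource {
        List<String[]> rows();
    }

    /** Third part: how the laid-out lines are presented (for example PDF or HTML). */
    interface OutputFormat {
        String render(List<String> lines);
    }

    /** Second part: the layout, which knows neither the database nor the output format. */
    static List<String> layout(DataSource source) {
        List<String> lines = new ArrayList<>();
        for (String[] row : source.rows()) {
            lines.add(row[0] + ": " + row[1]);
        }
        return lines;
    }

    /** Runs the same layout against any combination of data source and output format. */
    static String run(DataSource source, OutputFormat format) {
        return format.render(layout(source));
    }

    public static void main(String[] args) {
        // Stand-ins for a development and a production database with the same schema.
        DataSource development = () -> Arrays.asList(
                new String[] {"Trains", "10"}, new String[] {"Ships", "4"});
        DataSource production = () -> Arrays.asList(
                new String[] {"Trains", "120"}, new String[] {"Ships", "57"});

        // Stand-ins for two presentation formats.
        OutputFormat plainText = lines -> String.join("\n", lines);
        OutputFormat html = lines -> "<ul><li>" + String.join("</li><li>", lines) + "</li></ul>";

        System.out.println(run(development, plainText));
        System.out.println(run(production, html));
    }
}
```

Because layout depends only on the two interfaces, swapping the data source or the output format does not require touching the layout code, which is the property the abstraction layers are meant to provide.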


Landmarks in the evolution of PRD are as follows:

  • 2002: David Gilbert (author of JFreeChart) implements the first version of JFreeReport. Soon after, Thomas Morgner becomes the lead developer. The JFreeReport project is very successful and many people begin to contribute to it.

  • 2006: Thomas joins Pentaho, changing the name to Pentaho Reporting. Thomas becomes a developer for the Pentaho Reporting Engine and other suite tools.

  • January 2006: Pentaho announces that Pentaho Report Designer (PRD) Wizard is available for creating reports. Mike D'Amour becomes the initial author of this wizard.

  • June 2006: Martin Schmid contributes the first version of PRD to the community. From this point forward, the designer is developed in parallel with the engine.

  • November 2006: (BA Server, Version 1.6) Pentaho Reporting is integrated with the Pentaho Metadata Engine to create ad hoc reports.

  • April 2007: The Pentaho team joins to distribute a reports solution for the OOo database tool. This project is led by Thomas Morgner and comes to be known as the Pentaho Reporting Flow Engine.

  • December 2007: The development of version 1.0 of the PRD Classic Engine begins.

  • March 2008: A native data source is added for Pentaho Data Integration.

  • June 2008: The Olap4j data source is added; it provides connectivity to Mondrian and other OLAP sources and the possibility of executing MDX queries. Implementing cross tabs is discussed.

  • June 2009: PRD supports Rich Text Format (RTF) and script-based data sources are added (Scriptable Datasource).

  • June 2010: The Table of Contents component is added.

  • October 2010: The Drill Linking characteristic is added.

  • November 2010: Execution environment information is available via ENV Fields.

  • December 2010: The development of the Community Data Access (CDA) data source begins.

  • January 2011: PRD cache development begins.

  • July 2011: Sparklines are added.

  • November 2011: Version Checker is added.

  • April 2011: 10 years since the creation of JFreeReport.

The evolution of the different versions of PRD, as recorded on SourceForge, is as follows:

  • 2007 – November: Version 1.6

  • 2008 – June: Version 1.7

  • 2009 – January: Version 2.0

  • 2009 – August: Version 3.0

  • 2009 – November: Version 3.5

  • 2010 – March: Version 3.6

  • 2010 – December: Version 3.7

  • 2011 – March: Version 3.8

  • 2012 – May: Version 3.9

  • 2013 – Version 5.0


Examples of typical reports

In the following sections, we present a series of PRD reports typically included as examples of Pentaho solutions and in the sample reports of PRD.

The buyer report

The buyer report is found at the following location in the top menu:

Help | Sample Reports | Operational Reports | Buyer Report

This report presents an analysis of the products belonging to each product line grouped by the vendor. For each product, it shows a sparkline with information about the sales of this product in the last three years. This report allows the end user to select the product line to be displayed.

As can be seen in the top part of the report, there are two selectors for passing parameters to the report:

The report will then show the product line indicated in the Choose a Line: parameter, in this case Trains, and its visualization format will be the one chosen in Output Type, in this case HTML (Paginated).

To show information about each product, this report uses sparklines, text, numbers, and monetary values (in dollars):

Also, an image has been placed in the report title (see the first screenshot of this section).

The income statement

The income statement report is found at the following location in the top menu:

Help | Sample Reports | Financial Reports | Income Statement

This report shows the current income statement of the company and shows calculations of totals and subtotals, grouping each item in its respective category (Revenue, Cost of Goods, and so on). The default value of the Output Type parameter has been set to PDF, so when the report is executed, it will download the file Income Statement.pdf. If you prefer to view the report in another format, you only have to change the value of the Output Type parameter.

This report has an image as a background.

The inventory list

The inventory list report is found at the following location in the top menu:

Help | Sample Reports | Operational Reports | Inventory List

This report shows information about the stock of products and allows the end user to select one or more product lines that need to be shown.

This report uses bar codes:

It uses hyperlinks (drill through) to other reports:

If you click on the values in the field SKU, a new tab will open displaying a report with additional information about this SKU. In this case, if we click on S12_1099, a report shown in the following screenshot will appear:

It also uses links to web pages:

If you click on the values in the field Name, a new web browser tab opens with a web page that searches Google Images for that name.

This report uses conditional formatting, that is, the background color of the field On Hand is determined according to certain values and conditions. This makes it possible to see the status of each item at a glance through simple indicators: the indicator is green when the values are good, yellow when they are acceptable, and red when they are bad.
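
The logic behind such a stoplight indicator can be sketched in a few lines of Java. The thresholds, the color values, and the names StoplightIndicator and backgroundFor are assumptions made for this illustration; the sample report defines its own conditions inside PRD.

```java
// Hypothetical sketch: thresholds and colors are assumptions, not taken from the sample report.
class StoplightIndicator {

    static final String GREEN = "#00FF00";
    static final String YELLOW = "#FFFF00";
    static final String RED = "#FF0000";

    /**
     * Returns the background color for a stock level.
     * Values at or above goodFrom are green, values at or above acceptableFrom are yellow,
     * and everything below acceptableFrom is red.
     */
    static String backgroundFor(int onHand, int acceptableFrom, int goodFrom) {
        if (onHand >= goodFrom) {
            return GREEN;
        }
        if (onHand >= acceptableFrom) {
            return YELLOW;
        }
        return RED;
    }

    public static void main(String[] args) {
        int[] samples = {7500, 2000, 150};
        for (int onHand : samples) {
            // Assumed thresholds: acceptable from 1000 units, good from 5000 units.
            System.out.println(onHand + " -> " + backgroundFor(onHand, 1000, 5000));
        }
    }
}
```

In PRD the same decision would normally be written as a style expression on the background color of the field rather than as Java code.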


The invoice report

The invoice report is found at the following location in the top menu:

Help | Sample Reports | Production Reports | Invoice

This report shows all the invoices that have been issued to customers. Each invoice is presented on a separate page, and the end user has the ability to select the client they want to analyze.

This report uses sub-reports to show the Payment History.

Product Sales

The product sales report is found at the following location in the top menu:

Help | Sample Reports | Operational Reports | Product Sales

This report shows information about sales made to the customers. On the left-hand side, a pie chart shows the percentage of sales made to each customer and on the right-hand side, a sub-report shows a list of the amount of sales in detail.

It is important to point out that the parameter selectors are related, that is, once a value is chosen for Line, only products belonging to that line will be shown in the options of the selector Product:
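
The dependency between the two selectors can be sketched as a simple filter. This is a hypothetical illustration: the CascadingParameters class, the productOptions method, and the product names are invented for the example, and in PRD the second selector would typically be fed by a query that takes the first parameter as a filter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of related (cascading) parameter selectors; not PRD code.
class CascadingParameters {

    /** Returns the products offered by the Product selector once a line has been chosen. */
    static List<String> productOptions(Map<String, String> productToLine, String selectedLine) {
        List<String> options = new ArrayList<>();
        for (Map.Entry<String, String> entry : productToLine.entrySet()) {
            // Keep only the products that belong to the chosen line.
            if (entry.getValue().equals(selectedLine)) {
                options.add(entry.getKey());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Invented catalog: product name mapped to its product line.
        Map<String, String> catalog = new LinkedHashMap<>();
        catalog.put("Steam Locomotive", "Trains");
        catalog.put("Cargo Ship", "Ships");
        catalog.put("Freight Wagon", "Trains");

        System.out.println(productOptions(catalog, "Trains"));
    }
}
```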

Top N Customers

This report is found at the following location in the top menu:

Help | Sample Reports | Operational Reports | Top N Customers

This report shows an analysis of the top N customers; here N can be defined via a parameter. It also displays each customer's sales amount and percentage of the total sales, and uses a table and two bar charts.
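
The calculation behind such a report can be sketched as follows. The TopNCustomers class, the Row type, the topN method, and the sample figures are hypothetical, and the sketch assumes that the percentage is taken over the sales of all customers, not only the top N; the sample report obtains its figures from its own query.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a top N calculation; not taken from the sample report.
class TopNCustomers {

    /** One output row: a customer, the sales amount, and the share of total sales. */
    static final class Row {
        final String customer;
        final double amount;
        final double percentOfTotal;

        Row(String customer, double amount, double percentOfTotal) {
            this.customer = customer;
            this.amount = amount;
            this.percentOfTotal = percentOfTotal;
        }
    }

    /** Returns the n customers with the highest sales, each with its percentage of all sales. */
    static List<Row> topN(Map<String, Double> salesByCustomer, int n) {
        // Assumption: the percentage is relative to the sales of all customers.
        double total = 0.0;
        for (double amount : salesByCustomer.values()) {
            total += amount;
        }
        // Order customers by sales amount, highest first.
        List<Map.Entry<String, Double>> entries = new ArrayList<>(salesByCustomer.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()));

        List<Row> rows = new ArrayList<>();
        int limit = Math.max(0, Math.min(n, entries.size()));
        for (Map.Entry<String, Double> entry : entries.subList(0, limit)) {
            double percent = total == 0.0 ? 0.0 : 100.0 * entry.getValue() / total;
            rows.add(new Row(entry.getKey(), entry.getValue(), percent));
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Double> sales = new LinkedHashMap<>();
        sales.put("Customer A", 50.0);
        sales.put("Customer B", 20.0);
        sales.put("Customer C", 30.0);
        for (Row row : topN(sales, 2)) {
            System.out.println(row.customer + " " + row.amount + " " + row.percentOfTotal + "%");
        }
    }
}
```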

HTML actions

This report is found at the following location in the top menu:

Help | Sample Reports | Advanced | HTML Actions

This report shows a list of products grouped by the vendor and provides information about their code, name, purchase price, and so on. Furthermore, it lets us expand or collapse the different groups. That is, if you click on + Autoart Studio Design, all the child nodes collapse.

This report lets us search for the models of the cars we choose in Google Images. For example, choose 1966 Shelby Cobra 427 S/C as shown in the following screenshot:

Click on the get values button. All the auto models will be listed, as shown in the following screenshot:

Click on the Call Web Service button; a new web browser window will open with a web page that runs a Google web search.



In this chapter, we saw that Pentaho Report Designer (PRD) is a very powerful tool for report editing and that it is licensed under the GNU LGPL. We also saw that it has a Java-based report engine and that the editor's UI is implemented with Swing widgets.

We saw that it can be embedded in Java projects in desktop and web applications, and that it can publish reports to the Pentaho BA Server.

We discussed the different types of reports, such as Transactional Reports and Tactical Reports, and their principal characteristics. We also discussed that what makes each of these reports different from others is their data source, their granularity, and their final goal.

We detailed the principal characteristics of PRD through which we were able to show its robustness, flexibility, and interactivity.

We listed each landmark in the development of PRD, following its evolution with each new contribution and functionality. We also showed in a small graph how PRD versions advanced over time.

By the end of this chapter, we are able to appreciate the quantity and variety of reports that PRD lets us create.

In the next chapter, we will look at the system requirements for running PRD. We will download, install, and configure PRD 5.0. Furthermore, we will download and install a sample database.

About the Authors
  • Mariano García Mattío

    Mariano García Mattío is a systems engineer for the IUA and specialist in distributed systems and services for the Facultad de Matemática Astronomía y Física (Faculty of Mathematics Astronomy and Physics) FaMAF UNC. He is an associate professor of databases 1, databases 2, and advanced database systems at the IUA, school of engineering; database engines at the IUA, school of administration; and object-oriented programming paradigm and distributed systems at the IUA's master in embedded systems. He is the teacher in charge of assignments for applied databases at the UCC. Also, Mariano is the co-director of the research project on new information and communication technologies at the UCC and co-director of the research project on networks monitoring and communication systems at the IUA. He is also a member of the Virtual Laboratories research project at the IUA and co-founder of eGluBI. He is the coordinator of the social network Open BI Network. He specializes in Java SE and Java EE technologies, node.js, administration and design of databases, and OSBI. His blog site is

  • Dario R. Bernabeu

    Dario R. Bernabeu is a systems engineer at the Instituto Universitario Aeronáutico (University Aeronautic Institute) IUA. He is the co-founder of eGluBI. He specializes in development and implementation of OSBI solutions (Open Source Business Intelligence), project management, analysis of requirements/needs, deployment and configuration of BI solutions, design of data integration processes, data warehouse modelling, design of multidimensional cubes and business models, development of ad hoc reports, advanced reports, interactive analysis, dashboards, and so on. A teacher, researcher, geek, and open source software enthusiast, his most notable publication is "Data Warehousing: Research and Concept Systematization – HEFESTO: Methodology for the Construction of a DW". Being the coordinator of the social network Open BI Network, he makes many contributions to various forums, wikis, blogs, and so on. You can find his blog site at
