You're reading from  Instant Pentaho Data Integration Kitchen

Product type: Book
Published in: Jul 2013
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781849696906
Edition: 1st Edition
Author (1)
Sergio Ramazzina

Sergio Ramazzina is an experienced software architect/trainer with more than 25 years of experience in the IT field. He has worked on a broad number of projects for banks and major Italian companies and has designed complex enterprise solutions in Java, JavaEE, and Ruby. He started using Pentaho products from the very beginning in late 2003. He gained thorough experience by deploying Pentaho as an open source BI solution, standalone or deeply integrated in other applications as the analytical engine of choice. In 2009, due to his experience in the Java/JavaEE world and appreciation for the open source world and its main ideas, he began participating actively as a contributor to some of the Pentaho projects such as JPivot, Saiku, CDF, and CDA and rose to the Pentaho Active Contributor level. At that time, he started participating as a BI architect and Pentaho expert on a wide number of projects where open source BI and Pentaho were the main players. In late 2010, he founded Serasoft, a young Italian consulting firm that specializes in delivering high value open source Business Intelligence solutions. With the team in Serasoft, he shared his passion and experience in designing and delivering highly innovative enterprise solutions to help users make their work more effective. In July 2013, he published his first book, Instant Pentaho Data Integration Kitchen, Packt Publishing. He is also passionate about skiing, tennis, and photography, and he loves his young daughter, Camilla, very much. You can follow him on Twitter at @sramazzina. You can also look at his profile on LinkedIn at http://it.linkedin.com/in/sramazzina/.

The important role of icon and color indicators


Kettle jobs and transformations are full of visual indicators that help us quickly understand why and when a flow follows a given path, whether a task or step is running, and whether it terminated successfully. Paying attention to these indicators helps a lot in checking that everything has gone, or will go, as intended.

Let's see some examples. While designing a transformation in Spoon, you can see that the connections between the steps (called hops) take on different colors or representations in different situations. The following screenshot shows a Data Validator step with a set of inputs and two outputs:

We can note two interesting things here:

  1. One of the two output hops is drawn as a bold red dotted line, and this line has a red icon with an "x" sign on it. The red dotted line represents a path that carries the rows that caused an error. In this case, the error is the failure of the validation rule in the Data Validator step. The red "x" icon is a graphical sign that reinforces the information that this is an error path.

  2. One of the input hops has a circular blue icon with a lowercase "i" character in it. This icon indicates that the hop connects a source step whose rows are an input to configure a parameter of the target step. In this case, the rows that come from the Text File Input step named Territories.txt configure the set of "allowed territories" used to validate the incoming rows.
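To make the two hops above more concrete, here is a minimal Python sketch of the idea. It is not Kettle code: the function name `validate_rows`, the field name `territory`, and the sample values are all assumptions made for illustration. The list of allowed territories plays the role of the rows arriving through the info hop, and the second returned list plays the role of the error path.

```python
def validate_rows(rows, allowed_territories):
    """Split rows into (valid, errors), loosely mimicking a validation
    step that has an error hop. Hypothetical helper, not a Kettle API."""
    allowed = set(allowed_territories)  # values supplied by the "info" hop
    valid, errors = [], []
    for row in rows:
        if row.get("territory") in allowed:
            valid.append(row)  # follows the normal output hop
        else:
            # follows the error hop (the bold red dotted line)
            errors.append({**row, "error": "territory not allowed"})
    return valid, errors


# Hypothetical sample data standing in for Territories.txt and the main input
allowed_territories = ["EMEA", "APAC"]
rows = [
    {"customer": "A", "territory": "EMEA"},
    {"customer": "B", "territory": "Mars"},
    {"customer": "C", "territory": "APAC"},
]

valid, errors = validate_rows(rows, allowed_territories)
print("valid:", [r["customer"] for r in valid])
print("errors:", [r["customer"] for r in errors])
```

The point of the sketch is only that the info rows configure the check, while the data rows are the ones being routed.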

We can summarize the meaning behind the various hop colors as shown in the following table:

| Color | Meaning |
| --- | --- |
| Green | The hop distributes rows; if multiple hops are leaving a step, rows of data will be distributed evenly to all the target steps |
| Red | The hop copies rows; if multiple hops are leaving a step, all rows of data will be copied to all the target steps |
| Yellow | The hop provides information for the step and distributes rows |
| Gray | The hop is disabled |
| Black | The hop has a named target step |
| Blue | Candidate hop using middle button and drag |
| Red (bold dotted line) | The hop is used for carrying rows that caused errors in the source step(s) |
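The difference between distributing and copying rows is easy to miss, so here is a small Python sketch of the two behaviors. It is only an illustration: the function names are hypothetical, and round-robin is assumed as one simple way to distribute rows evenly.

```python
def distribute_rows(rows, n_targets):
    """Hand out rows to the targets in turn (assumed round-robin),
    so each row reaches exactly one target step."""
    targets = [[] for _ in range(n_targets)]
    for i, row in enumerate(rows):
        targets[i % n_targets].append(row)
    return targets


def copy_rows(rows, n_targets):
    """Give every target step its own full copy of all the rows."""
    return [list(rows) for _ in range(n_targets)]


rows = [1, 2, 3, 4, 5]
distributed = distribute_rows(rows, 2)
copied = copy_rows(rows, 2)
print("distributed:", distributed)
print("copied:", copied)
```

With distribution, the targets together see each row once; with copying, every target sees every row.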

Other kinds of icons appear in other situations while designing transformations. Let's see another example, this time using a Filter step:

As you can see, the Filter step named Filter rows with different countries has two different output hops with two different icons, marked in the screenshot by the bold red letters A and B:

  • The icon indicated by the bold red letter A means that the hop is followed only when the condition specified in the Filter step has been satisfied (the result of the evaluation is true)

  • The icon indicated by the bold red letter B means that the hop is followed only when the condition specified in the Filter step has not been satisfied (the result of the evaluation is false)
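The true and false hops of a Filter step can be pictured with the following Python sketch. The function `filter_rows`, the field `country`, and the condition itself are assumptions for illustration; the actual condition configured in the step is not shown in the text.

```python
def filter_rows(rows, condition):
    """Route each row to the 'true' or the 'false' output,
    depending on how the condition evaluates for that row."""
    true_rows, false_rows = [], []
    for row in rows:
        if condition(row):
            true_rows.append(row)   # hop A: condition satisfied
        else:
            false_rows.append(row)  # hop B: condition not satisfied
    return true_rows, false_rows


# Hypothetical rows and condition
rows = [
    {"name": "Anna", "country": "Italy"},
    {"name": "Bob", "country": "USA"},
    {"name": "Carla", "country": "Italy"},
]
true_rows, false_rows = filter_rows(rows, lambda r: r["country"] != "Italy")
print("true hop:", [r["name"] for r in true_rows])
print("false hop:", [r["name"] for r in false_rows])
```

Every incoming row leaves through exactly one of the two hops, which is what distinguishes this from the copy behavior described in the table.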

Similar concepts apply to a job's hops. Let's have a look at the following example:

Because jobs are process orchestrators, a job hop is just an element of flow control. Hops link job entries and, based on the result of the previous job entry, determine what happens next. Icons indicate the conditions under which a particular hop is followed:

  • Unconditional – The icon identified by the red bold letter A in the screenshot represents this condition. It specifies that the next job entry will be executed regardless of the result of the originating job entry.

  • Follow when result is true – The icon identified by the red bold letter B in the screenshot represents this condition. It specifies that the next job entry will be executed only when the result of the originating job entry is true.

  • Follow when result is false – The icon identified by the red bold letter C in the screenshot represents this condition. It specifies that the next job entry will be executed only when the result of the originating job entry is false.
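The three conditions above can be sketched as a tiny job runner in Python. This is a toy model, not the Kettle engine: the `Hop` class, the `run_job` function, and the entry names are all hypothetical, and the sketch assumes a job without loops where each entry simply returns true or false.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hop:
    source: str
    target: str
    # None = unconditional, True = follow when result is true,
    # False = follow when result is false
    condition: Optional[bool]


def run_job(entries: dict[str, Callable[[], bool]], hops: list[Hop], start: str) -> list[str]:
    """Run entries starting from `start`, following only the hops whose
    condition matches the result of the entry they leave from."""
    executed = []
    pending = [start]
    while pending:
        name = pending.pop(0)
        result = entries[name]()
        executed.append(name)
        for hop in hops:
            if hop.source == name and (hop.condition is None or hop.condition == result):
                pending.append(hop.target)
    return executed


# Hypothetical job: the load entry fails, so only the failure branch runs
entries = {
    "start": lambda: True,
    "load": lambda: False,
    "report": lambda: True,
    "notify_failure": lambda: True,
}
hops = [
    Hop("start", "load", None),            # unconditional
    Hop("load", "report", True),           # follow when result is true
    Hop("load", "notify_failure", False),  # follow when result is false
]
executed = run_job(entries, hops, "start")
print(executed)
```

Changing the `load` entry to return true would send the flow to `report` instead, while the unconditional hop from `start` is followed in both cases.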

Again, hops in jobs assume different colors depending on the properties and the state of the job at that particular point:

  • The hop connecting two entries is black when the target entry is executed regardless of the result of the source entry (unconditional)

  • The hop connecting two entries is green when the target entry is executed only if the result of the source entry is successful (result is true)

  • The hop connecting two entries is red when the target entry is executed only if the result of the source entry is unsuccessful (result is false)
