
You're reading from  Pentaho Data Integration Quick Start Guide

Product type: Book
Published in: Aug 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789343328
Edition: 1st Edition
Author: María Carina Roldán

María Carina Roldán was born in Argentina and has a bachelor's degree in computer science. She started working with Pentaho back in 2006 and has spent the years since developing BI solutions, mainly as an ETL specialist, for different companies around the world. Currently, she lives in Buenos Aires and works as an independent consultant. Carina is the author of Learning Pentaho Data Integration 8 CE, published by Packt in December 2017. She has also authored other books on Pentaho, all of them published by Packt.

Filtering rows


Until now, we have been enriching our dataset with new data. Now we will do the exact opposite: we will discard unwanted information. We already know how to keep a subset of fields and discard the rest: we do it by using the Select values step. Now it's time to keep only the rows that we are interested in.

Filtering rows upon conditions

To demonstrate how to filter rows with PDI, we will work again with the survey files. This time, we will read a set of files and keep only the locations with more than three rooms. The main step we will be using is the Filter rows step. Go through the following steps:

  1. Create a transformation and use a Text file input step to read the files containing the surveys carried out in 2015.

Note

You are free to read a different set of files, but if you read this set, you will be able to compare your results with the results shown in the following screenshots.

  2. After the Text file input step, add a Filter rows step. You will find it in the Flow folder.
  3. In...
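
Outside of PDI, the idea behind the Filter rows step can be illustrated with a few lines of Python. This is only a minimal sketch of the concept, not how PDI implements the step: the function name `filter_rows`, the field names `location` and `rooms`, and the sample rows are all hypothetical and are not taken from the survey files. The sketch evaluates a condition on each row and sends the row to one of two outputs, in the same spirit as the true and false outputs of the Filter rows step.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def filter_rows(
    rows: Iterable[Row], condition: Callable[[Row], bool]
) -> Tuple[List[Row], List[Row]]:
    """Split rows into those that meet the condition and those that do not."""
    kept: List[Row] = []
    discarded: List[Row] = []
    for row in rows:
        # Each row goes to exactly one of the two outputs.
        if condition(row):
            kept.append(row)
        else:
            discarded.append(row)
    return kept, discarded


# Hypothetical sample data; the real survey files may use other field names.
survey_rows = [
    {"location": "A", "rooms": 2},
    {"location": "B", "rooms": 4},
    {"location": "C", "rooms": 3},
    {"location": "D", "rooms": 5},
]

# Keep only the locations with more than three rooms.
kept, discarded = filter_rows(survey_rows, lambda row: row["rooms"] > 3)

for row in kept:
    print(row["location"], row["rooms"])
```

Note that "more than three" is a strict comparison, so a location with exactly three rooms ends up in the discarded list.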