Getting Started with Talend Open Studio for Data Integration

eBook: $26.99
Formats: PDF, PacktLib, ePub, and Mobi
Print + free eBook + free PacktLib access to the book: $71.98    Print cover: $44.99
  • Develop complex integration jobs without writing code
  • Go beyond “extract, transform and load” by constructing end-to-end integrations
  • Learn how to package your jobs for production use

Book Details

Language : English
Paperback : 320 pages [ 235mm x 191mm ]
Release Date : November 2012
ISBN : 1849514720
ISBN 13 : 9781849514729
Author(s) : Jonathan Bowen
Topics and Technologies : All Books, Big Data and Business Intelligence, Data, Open Source

Table of Contents

Chapter 1: Knowing Talend Open Studio
Chapter 2: Working with Talend Open Studio
Chapter 3: Transforming Files
Chapter 4: Working with Databases
Chapter 5: Filtering, Sorting, and Other Processing Techniques
Chapter 6: Managing Files
Chapter 7: Job Orchestration
Chapter 8: Managing Jobs
Chapter 9: Global Variables and Contexts
Chapter 10: Worked Examples
Appendix A: Installing Sample Jobs and Data
Appendix B: Resources
  • Chapter 1: Knowing Talend Open Studio
    • What Talend Open Studio is
      • Use cases
      • History of Talend Open Studio
      • Benefits of Talend Open Studio
    • Installing Talend Open Studio
      • Prerequisites
      • Installation guide
    • Other useful software
      • Text editor
      • MySQL
    • Sample jobs and data
    • Summary
  • Chapter 2: Working with Talend Open Studio
    • Studio definitions
    • Starting the Studio
    • Tour of the Studio
      • The Repository
      • The design workspace
      • The Palette
      • Configuration tabs
      • Outline and Code panels
    • Creating a new project
    • Creating an example job
    • Metadata
    • Summary
  • Chapter 3: Transforming Files
    • Transforming XML to CSV
    • Transforming CSV to XML
    • Maps and expressions
    • Advanced XML output for complex XML structures
    • Working with multi-schema XML files
    • Enriching data with lookups
    • Extracting data from Excel files
      • Extracting data from multiple sheets
      • Joining data from multiple sheets
    • Summary
  • Chapter 4: Working with Databases
    • Database metadata
    • Extracting data from a database
    • Extracts from multiple tables
      • Joining within the database component
      • Joining outside the database component
    • Writing data to a database
    • Database to database transfer
    • Modifying data in a database
    • Dynamic database lookup
    • Summary
  • Chapter 6: Managing Files
    • Managing local files
      • Copying files
      • Copying and removing files
      • Renaming files
      • Deleting files
      • Timestamping a file
      • Listing files in a directory
      • Checking for files
      • Archiving and unarchiving files
    • FTP file operations
      • FTP Metadata
      • FTP Put
      • FTP Get
      • FTP File Exist
      • FTP File List and Rename
      • Deleting files on an FTP server
    • Summary
  • Chapter 7: Job Orchestration
    • What is a subjob
    • A simple subjob
    • On Subjob Error
    • On Component OK
    • Run If
    • Jobs as subjobs
    • Iterating and looping
      • Iterate connections
      • ForEach loop
      • Loop "n" times
      • Infinite loop
    • Duplicating and merging dataflows
      • Duplicating data
      • Merging data
    • Summary
  • Chapter 8: Managing Jobs
    • Job versions
    • Exporting and importing jobs
      • Exporting jobs
        • Exporting a project
        • Exporting a job
        • Exporting a job for execution
      • Importing jobs
        • Importing a project
        • Importing a job
    • Scheduling jobs
    • Summary
  • Chapter 9: Global Variables and Contexts
    • Global variables
      • Studio global variables
      • User defined global variables
    • Contexts
      • Embedded context variables
      • Repository context variables
      • External context variables
      • Complex context variables
      • Using embedded, repository, and external contexts
    • Summary
  • Chapter 10: Worked Examples
    • Product catalog
      • Data import from the ERP system
      • Data import from Fabric Fashions
      • Data import from Runway Collections
    • Product inventory data
    • Order file processing
    • Order status updates
    • Automating processes
      • E-mailing daily sales
      • Automating product visibility
    • Summary

                          Jonathan Bowen

                          Jonathan Bowen is an E-commerce and Retail Systems Consultant and has worked in and around the retail industry for the past 20 years. His early career was in retail operations, then in the late 1990s he switched to the back office and has been integrating and implementing retail systems ever since. Since 2006, he has worked for one of the UK’s largest e-commerce platform vendors as Head of Projects and, later, Head of Product Strategy. In that time he has worked on over 30 major e-commerce implementations. Outside of work, Jonathan, like many parents, has a busy schedule of sporting events, music lessons, and parties to take his kids to, and any downtime is often spent catching up with the latest tech news or trying to record electronic music in his home studio. You can get in touch with Jonathan at his website:


                          Submit Errata

                          Please let us know if you find any errors not listed here by completing our errata submission form. Our editors will check them and add them to this list. Thank you.

                          3 submitted: last submission 22 Aug 2013

                          The code for this book has been re-uploaded.

                          Errata type: Technical | Page number: 41

                          By default, this is configured to 50, but let's change this to 0, which is the number used to configure no limit.
                          Should be: By default, this is configured to 50, but let's change this to -1, which is the number used to configure no limit.

                          Errata type: Technical | Page number: 61

                          Chapter 3, Page 61, Bullet point 1: In this bullet point, the exercise is configuring the output so that the ID field (of the output) is the customer ID of the input, left-padded with zeros. In order for the left-padding format to be allowable, the data type of the output ID must be set to “string” and not stay as “integer”, which is its original type.
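
The type requirement in this erratum can be illustrated in plain Java, the language that TOS generates. The sketch below is not taken from the book: the class name CustomerIdPadding, the helper padId, the sample ID 42, and the width of 6 are all hypothetical, and String.format is only one of several ways to left-pad a number.

```java
import java.util.Locale;

public class CustomerIdPadding {

    // Hypothetical helper (not from the book): left-pads a non-negative
    // customer ID with zeros up to the requested width.
    static String padId(int customerId, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", customerId);
    }

    public static void main(String[] args) {
        int inputId = 42;                     // input column typed as an integer
        String outputId = padId(inputId, 6);  // padded result can only be held as a String
        System.out.println(outputId);         // prints 000042

        // Converting the padded value back to an integer drops the zeros,
        // which is why an integer output column cannot carry the padding.
        int roundTrip = Integer.parseInt(outputId);
        System.out.println(roundTrip);        // prints 42
    }
}
```

An integer has no notion of leading zeros, so the padded value survives only while the output column is typed as a string.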



                          What you will learn from this book

                          • Transforming data files from one format to another
                          • Getting data in and out of a relational database
                          • Using common data operations such as filtering, sorting, and aggregating
                          • Managing files: moving, copying, renaming, and deleting
                          • Adding flow logic to integration jobs, including “if/then” operations and sequence dependencies
                          • Using dynamic variables to avoid hard-coded routines
                          • Using TOS in real-life scenarios, with lots of tips and tricks
                          • Integrating data to and from many different sources

                          In Detail

                          Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files and connect individual components in order to define complex integration processes.

                          "Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions.

                          TOS is a code generator and so does a lot of the “heavy lifting” for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a “toolkit” of techniques that you will learn how to apply in many different situations.

                          By the end of the book, once-complex integrations will appear easy and you will be your organization’s integration expert!

                          Best of all, TOS makes integrating systems fun!


                          "Getting Started with Talend Open Studio for Data Integration" takes a step-by-step, hands-on approach to learning with lots of examples and clear instructions.

                          Who this book is for

                          If you are a developer, business analyst, project manager, business intelligence specialist, system architect, or consultant who needs to undertake integration projects, then this book is for you.

                          The book assumes a certain level of familiarity with relational database management systems and some experience with SQL and Java.
