Getting Started with Talend Open Studio for Data Integration
- Develop complex integration jobs without writing code
- Go beyond “extract, transform and load” by constructing end-to-end integrations
- Learn how to package your jobs for production use
Book Details
Language : English
Paperback : 320 pages [ 235mm x 191mm ]
Release Date : November 2012
ISBN : 1849514720
ISBN 13 : 9781849514729
Author(s) : Jonathan Bowen
Topics and Technologies : All Books, Data, Open Source
Table of Contents
Preface
Chapter 1: Knowing Talend Open Studio
Chapter 2: Working with Talend Open Studio
Chapter 3: Transforming Files
Chapter 4: Working with Databases
Chapter 5: Filtering, Sorting, and Other Processing Techniques
Chapter 6: Managing Files
Chapter 7: Job Orchestration
Chapter 8: Managing Jobs
Chapter 9: Global Variables and Contexts
Chapter 10: Worked Examples
Appendix A: Installing Sample Jobs and Data
Appendix B: Resources
Index
- Chapter 1: Knowing Talend Open Studio
- What Talend Open Studio is
- Use cases
- History of Talend Open Studio
- Benefits of Talend Open Studio
- Installing Talend Open Studio
- Prerequisites
- Installation guide
- Other useful software
- Text editor
- MySQL
- Sample jobs and data
- Summary
- Chapter 2: Working with Talend Open Studio
- Studio definitions
- Starting the Studio
- Tour of the Studio
- The Repository
- The design workspace
- The Palette
- Configuration tabs
- Outline and Code panels
- Creating a new project
- Creating an example job
- Metadata
- Summary
- Chapter 3: Transforming Files
- Transforming XML to CSV
- Transforming CSV to XML
- Maps and expressions
- Advanced XML output for complex XML structures
- Working with multi-schema XML files
- Enriching data with lookups
- Extracting data from Excel files
- Extracting data from multiple sheets
- Joining data from multiple sheets
- Summary
- Chapter 4: Working with Databases
- Database metadata
- Extracting data from a database
- Extracts from multiple tables
- Joining within the database component
- Joining outside the database component
- Writing data to a database
- Database to database transfer
- Modifying data in a database
- Dynamic database lookup
- Summary
- Chapter 5: Filtering, Sorting, and Other Processing Techniques
- Filtering data
- Simple filter
- Filter and rejects
- Filter and split
- Sorting data
- Aggregating data
- Normalizing and denormalizing data
- Data normalization
- Data denormalization
- Extracting delimited fields
- Find and replace
- Sampling rows
- Summary
- Chapter 6: Managing Files
- Managing local files
- Copying files
- Copying and removing files
- Renaming files
- Deleting files
- Timestamping a file
- Listing files in a directory
- Checking for files
- Archiving and unarchiving files
- FTP file operations
- FTP Metadata
- FTP Put
- FTP Get
- FTP File Exist
- FTP File List and Rename
- Deleting files on an FTP server
- Summary
- Chapter 7: Job Orchestration
- What is a subjob
- A simple subjob
- On Subjob Error
- On Component OK
- Run If
- Jobs as subjobs
- Iterating and looping
- Iterate connections
- ForEach loop
- Loop "n" times
- Infinite loop
- Duplicating and merging dataflows
- Duplicating data
- Merging data
- Summary
- Chapter 8: Managing Jobs
- Job versions
- Exporting and importing jobs
- Exporting jobs
- Exporting a project
- Exporting a job
- Exporting a job for execution
- Importing jobs
- Importing a project
- Importing a job
- Scheduling jobs
- Summary
- Chapter 9: Global Variables and Contexts
- Global variables
- Studio global variables
- User defined global variables
- Contexts
- Embedded context variables
- Repository context variables
- External context variables
- Complex context variables
- Using embedded, repository, and external contexts
- Summary
- Chapter 10: Worked Examples
- Product catalog
- Data import from the ERP system
- Data import from Fabric Fashions
- Data import from Runway Collections
- Product inventory data
- Order file processing
- Order status updates
- Automating processes
- E-mailing daily sales
- Automating product visibility
- Summary
- Appendix A: Installing Sample Jobs and Data
- Downloading job and data files
- Sample data files
- Sample database
- Sample jobs
- Appendix B: Resources
- Talend documentation
- TalendForge forum
- Webinars
- Tutorials
- Talend Exchange
Jonathan Bowen
Errata
- 3 submitted: last submission 25 Apr 2013
The code for this book has been re-uploaded.
Errata type: Technical | Page number: 41
By default, this is configured to 50, but let's change this to 0, which is the number used to configure no limit.
Should be: By default, this is configured to 50, but let's change this to -1, which is the number used to configure no limit.
Errata type: Technical | Page number: 61
Chapter 3, Page 61, Bullet point 1: In this bullet point, the exercise is configuring the output so that the ID field (of the output) is the customer ID of the input, left-padded with zeros. In order for the left-padding format to be allowable, the data type of the output ID must be set to “string” and not stay as “integer”, which is its original type.
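The reason behind the page 61 erratum can be illustrated in plain Java, the language that TOS generates. This is only a minimal sketch, not the book's actual expression: the class name `LeftPadExample`, the helper `leftPadId`, and the width of 6 are assumptions made for illustration. It shows that zero-padding a number yields text, which is why the output column must be a string rather than an integer.

```java
// Hypothetical illustration; names and the padding width are assumptions,
// not taken from the book.
class LeftPadExample {

    // Left-pads a numeric customer ID with zeros to at least `width` digits.
    // The result is a String: an int cannot carry leading zeros.
    static String leftPadId(int customerId, int width) {
        return String.format("%0" + width + "d", customerId);
    }

    public static void main(String[] args) {
        String padded = leftPadId(42, 6);
        System.out.println(padded); // prints 000042

        // Converting back to an integer discards the padding again,
        // which is why an integer-typed output column cannot hold it.
        int roundTrip = Integer.parseInt(padded);
        System.out.println(roundTrip); // prints 42
    }
}
```

The width passed to `String.format` is a minimum, so IDs that are already longer than the requested width are returned unchanged rather than truncated.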
- How to transform data files from one format to another
- Getting data in and out of a relational database
- Using common data operations such as filtering, sorting and aggregating
- Managing files – moving, copying, renaming and deleting
- Adding flow logic to integration jobs, including “if/then” operations and sequence dependencies
- How to use dynamic variables, avoiding hard-coded routines
- Using TOS in real-life scenarios with lots of tips and tricks
- Learn how to integrate data to and from many different sources
Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files and connect individual components in order to define complex integration processes.
"Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions.
TOS is a code generator and so does a lot of the “heavy lifting” for you. As such, it is a suitable tool for experienced developers and non-developers alike. You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a “toolkit” of techniques that you will learn how to apply in many different situations.
By the end of the book, once-complex integrations will appear easy and you will be your organization’s integration expert!
Best of all, TOS makes integrating systems fun!
"Getting Started with Talend Open Studio for Data Integration" takes a step-by-step, hands-on approach to learning with lots of examples and clear instructions.
If you are a developer, business analyst, project manager, business intelligence specialist, system architect, or consultant who needs to undertake integration projects, then this book is for you.
The book assumes a certain level of familiarity with relational database management systems, along with some experience of SQL and Java.

