Getting Started with Talend Open Studio for Data Integration

Getting Started with Talend Open Studio for Data Integration: This is the complete course for anybody who wants to get to grips with Talend Open Studio for Data Integration. From the basics of transferring data to complex integration processes, it will give you a head start.

By Jonathan Bowen
$48.99
Book Nov 2012 320 pages 1st Edition
eBook
$28.99 $19.99
Print
$48.99
Subscription
$15.99 Monthly

What do you get with Print?

  • Instant access to your digital eBook copy whilst your Print order is shipped
  • Black & white paperback book shipped to your address
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Product Details


Publication date : Nov 6, 2012
Length 320 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781849514729

Chapter 1. Knowing Talend Open Studio

Ever since the second computer system came along, integrating systems has been a key part of the work of IT teams.

Today's IT landscape is increasingly complex, with enterprise resource planning (ERP), customer relationship management (CRM), finance, warehousing, human resources, and e-business systems, both within and outside the enterprise, all needing to exchange data. The real-time nature of business today and the fast pace of business change add to the need to have a set of tools and skills that make the business of integrating systems quick and easy. New systems come along all the time, but it is also a requirement to respond quickly to new business opportunities that drive system integrations. Company takeovers and mergers, new markets and customers, new suppliers, and joint ventures are commonplace events that all require data to be exchanged on a one-off or regular basis to make them work.

As you might expect, for such a critical systems-development activity, there is no end of options to choose from to fulfill the need. From complex multi-million dollar integration suites from the major systems vendors to humble, yet powerful, scripting languages such as Perl, there is something for every budget and taste. So what is Talend Open Studio for Data Integration and why should you consider it for your next integration project?

What Talend Open Studio is


Talend Open Studio for Data Integration is an open source graphical development environment for creating and deploying custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files, and connect individual components in order to define complex integration processes.

Talend Open Studio for Data Integration is a code generator, and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike. Talend Open Studio for Data Integration is easy to use and reduces the time taken to develop integrations from weeks and months to days or even hours.

Integration jobs are created from components that are configured rather than coded and jobs can be run from within the development environment or executed as standalone scripts.

Use cases

Some common use cases for Talend Open Studio for Data Integration include:

  • Data migration from one database to another: This is a common scenario when new systems are implemented or existing systems are upgraded. Data has to be populated into the new or upgraded system and database schemas may be subtly or completely different, requiring some modification of the data prior to loading. Data migrations tend to be "one-off" activities, not integrations that are deployed on an ongoing basis. The Studio facilitates data migrations through its many database connectors and actions.

  • Regular file exchanges between systems: The humble flat file is still a cornerstone of many systems integrations. Its low-tech approach makes it particularly suitable for batch processes when real-time data flows are unnecessary. File exchanges will often require some form of file transformation, of either data content, data format, or both. The Studio has the ability to manage many different file formats and, with its file management capabilities such as FTP and archiving (zipping), is able to facilitate a full end-to-end file exchange process.

  • Data synchronization: Enterprises often have multiple data repositories of the same data. For example, data about customers might reside in the CRM system, the finance system, and the distribution system. They will probably have similar but different data models across these systems and every time a change is made in one, the same change needs to be made in the others—typically a time-consuming and manual process. The Studio can be used to keep the data in sync across systems with jobs that automate and transform the data transfer.

  • ETL (Extract, Transform, and Load): A key component process of a data warehouse or business intelligence system, ETL processes extract data from operational systems, transform the data, applying a series of rules or functions, and load the data into a database or data warehouse system.
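
To make the extract, transform, and load stages concrete, here is a minimal hand-written sketch in Python. It is not what the Studio generates (the Studio builds jobs from configured components), and the sample data, the customer table, and the function names are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an export from an operational system.
SOURCE_CSV = """id,name,country
1, alice ,gb
2,Bob,us
3, carol ,gb
"""


def extract(text):
    """Extract: read rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and title-case names, upper-case country codes."""
    return [
        (int(r["id"]), r["name"].strip().title(), r["country"].strip().upper())
        for r in rows
    ]


def load(records, conn):
    """Load: insert the transformed records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order and return the loaded rows."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT id, name, country FROM customer ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
    for row in run_etl(SOURCE_CSV, connection):
        print(row)
```

Keeping the three stages as separate functions mirrors the way an integration job chains an input component, a transformation step, and an output component.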

History of Talend Open Studio

Talend was founded in 2005 and is an open source software vendor providing solutions for data integration, data quality, master data management, enterprise service bus, and business process management.

Talend's first product, Talend Open Studio for Data Integration, was launched in 2006, under the name Talend Open Studio, and has since been downloaded over 20 million times. Talend has continued to develop its product portfolio and has added complementary tools that provide a single platform for application, data, and process integration. The Talend Open Studio brand has since been adopted across the range of Talend's products.

Benefits of Talend Open Studio

An obvious question to ask is "Why should I use Talend Open Studio over other similar products? What can it do for me?" Talend Open Studio for Data Integration offers a number of benefits:

  • The Studio is open source, free to download and use, with access to the source code, allowing users to extend the product to their particular needs if required.

  • The Studio is a great productivity-booster. It's easy to learn and quick to develop with. Even novice developers will be building complex integrations in no time.

  • The Studio's pre-built components handle many common and not-so-common tasks. Developers can focus on the end-to-end process, rather than the low-level technical details.

  • Talend has an active and open user community. Practical, problem-solving advice is easy to access.

Installing Talend Open Studio


Before we can begin, we need to install the Studio. Talend provides installation guides and other material on its wiki at the following URL:

http://www.talendforge.org/wiki/doku.php?id=doc:installation_guide

We will also cover the basic installation instructions here.

Prerequisites

The Studio is a cross-platform application, running on Windows, Linux, and Mac OS. A list of hardware and software prerequisites can be found at http://www.talend.com/docs/community/prerequisites.html .

As a minimum, you will need a supported operating system, Java, and of course, the Studio itself.

Installation guide

The installation process for the Studio is essentially the same across all supported operating systems. We will show how to complete the installation on Windows, but you can follow the same steps on other platforms.

Follow these steps to install the Studio on Windows:

  1. Check to see if Java is installed on your computer by opening a command window and running the following command:

    java -version
    
  2. If Java is present, you will see a message showing which version is installed, as shown in the following screenshot:

    In the preceding screenshot, you can see that Version 1.7.0_05 of Java is installed. If Java is not present, you will get an error message, as shown in the following screenshot:

  3. If you need to install Java, visit the following URL to download a Java installer:

    http://www.oracle.com/technetwork/java/javase/downloads/index.html

    There are various versions of the Java Standard Edition JDK for different operating systems. Choose the appropriate version for your computer and download the installer to your computer.

  4. Once the installer is downloaded, click on the executable file to run it. Follow the instructions on the installer as it progresses.

  5. Now that Java is installed, we can download and install the Studio. Start by going to the Talend download page at the following URL:

    http://www.talend.com/download.php

  6. Choose the Data Integration tab and click on the Download button for Talend Open Studio for Data Integration, as shown in the following screenshot:

  7. Once it has downloaded, double-click on the executable to extract the Studio files as shown in the following screenshot:

  8. Follow the installation instructions on-screen. You will be prompted to choose an installation directory. Enter an appropriate location such as C:\Talend, as shown in the following screenshot:

    Once the installation is complete, you can start the Studio and start to develop jobs. See Chapter 2, Working with Talend Open Studio, for details on how to start the Studio.
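
The Java check in steps 1 and 2 can also be scripted. The following is a minimal sketch, assuming Python is available on the machine; the function names are hypothetical, and the regular expression assumes the usual banner format in which the version appears in double quotes after the word "version".

```python
import re
import shutil
import subprocess


def parse_java_version(output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', output)
    return match.group(1) if match else None


def installed_java_version():
    """Run `java -version` if java is on the PATH; return None otherwise."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True
    )
    # The JVM prints its version banner to stderr, so check both streams.
    return parse_java_version(result.stderr + result.stdout)


if __name__ == "__main__":
    version = installed_java_version()
    if version:
        print(f"Java {version} found")
    else:
        print("Java not found; install it before installing the Studio")
```

Separating the parsing from the process call means the parsing can be checked against a sample banner without Java being installed.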

Other useful software


In order to follow the sample jobs throughout the book, you may wish to install some additional software.

Text editor

A decent text editor will be very useful to view CSV and XML files. There are hundreds of text editors—both free and paid-for—and here are some recommendations if you don't already have a favorite:

MySQL

Chapter 4, Working with Databases, focuses on using the Studio to extract data from and insert data into a relational database system. The Studio supports many different database systems, but for the examples in this book, we have chosen to use MySQL.

MySQL is the most popular open source relational database and is used by many large-scale applications and websites. It is free to use and there are a number of tools you can use to administer databases. To follow the examples as they are, use MySQL. However, if you have another preferred database you wish to use, it should not be too difficult to modify the job examples to incorporate other database components instead of the illustrated MySQL components.

MySQL Community Server can be downloaded from the following URL:

http://dev.mysql.com/downloads/mysql/

Installation instructions for various operating systems can be found at the following URL:

http://dev.mysql.com/doc/refman/5.1/en/installing.html

Once you have installed the MySQL server, download and install the client tools, which you can use to administer the database, view data, and so on. The MySQL Workbench can be downloaded from http://www.mysql.com/downloads/workbench/ .

MySQL Workbench documentation, including installation instructions, can be found at http://dev.mysql.com/doc/workbench/en/ .

Readers who wish to use other database systems can find a full list of supported databases at http://www.talendforge.org/components/ .

The list includes Oracle, DB2, MS SQL, Postgres, SQLite, and Sybase, among others. TOS also supports the JDBC API, so it can connect to any relational database that supports this protocol.

Sample jobs and data


Each chapter of the book contains a number of example jobs that we will construct in a systematic manner. Readers are encouraged to follow the steps in order to get the most out of the book and consolidate their learning as they go. However, you can download and import the full set of example jobs if you wish.

Additionally, some jobs rely on database data and file-based data sources to work correctly. Again, these data sources can be downloaded and installed prior to working through the examples.

Appendix A, Installing Sample Jobs and Data, gives full instructions on downloading and installing the example jobs and data files.

Note

Some sample data files may have their encoding changed as they are downloaded, unzipped, and copied from one location to another. As a result, the Studio may occasionally report encoding errors. If this happens, open the offending file and ensure it is saved with the UTF-8 encoding.
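
If several files are affected, re-saving them can be scripted instead of done by hand in a text editor. This is a minimal sketch and not part of the book's sample code; the function name is hypothetical, and the default source encoding of cp1252 (Windows Western European) is an assumption that may not match your files.

```python
from pathlib import Path


def resave_as_utf8(path, source_encoding="cp1252"):
    """Rewrite a text file as UTF-8 unless it already decodes as UTF-8.

    Returns True if the file was rewritten, False if it was left alone.
    source_encoding is an assumption about how the file was originally saved.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        Path(path).write_bytes(text.encode("utf-8"))
        return True
```

Working on raw bytes leaves line endings untouched, so only the character encoding changes. Keep a copy of the original file, because a wrong guess at the source encoding produces garbled characters rather than an error.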

Summary


Welcome to Talend Open Studio for Data Integration! In this chapter, we learned what the Studio is and what it can be used for. We walked through installing the Studio on your computer (along with some additional useful software).

Our next step is to log on to the Studio, become familiar with the Studio working environment, and create a simple job to illustrate the development workflow. All of this will be covered in Chapter 2, Working with Talend Open Studio.


Key benefits

  • Develop complex integration jobs without writing code
  • Go beyond "extract, transform and load" by constructing end-to-end integrations
  • Learn how to package your jobs for production use

Description

Talend Open Studio for Data Integration (TOS) is an open source graphical development environment for creating custom integrations between systems. It comes with over 600 pre-built connectors that make it quick and easy to connect databases, transform files, load data, move, copy and rename files, and connect individual components in order to define complex integration processes.

"Getting Started with Talend Open Studio for Data Integration" illustrates common uses and scenarios in a simple, practical manner and, building on knowledge as the book progresses, works towards more complex integration solutions. TOS is a code generator and so does a lot of the "heavy lifting" for you. As such, it is a suitable tool for experienced developers and non-developers alike.

You'll start by learning how to construct some common integration tasks, such as transforming files and extracting data from a database. These building blocks form a "toolkit" of techniques that you will learn how to apply in many different situations. By the end of the book, once-complex integrations will appear easy and you will be your organization's integration expert! Best of all, TOS makes integrating systems fun!

What you will learn

  • How to transform data files from one format to another
  • Getting data in and out of a relational database
  • Using common data operations such as filtering, sorting and aggregating
  • Managing files: moving, copying, renaming and deleting
  • Adding flow logic to integration jobs, including "if/then" operations and sequence dependencies
  • How to use dynamic variables, avoiding hard-coded routines
  • Using TOS in real-life scenarios with lots of tips and tricks
  • Learn how to integrate data to and from many different sources

Table of Contents

22 Chapters
Getting Started with Talend Open Studio for Data Integration
Credits
Foreword
Foreword
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Knowing Talend Open Studio
Working with Talend Open Studio
Transforming Files
Working with Databases
Filtering, Sorting, and Other Processing Techniques
Managing Files
Job Orchestration
Managing Jobs
Global Variables and Contexts
Worked Examples
Installing Sample Jobs and Data
Resources
Index

FAQs

What is the delivery time and cost of a print book?

Shipping Details

USA:

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K. time will start printing on the next business day, so the estimated delivery times start from the next day as well. Orders received after 5 PM U.K. time (in our internal systems) on a business day, or at any time on the weekend, will begin printing on the second business day after the order. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-Bissau
  9. Iran
  10. Lebanon
  11. Libyan Arab Jamahiriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is custom duty/charge?

Customs duties are charges levied on goods when they cross international borders; they are a tax imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order?

Orders shipped to the countries listed under EU27 will not bear customs charges. They are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea

For shipments to countries outside the EU27, a customs duty or localized taxes may be applicable and would be charged by the recipient country. These must be paid by the customer and are not included in the shipping charges on the order.

How do I know my custom duty charges?

The amount of duty payable varies greatly depending on the imported goods, the country of origin, and several other factors such as the total invoice amount, dimensions such as weight, and other criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.
How can I cancel my order?

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction ID. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you, then once you receive it you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except in the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or has a material defect). Otherwise, Packt Publishing will not accept returns.

What is your returns and refunds policy?

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact the Customer Relations Team at customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items (damaged, defective or incorrect).
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged or with a material defect, contact our Customer Relations Team at customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner on a print-on-demand basis.

What tax is charged?

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use?

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal