Getting Started with Talend Open Studio for Data Integration

Product type: Book
Published in: Nov 2012
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781849514729
Edition: 1st Edition
Author: Jonathan Bowen

Jonathan Bowen is an E-commerce and Retail Systems Consultant and has worked in and around the retail industry for the past 20 years. His early career was in retail operations; in the late 1990s he switched to the back office and has been integrating and implementing retail systems ever since. Since 2006, he has worked for one of the UK's largest e-commerce platform vendors as Head of Projects and, later, Head of Product Strategy. In that time he has worked on over 30 major e-commerce implementations. Outside of work, Jonathan, like many parents, has a busy schedule of sporting events, music lessons, and parties to take his kids to, and any downtime is often spent catching up with the latest tech news or trying to record electronic music in his home studio. You can get in touch with Jonathan at his website: www.learnintegration.com.

Preface

We've all been there. Your boss drops you an e-mail saying:

Good news, we've just bought system X, which is going to make our lives a lot easier. First though, we need to hook it up to system Y for daily product and inventory feeds and system Z to post the financials back for invoicing. Should be easy, right? It's going to be live in two months. Any problems, please let me know. Oh... if you can get some extracts for the data warehouse at the same time, that would be great too.

What to do? Well, you could ask your senior developer to code some integration jobs from scratch, but they might be hard to maintain, particularly if he/she leaves the company. In addition, you know he/she is working flat out on another important project. Alternatively, you could ask your boss if you can invest in a proprietary integration suite, with a legion of highly paid consultants. That will certainly do the job, but the budget and timeline might not stretch to this.

Or you can take the new junior developer who joined your company a couple of weeks ago, dust off your business analyst and testing skills, and get the job done on time and on budget with Talend Open Studio for Data Integration.

Getting Started with Talend Open Studio for Data Integration is an introductory guide to solving this problem and many others like it.

What this book covers

Chapter 1, Knowing Talend Open Studio, introduces the reader to Talend Open Studio for Data Integration and what it can be used for. It also covers the installation of Talend Open Studio for Data Integration.

Chapter 2, Working with Talend Open Studio, introduces some common concepts the reader will come across when using Talend Open Studio for Data Integration, including creating a workspace to contain integration jobs, a tour of the Talend Open Studio for Data Integration interface, and use of metadata and schemas. We'll also build a simple "hello world" job.

Chapter 3, Transforming Files, gets into the detail of building integrations and looks at using Talend Open Studio for Data Integration to transform files from one format to another.

Chapter 4, Working with Databases, looks at databases—how to get data out and how to get data in.

Chapter 5, Filtering, Sorting, and Other Processing Techniques, introduces common data operations: filtering, sorting, and aggregating.

Chapter 6, Managing Files, shows how to manage files during integration jobs. We'll look at renaming, moving, copying, and deleting files; how to timestamp a file; connecting to remote servers to FTP files; and zipping and unzipping files.

Chapter 7, Job Orchestration, looks at more complex integrations and how "one-shot" tasks can be combined to form multi-step jobs. We'll create subjobs and link them together using "if/then" logic. Integrations often produce temporary files, so we'll look at ways to clean up afterwards.

Chapter 8, Managing Jobs, covers the process of packaging, deploying, and scheduling jobs in a live environment.

Chapter 9, Global Variables and Contexts, looks at contexts and explores how the same job can be used in different environments. It also introduces dynamic variables, which allow integration jobs to run flexibly based on current runtime information rather than relying on complex, hardcoded routines.

Chapter 10, Worked Examples, brings together all of the knowledge from previous chapters in a series of worked examples. A real-life integration project is explored and developed to illustrate the use of Talend Open Studio for Data Integration "in the wild".

Appendix A, Installing Sample Jobs and Data, details how to obtain and use the sample data files required to follow the job development examples in the book. All of the jobs created throughout the book are also provided for reference.

Appendix B, Resources, highlights some resources and further reading to expand your knowledge of Talend Open Studio for Data Integration.

What you need for this book

The hardware and software requirements for this book are:

  • A computer running Windows, Linux, or Mac OS with Java installed

  • Talend Open Studio for Data Integration

  • A text file/XML editor

  • A MySQL database instance

Who this book is for

This book is for developers, business analysts, project managers, business intelligence specialists, system architects, and consultants who need to undertake integration projects. The book assumes a certain level of technical aptitude and readers should be comfortable with some of the following concepts and technologies:

  • Relational database management systems with some SQL (structured query language) experience

  • XML

  • Java

  • File Transfer Protocol (FTP)

  • Programming flow and logic

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text are shown as follows: "Create a file delimited metadata for the currencies.csv file."

A block of code is set as follows:

String datestamp=TalendDate.getDate("YYYYMMDD");
globalMap.put("dateStamp",datestamp);

Any command-line input or output is written as follows:

sh [file name].sh

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "Go to the Debug Run tab and click on Traces Debug".

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to , and mention the book title through the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website, or added to any list of existing errata, under the Errata section of that title.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at if you are having a problem with any aspect of the book, and we will do our best to address it.
