
You're reading from  Talend Open Studio Cookbook

Product type: Book
Published in: Oct 2013
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781782167266
Edition: 1st Edition
Author: Rick Barton

Rick Barton is a freelance consultant who has specialized in data integration and ETL for the last 13 years as part of an IT career spanning over 25 years. After gaining a degree in Computer Systems from Cardiff University, he began his career as a firmware programmer before moving into Mainframe data processing and then into ETL tools in 1999. He has provided technical consultancy to some of the UK's largest companies, including banks and telecommunications companies, and was a founding partner of a Big Data integration consultancy. Four years ago he moved back into freelance development and has been working almost exclusively with Talend Open Studio and Talend Integration Suite, on multiple projects of various sizes in the UK. It is on these projects that he has learned many of the lessons that can be found in this, his first book.

Chapter 12. Common Mistakes and Other Useful Hints and Tips

This chapter contains a collection of useful tips and information that should help resolve some common issues and answer some common questions.

  • My tab is missing

  • Finding the code routine

  • Finding a new context variable

  • Reloads going missing at each row global variable

  • Dragging component globalMap variables

  • Some complex date formats

  • Capturing tMap rejects

  • Adding job name, project name, and other job-specific information

  • Printing tMap variables

  • Stopping memory errors in Talend

Introduction


This chapter is unlike the other chapters because it doesn't contain a set of exercises; rather, it is a collection of useful information and techniques that don't really fit into the earlier chapters.

It is impossible to include everything that is missing from the previous chapters, so we have tried to incorporate hints and tips that we believe will prove most useful.

My tab is missing


If you find that a tab, say your Run job or context tab, has gone missing, perhaps because you accidentally closed it, then there are two options for getting it back.

How to do it…

The first option restores a single tab; the second resets your whole UI.

Show view

This method allows you to simply restore a missing tab.

  1. Click on Window, then click on Show view.

  2. Open the Talend folder if it isn't already open, then click on the tab that you are missing.

Reset the perspective

This option resets the UI to its original layout, so it is more disruptive than the previous method.

  1. At the top right-hand side of the Studio is a list of perspectives.

  2. Click the Integration perspective option.

  3. Right-click it, then click on Reset, as shown in the next screenshot.

  4. Click on OK on the dialog and your whole Integration view will be reset to the default, which will return your missing tabs.

Finding the code routine


Occasionally, when you call a Talend function or a function that you have created in a code routine, you will receive a compilation error message about your routine, such as: myRoutines cannot be resolved.

This is usually because the link between the code routine and the job has been lost. This can easily be re-established.

How to do it…

  1. Close the job on which you are working.

  2. Right-click the job in the Repository panel, and click Setup routine dependencies.

  3. You should find that your routine is missing from the list displayed in the following dialogue. (Note that we do not have any attached routines here.)

  4. Click on +, then select your routine from the list that is then displayed. You should then see your routine in the User routines tab.
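For context, a code routine is an ordinary Java class with static methods that job expressions call by class name. The sketch below assumes a hypothetical routine called myRoutines with a hypothetical method trimToNull; neither comes from the book. In the Studio the class would normally sit in the routines package, which is left out here so that the example compiles on its own.

```java
// Hypothetical user routine. In Talend Studio this class would live in the
// "routines" package; the package line is omitted so the sketch compiles alone.
class myRoutines {
    /** Returns null for null or blank input, otherwise the trimmed text. */
    public static String trimToNull(String value) {
        if (value == null) {
            return null;
        }
        String trimmed = value.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }
}

class RoutineCallSketch {
    public static void main(String[] args) {
        // A job expression calls the routine by class name. If the routine
        // dependency is missing from the job, this is the kind of line that
        // fails with "myRoutines cannot be resolved".
        String cleaned = myRoutines.trimToNull("  Cardiff  ");
        System.out.println(cleaned);
    }
}
```

Once the routine appears in the User routines tab, a call of this shape should compile again.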

Finding a new context variable


If you add a new variable to a context group in the Repository while you have a job open (which is a normal thing to do), then Talend will not automatically add it to your job.

This means that when you run your job, expecting your new context variable to be present, you will get a compile error.

How to do it…

  1. Open the context tab in your job.

  2. Click the group you have just changed, then the button.

  3. You will see that the tick box is a blob, not a tick.

  4. If you expand the context, you will see your new variable, but it will not be ticked.

  5. Tick your variable to include it in your job, and exit the context tab.

Reloads going missing at each row global variable


When using reload at each row with a globalMap Key (as seen in the next screenshot), Talend allows you to cut and paste expressions into the globalMap variable, but when you leave the tMap component and come back in again, you will see that the pasted expression has not been saved.

How to do it...

To get around this, you have two options:

  1. Drag the field from an input source. This option is limited, in that the expression will be just the field name, so you cannot apply any other logic to the variable, such as substring or uppercase.

  2. The second (and preferred) option is to edit the expression in the Expression editor tab. This method allows any expression to be coded to ensure that the variable is set correctly, as shown in the next screenshot:
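To make the idea concrete, here is a minimal sketch of what a key expression with extra logic does on each main row. Everything in it is an assumption made for illustration: the key name lookupCode, the column values, the query text, and the plain HashMap standing in for the globalMap that Talend generates.

```java
import java.util.HashMap;
import java.util.Map;

class ReloadKeySketch {
    // Stand-in for the job-wide globalMap that Talend generates.
    static final Map<String, Object> globalMap = new HashMap<>();

    /** Mimics an expression typed into the Expression editor for the key. */
    static String keyExpression(String countryCode) {
        return countryCode.substring(0, 2).toUpperCase();
    }

    /** Mimics a lookup query that reads the key back for the current row. */
    static String lookupQuery() {
        return "SELECT name FROM country WHERE code = '"
                + (String) globalMap.get("lookupCode") + "'";
    }

    public static void main(String[] args) {
        String[] mainRows = {"gbr", "fra"};
        for (String code : mainRows) {
            // The key is set once per main row, before the lookup is reloaded.
            globalMap.put("lookupCode", keyExpression(code));
            System.out.println(lookupQuery());
        }
    }
}
```

Dragging the field would only give the equivalent of the raw column value; the substring and uppercase logic is what requires the Expression editor.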

Dragging component globalMap variables


All components produce one or more globalMap variables that can be used within other components, such as tJavaRow.

If you have lots of components, then using Ctrl + Space to locate your specific globalMap variable may be difficult.

A simpler method is to open the component tab for the component, ensuring that it is in panel mode, and that you can see the outline view in the bottom right-hand side of the Studio.

You can then simply expand the given component and drag the variables from the outline panel into your code panel, as shown in the next screenshot:
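A dragged variable arrives in your code as a cast around a globalMap.get call. The sketch below shows that shape, assuming a hypothetical component named tFileInputDelimited_1 and its row-count variable; the HashMap is only a stand-in for the globalMap that exists in every generated job.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalMapSketch {
    // Stand-in for the globalMap available inside a generated job.
    static final Map<String, Object> globalMap = new HashMap<>();

    /** Reads a component variable the way code in a tJavaRow would. */
    static String describeRowCount() {
        // Keys follow the pattern <component name>_<variable name>.
        Integer rows = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
        return "Rows read: " + rows;
    }

    public static void main(String[] args) {
        // In a real job the component sets this value itself.
        globalMap.put("tFileInputDelimited_1_NB_LINE", 42);
        System.out.println(describeRowCount());
    }
}
```

Dragging from the outline saves typing the key and the cast by hand, which is where most mistakes with these variables creep in.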

Some complex date formats


Java provides a wide range of date options that can be used to define date formats, but sometimes the options to choose for a particular date time string aren't immediately obvious.

Some date formats that may prove useful are as follows:

  • ISO 8601 with offset standard: This format contains date, time, and the offset from UTC, as well as the T character that designates the start of the time, for example, 2007-04-05T12:30:22-02:00.

    The pattern for this date and time format is yyyy-MM-dd'T'HH:mm:ssXXX.

  • Mtime pattern: The tFileProperties component returns a field named mtime_string that is a string representation of a date and time format, for example, Wed Mar 13 23:53:07 GMT 2013.

    The pattern for this date and time format is EEE MMM dd HH:mm:ss z yyyy.
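Both patterns can be tried outside Talend with java.text.SimpleDateFormat. The following is a minimal sketch using the two example strings above; the class and method names are hypothetical. It assumes Java 7 or later, since the XXX offset pattern is not available in earlier versions, and it fixes the locale to English so that the day and month names parse regardless of the machine's settings.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DatePatternSketch {
    static final String ISO_OFFSET = "yyyy-MM-dd'T'HH:mm:ssXXX";
    static final String MTIME = "EEE MMM dd HH:mm:ss z yyyy";

    /** Parses text with the given pattern, using English day and month names. */
    static Date parse(String pattern, String text) {
        try {
            return new SimpleDateFormat(pattern, Locale.ENGLISH).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Cannot parse '" + text + "' with pattern " + pattern, e);
        }
    }

    /** Formats a date with the given pattern in the GMT time zone. */
    static String formatGmt(String pattern, Date date) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date withOffset = parse(ISO_OFFSET, "2007-04-05T12:30:22-02:00");
        Date mtime = parse(MTIME, "Wed Mar 13 23:53:07 GMT 2013");
        // Print both in the ISO pattern, normalised to GMT.
        System.out.println(formatGmt(ISO_OFFSET, withOffset));
        System.out.println(formatGmt(ISO_OFFSET, mtime));
    }
}
```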

Capturing tMap rejects


The tMap component is the most powerful and flexible of the Talend components, but unless you know where to look, some of the options available aren't immediately obvious. Take, for example, the Die on error flag.

For most components, this is in the main component panel, but for tMap, it is in the tMap configurations dialog, as shown in the next screenshot:

Unchecking the Die on error box will create a new output error flow called ErrorReject, containing a message and a stack trace. Additional fields may be added if required, as shown in the next screenshot:

Adding job name, project name, and other job-specific information


Often, for logging or error messaging purposes, you need to capture information about the job, such as the job name or the project name.

Three common values that can be used in a job are shown in the following table:

Description            Variable
Job version            jobVersion
Job name               jobName
Talend project name    projectName

Talend also stores a host of other variables, such as parent and child process IDs, which can be easily found by opening an empty job and inspecting the Java code.
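As a small illustration of how these values might be combined for logging, the sketch below builds a message prefix from the three variables. The field values and the logPrefix helper are invented for the example; in a real job the three names are already in scope in the generated code, so only the concatenation expression would be needed.

```java
class JobInfoSketch {
    // Stand-ins for the fields Talend generates in the job class;
    // the values are invented for illustration.
    static final String projectName = "DEMO_PROJECT";
    static final String jobName = "load_customers";
    static final String jobVersion = "0.1";

    /** Builds a prefix identifying the job in log or error messages. */
    static String logPrefix() {
        return "[" + projectName + "." + jobName + " v" + jobVersion + "]";
    }

    public static void main(String[] args) {
        System.out.println(logPrefix() + " Job started");
    }
}
```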

Printing tMap variables


If you inspect the code generated from a tMap variable, you will see that each expression is converted into a line of the following format: output column = expression;

This suggests that the expression is limited to one line of Java code.

Although this is how we would normally treat tMap expressions (in order to avoid confusion), this isn't strictly true, and there is one scenario where breaking this rule may be useful.

The scenario in question relates to tMap variables. If a tMap variable expression throws an exception, and that expression itself depends on the result of another variable expression, then the job can become quite difficult to debug.

To make it easier to see what is happening in each step, we can add a System.out.println call to an expression to print the state prior to execution of the failing step.

In this case, we simply force the expression logic in the generated code to become: output column = expression; System.out.println(output column);

This is how it looks...
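The effect on the generated code can be imitated in plain Java. The sketch below is an approximation under stated assumptions: Row1, VarStruct, and the variable names are hypothetical stand-ins for the structures a tMap generates. In the tMap itself you would type only the part shown in each comment, without a trailing semicolon, because the generator appends one.

```java
class TMapPrintSketch {
    // Hypothetical stand-ins for the tMap input row and its Var structure.
    static class Row1 { String fullName = "rick barton"; }
    static class VarStruct { String upperName; int firstSpace; String surname; }

    static String run(Row1 row1) {
        VarStruct Var = new VarStruct();
        // Typed expression: row1.fullName.toUpperCase(); System.out.println("upperName=" + Var.upperName)
        Var.upperName = row1.fullName.toUpperCase(); System.out.println("upperName=" + Var.upperName);
        // Typed expression: Var.upperName.indexOf(' '); System.out.println("firstSpace=" + Var.firstSpace)
        Var.firstSpace = Var.upperName.indexOf(' '); System.out.println("firstSpace=" + Var.firstSpace);
        // If this step threw an exception, the two printed values above
        // would show the state just before the failure.
        Var.surname = Var.upperName.substring(Var.firstSpace + 1);
        return Var.surname;
    }

    public static void main(String[] args) {
        System.out.println(run(new Row1()));
    }
}
```

The print statements are best removed once the fault is found, since they run for every row.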

Stopping memory errors in Talend


When dealing with large amounts of data, there is often a trade-off between performance and memory usage, so it is likely that at some point in your Talend career, you will encounter a problem which is memory related.

This section will cover many of the actions that can be taken to ensure that you are able to deal with your memory errors quickly and efficiently.

Increasing the memory allocated to a job

If you have enough memory and yet your job is failing, then it is worth increasing the amount of memory available to the job you are running. You can do this by changing the value of the Java Xmx setting.

This setting is available via the Advanced Settings option from the Run tab, as shown in the next screenshot. Simply tick the box for Use specific JVM arguments, and change the value to suit your needs. Note that you can use g for gigabytes, for example, -Xmx3g.
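One way to confirm that a changed Xmx value has taken effect is to print the maximum heap size that the JVM reports. The class and method names below are hypothetical; inside a job, the single Runtime call could be placed in a tJava component, which is an assumption about where you would choose to put it.

```java
class MaxHeapSketch {
    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The reported figure is usually slightly below the -Xmx value.
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```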

Reducing lookup data

The tMap lookup data is by default stored in memory, so large lookups will consume...
