Dear reader, welcome to an intuitive way of analyzing data. Using a visual programming language based on dataflows, you can build an analysis process that is easy to understand, while KNIME internally checks for and signals some of the common problems. Any environment that does not help with proper documentation would be destined to fail; KNIME's success rests not just on its high-quality, cross-platform code, but also on good descriptions of what its building blocks do and how you can use them.
This book covers the most common tasks required during the data preparation and visualization phases of data analysis with KNIME. Because of size constraints, and to offer the best value to readers who are already familiar with modeling or not interested in it, we have not covered the modeling and machine learning algorithms available for KNIME. If you already know these algorithms, you will quickly find your way around the corresponding options in KNIME, which are straightforward to use, so you lose almost nothing. If you have not yet found time to get acquainted with these concepts, we encourage you to first learn what these procedures are good for and when you should use them. Good books, courses, and training are available and are the ideal ways to learn, but the Wikipedia articles can also give you a basic introduction to the specific algorithm you want to use.
Chapter 1, Installation and Using KNIME, introduces the user interface, the concepts used in the first three chapters, and how you can install and configure KNIME and its extensions.
Chapter 2, Data Preprocessing, covers the most common tasks needed before you can analyze your data, such as loading, transforming, and generating data; it also introduces the powerful regular expressions and presents some case studies.
Chapter 3, Data Exploration, describes how you can use KNIME to get an overview of your data, how you can visualize it in different forms, and even how you can create publication-quality figures.
Chapter 4, Reporting, introduces the KNIME reporting extension with its specific concepts, its user interface, and the basic building blocks of reports.
You only need a KNIME-compatible operating system: a modern Linux, Mac OS X (10.6 or above), or Windows XP or above. The Java runtime is bundled with KNIME, and the first chapter describes how you can download and install KNIME. For the download, you will also need an Internet connection.
This book is designed to give a good start to data scientists who are not yet familiar with KNIME. Others who are not familiar with programming but need to load and transform their data in an intuitive way might also find this book useful.
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text are shown as follows: "In the first case, you have not much control about the details, for example, a Pattern object will be created for each call of the facade methods delegating to the Pattern class"
A block of code is set as follows:
    // system imports
    // Your custom imports:
    import java.util.regex.*;
    // system variables
    // Your custom variables:
    Pattern tuplePattern = Pattern.compile("\\((\\d+),\\s*(\\d+)\\)");
    // expression start
    // Enter your code here:
    if (c_edge != null) {
        Matcher m = tuplePattern.matcher(c_edge);
        if (m.matches()) {
            out_edge = m.replaceFirst("($2, $1)");
        } else {
            out_edge = "NA";
        }
    } else {
        out_edge = null;
    }
    // expression end
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
    // system imports
    // Your custom imports:
    import java.util.regex.*;
    // system variables
    // Your custom variables:
    Pattern tuplePattern = Pattern.compile("\\((\\d+),\\s*(\\d+)\\)");
    // expression start
    // Enter your code here:
    if (c_edge != null) {
        Matcher m = tuplePattern.matcher(c_edge);
        if (m.matches()) {
            out_edge = m.replaceFirst("($2, $1)");
        } else {
            out_edge = "NA";
        }
    } else {
        out_edge = null;
    }
    // expression end
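The sample snippet above relies on the names c_edge and out_edge, which appear to be supplied by the surrounding KNIME snippet environment, so it does not compile on its own. If you want to try the same regular expression outside KNIME, the following is a minimal standalone sketch. The class name EdgeSwap and the method swap are hypothetical names introduced only for this illustration; the method parameter stands in for c_edge and the return value for out_edge.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone wrapper around the sample snippet's logic.
final class EdgeSwap {
    // Compiled once and reused, like tuplePattern in the sample snippet.
    private static final Pattern TUPLE_PATTERN =
            Pattern.compile("\\((\\d+),\\s*(\\d+)\\)");

    // Swaps the two numbers of a "(a, b)" tuple.
    // Returns "NA" if the input is not such a tuple, and null for null input.
    static String swap(String edge) {
        if (edge == null) {
            return null;
        }
        Matcher m = TUPLE_PATTERN.matcher(edge);
        if (m.matches()) {
            // $1 and $2 refer to the first and second captured number.
            return m.replaceFirst("($2, $1)");
        }
        return "NA";
    }

    public static void main(String[] args) {
        System.out.println(swap("(1, 2)"));   // prints "(2, 1)"
        System.out.println(swap("(10,20)"));  // prints "(20, 10)"
        System.out.println(swap("1-2"));      // prints "NA"
    }
}
```

Keeping the compiled Pattern in a static field mirrors the sample's use of a custom variable: the expression is compiled once rather than on every call.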
Any command-line input or output is written as follows:
$ tar -xvzf knime_2.8.0.linux.gtk.x86_64.tar.gz -C /path/to/extract
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "Eclipse's main window is the workbench".
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <feedback@packtpub.com>, and mention the book title via the subject of your message.
If there is a topic in which you have expertise, and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <copyright@packtpub.com> with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
You can contact us at <questions@packtpub.com> if you are having a problem with any aspect of the book, and we will do our best to address it.