You're reading from IBM SPSS Modeler Essentials

Product type: Book
Published in: Dec 2017
Publisher: Packt
ISBN-13: 9781788291118
Edition: 1st Edition
Authors (2):
Jesus Salcedo

Jesus Salcedo has a PhD in psychometrics from Fordham University. He is an independent statistical consultant and has been using SPSS products for over 20 years. He is a former SPSS Curriculum Team Lead and Senior Education Specialist who has written numerous SPSS training courses and trained thousands of users.

Keith McCormick

Keith McCormick is a career-long practitioner of predictive analytics and data science. He has engaged in statistical modeling, data mining, and mentoring others in the area for more than 20 years. He has particular expertise in helping organizations perform their first predictive analytics project or build their first predictive analytics practice, and has done so in a variety of industries including healthcare, banking, telecommunications, non-profit, direct mail, pharmaceuticals, and retail. Keith is also an established author and speaker with four books in print or under contract. Although his consulting work is not restricted to any one tool, his writing and speaking have made him particularly well known in the IBM SPSS Statistics and IBM SPSS Modeler communities.

Chapter 2. The Basics of Using IBM SPSS Modeler

The previous chapter introduced the notion of data mining and the CRISP-DM process model. You learned what data mining is, why you would want to use it, and some of the types of questions you could answer with it. The rest of this book focuses on how you actually carry out data mining tasks: reading data, exploring variables, deriving new fields, developing models, and so on. Before we can get started on these projects, however, we first need to become familiar with the software that we will use to work on the data. In this chapter, you will:

  • Get an overview of the Modeler interface
  • Learn how to build streams
  • Get an introduction to various help options

Introducing the Modeler graphic user interface


IBM SPSS Modeler can be thought of as a data mining workbench that combines multiple tools and technologies to support the data mining process. Modeler allows users to mine data visually on the stream canvas.

The following figure shows the different areas of the Modeler interface:

As you can see, the Modeler interface consists of several components, which are described in the following sections.

Stream canvas

The stream canvas is the main work area in Modeler. It is located in the center of the Modeler user interface. The stream canvas can be thought of as a surface on which to place icons or nodes. These nodes represent operations to be carried out on the data. Once nodes have been placed on the stream canvas, they can be linked together to form a stream.

Palettes

Nodes (operations on the data) are contained in palettes. The palettes are located at the bottom of the Modeler user interface. Each palette contains a group of related nodes that are...

Building streams


As mentioned previously, Modeler allows users to mine data visually on the stream canvas. This means that you will not be writing code for your data mining projects; instead, you will be placing nodes on the stream canvas. Remember that nodes represent operations to be carried out on the data. Once nodes have been placed on the stream canvas, they need to be linked together to form a stream. A stream represents the flow of data through a number of operations (nodes). The following diagram is an example of nodes on the canvas, as well as a stream:

Given that you will spend a lot of time building streams, in this section you will learn the most efficient ways of manipulating nodes to create a stream.
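Although no code is written in Modeler itself, readers with a programming background may find it helpful to picture a stream as a chain of operations applied to the data in order. The following Python sketch is only an analogy, not Modeler's own API: the `Node` and `Stream` classes, the `link` and `run` methods, and the sample records are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, int]
Operation = Callable[[List[Record]], List[Record]]


@dataclass
class Node:
    """One operation on the data (hypothetical stand-in for a Modeler node)."""
    name: str
    operation: Operation


@dataclass
class Stream:
    """An ordered chain of linked nodes (hypothetical stand-in for a stream)."""
    nodes: List[Node] = field(default_factory=list)

    def link(self, node: Node) -> "Stream":
        # Linking appends the node to the end of the chain.
        self.nodes.append(node)
        return self

    def run(self) -> List[Record]:
        # Data flows through each node in the order the nodes were linked.
        data: List[Record] = []
        for node in self.nodes:
            data = node.operation(data)
        return data


def read_customers(_: List[Record]) -> List[Record]:
    # Made-up records standing in for a data file.
    return [
        {"id": 1, "age": 34},
        {"id": 2, "age": 17},
        {"id": 3, "age": 52},
    ]


def keep_adults(rows: List[Record]) -> List[Record]:
    # Keep only the records with an age of 18 or more.
    return [row for row in rows if row["age"] >= 18]


stream = (
    Stream()
    .link(Node("read data", read_customers))
    .link(Node("select adults", keep_adults))
)
result = stream.run()
print(result)
```

Each call to `link` plays the role of drawing a connection between two nodes on the canvas, and `run` plays the role of executing the stream.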

Mouse buttons

Mouse buttons are used extensively when building streams, so that nodes can be brought onto the canvas, connected, edited, and so on. Within Modeler, the mouse buttons are used in the following ways:

  • The left button is used for selecting...

Modeler stream rules


You may have noticed that in the previous example, we connected the Var. File node to the Table node and this worked fine. However, what if instead we tried to connect the Table node to the Var. File node? Let's try it:

  1. Right-click the Table node.
  2. Look for Connect in the context menu (notice that the Connect option does not exist).

Let's try something different:

  1. Bring a Statistics File node onto the canvas.
  2. Right-click on the Var. File node.
  3. Select Connect from the Context menu.
  4. Click the Statistics File node (notice that you get an error message when you try to connect these two nodes).

The reason we are experiencing these issues is that there are rules for creating Modeler streams.

Modeler streams are typically composed of three types of nodes: Source, Process, and Terminal nodes. Some connections between nodes make sense in the context of Modeler; others are not allowed.

In terms of general rules, streams always start with a Source node (a node from the Sources...
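The two experiments above can be summarized as a small connection check. The following Python sketch is a conceptual illustration only and does not use Modeler's scripting interface. It encodes just the two constraints that the experiments demonstrate: a Terminal node cannot pass data on, and a Source node cannot receive data. The complete rule set is not reproduced here, and `NodeType`, `NODE_TYPES`, and `can_connect` are hypothetical names.

```python
from enum import Enum


class NodeType(Enum):
    SOURCE = "source"
    PROCESS = "process"
    TERMINAL = "terminal"


# Classification of the three nodes used in the experiments above.
NODE_TYPES = {
    "Var. File": NodeType.SOURCE,
    "Statistics File": NodeType.SOURCE,
    "Table": NodeType.TERMINAL,
}


def can_connect(from_node: str, to_node: str) -> bool:
    """Return True if data may flow from from_node to to_node.

    Assumption: only the two constraints shown by the experiments
    are checked; this is not Modeler's full rule set.
    """
    if NODE_TYPES[from_node] is NodeType.TERMINAL:
        return False  # a Terminal node ends the stream
    if NODE_TYPES[to_node] is NodeType.SOURCE:
        return False  # a Source node starts the stream
    return True


for pair in [
    ("Var. File", "Table"),
    ("Table", "Var. File"),
    ("Var. File", "Statistics File"),
]:
    print(pair, can_connect(*pair))
```

The first pair corresponds to the connection that worked in the earlier example, while the other two correspond to the missing Connect option and the error message, respectively.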

Help options


When using Modeler, at some point we are going to need help. Modeler provides various help options.

Help menu

The most intuitive way to get help is to use the Help menu. As seen in the following figure, the Help menu provides several options:

  • Help Topics takes you to the Help System, where you can search for various topics
  • CRISP-DM Help provides an introduction to the CRISP-DM methodology
  • Application Examples offers a variety of real-life examples of using common data mining techniques for data preparation and modeling
  • Accessibility Help informs users about keyboard alternatives to using the mouse
  • What's This changes the cursor into a question mark and provides information about any Modeler item you select

Dialog help

Perhaps the most useful help option is context-sensitive help, which is available in whatever dialog box you are currently working in. For example, let's say that you are using the Var. File node and you either did not know how to use this node or you were unfamiliar...

Summary


In this chapter, you learned about the different components of the Modeler graphic user interface. You also learned how to build streams. Finally, you were introduced to various help options.

In the next chapter, we will take a detailed look at how to bring data into Modeler. We will also discuss how to properly set up the metadata for your fields.

 
