Alteryx Analytics is a tremendous platform allowing analysts to easily prep, blend, and analyze all their data using a repeatable workflow. Many business groups, such as marketing, finance, healthcare, and sales, find it difficult to quickly turn data into insights they can act upon when using legacy approaches such as Microsoft Excel and other platforms. Alteryx addresses these problems with a seamless process that uses tools to gather, cleanse, and join data from different sources. This repeatable workflow for self-service data analytics delivers deeper insights in hours, not weeks.
You will build and publish analytic models by using tools in a drag-and-drop environment within the same intuitive user interface. You'll learn data preparation and data cleansing from spreadsheets and other sources to determine key insights, and how to share those key insights.
In this chapter, we'll focus on the following foundation topics:
- Downloading and installing Alteryx Designer
- An introduction to Alteryx Designer and what makes it such a powerful self-service analytics platform
- A look inside the Alteryx Designer architecture and understanding how the Alteryx Engine drives data processing in a repeatable workflow
- An overview of the workflow configurations, ensuring indispensable selections are met to create an optimal workflow
- Exploring the tool palettes and the many tools they contain (getting familiar with the tool palettes will provide quick and easy access to the tools when designing your workflow)
- The Favorites tools to categorize and save most utilized tools
In this section, we will learn how to install Alteryx Designer to begin using the ultimate platform for self-service analytics. There are a few key items to note before Alteryx Designer is installed.
No licensing action is required when upgrading the Alteryx software. The existing Alteryx license will continue to function when a new version is installed. Due to compatibility concerns, workflows developed in a newer version will not launch in previous versions, whereas workflows built in older versions will continue to work when a newer version is installed. Lastly, check the technical specifications to ensure the system requirements are met before installing. Alteryx Analytics 11.0, which is what we will install for this book, and future versions are only available for 64-bit machines running Microsoft Windows 7 or later. The following chart lists the minimum system requirements:
Operating System Requirements
- Operating system: Microsoft Windows 7 or later
- Processor: Quad core i7
- Memory: 8 GB RAM
- Disk space: 500 GB - 1 TB
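The two pre-installation checks described above (version compatibility and minimum system requirements) can be expressed as a small sketch. This is only an illustration of the rules stated in the text: the `MachineSpec` class, the function names, and the way versions are written as tuples are assumptions made for this example, not part of Alteryx.

```python
from dataclasses import dataclass

# Minimums copied from the requirements chart above.
MIN_WINDOWS_RELEASE = 7
MIN_CPU_CORES = 4      # "quad core"
MIN_RAM_GB = 8
MIN_DISK_GB = 500


@dataclass
class MachineSpec:
    """Hypothetical description of a machine, filled in by hand."""
    windows_release: float
    is_64_bit: bool
    cpu_cores: int
    ram_gb: float
    disk_gb: float


def unmet_requirements(spec):
    """Return a list of human-readable problems; empty means the spec passes."""
    problems = []
    if not spec.is_64_bit:
        problems.append("a 64-bit machine is required")
    if spec.windows_release < MIN_WINDOWS_RELEASE:
        problems.append("Microsoft Windows 7 or later is required")
    if spec.cpu_cores < MIN_CPU_CORES:
        problems.append("a quad core processor is required")
    if spec.ram_gb < MIN_RAM_GB:
        problems.append("at least 8 GB RAM is required")
    if spec.disk_gb < MIN_DISK_GB:
        problems.append("at least 500 GB of disk space is required")
    return problems


def can_open_workflow(workflow_version, designer_version):
    """A workflow opens only if it was built in the same or an older version.

    Versions are (major, minor) tuples, for example (11, 0).
    """
    return workflow_version <= designer_version
```

For instance, `can_open_workflow((10, 6), (11, 0))` is true because the workflow is older than the installed Designer, while the reverse call is false.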
We'll now walk through the steps of downloading and installing Alteryx Designer. Before we get started, make sure to close all instances of Alteryx that you may have open:
- Navigate to the Alteryx Downloads site http://downloads.alteryx.com/ to download the software and select the Alteryx Designer option.
- Once the Alteryx Designer option is selected, the option for application download will appear at the bottom of your browser. Select this download.
- The open file dialog will appear. Select
- Two options will appear for the Download Manager. The first is the typical option that includes non-predictive tools. We will cover Predictive Analytics in this book, so we will select the second option, advanced.
- User Account Control will prompt. Select Yes to begin the installation.
- The Install dialog will appear where the Setup begins. The previous versions will be uninstalled. Select Next to install the prerequisites.
- The installation process begins. Select Next when necessary, read and accept the license agreement, and select Finish to complete the installation.
The Alteryx Designer is an intuitive drag-and-drop user interface for users to drag tools from a Tool Palette onto the canvas. These tools can be used to create Alteryx workflows, macros, and applications. This allows the users to run workflows instantly to process data. Alteryx Designer processes workflows from a local instance of the Alteryx Engine and is written primarily in C#. Users may publish their workflows, macros, and applications to the Alteryx Analytics gallery, where others can download and run them. Workflows can be scheduled at fixed times or at recurring intervals through the Alteryx Server deployment. Alteryx Designer has a Scheduler interface located within it to execute scheduled workflows.
The Alteryx Engine, written in C++, runs a workflow built in Alteryx Designer and produces its output. The Engine processes the data sources in-memory once the workflow is running. If processing surpasses memory limitations, data is written to temporary files on disk, which are deleted once the processing is complete. We installed Alteryx by selecting the option to install the suite of R tools used for predictive analysis.
Alteryx installs the R tools, used for statistical computing and graphics, through the R program and provides a connection between the Alteryx Engine and the R Engine. This allows for the tools to function in the workflow. A command line is used by the Alteryx Engine to communicate with the R Engine.
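The idea of processing in-memory and falling back to temporary files once memory limits are surpassed can be pictured with a generic external sort. The sketch below is not how the Alteryx Engine is implemented; the function name `sort_with_spill` and the `max_in_memory` parameter are assumptions chosen for illustration. It only shows the general pattern: fill a buffer, spill to a temporary file when the buffer is full, and delete the temporary files when processing completes.

```python
import heapq
import os
import tempfile


def sort_with_spill(values, max_in_memory):
    """Sort integers, spilling sorted chunks to temporary files when full.

    Returns (sorted_values, number_of_spill_files). The result is collected
    into a list only to keep the demonstration easy to inspect.
    """
    buffer = []
    spill_paths = []

    def spill():
        # Write the current buffer to disk as one sorted chunk, then empty it.
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            handle.writelines(f"{value}\n" for value in buffer)
        spill_paths.append(path)
        buffer.clear()

    try:
        for value in values:
            buffer.append(value)
            if len(buffer) >= max_in_memory:
                spill()
        buffer.sort()

        files = [open(path) for path in spill_paths]
        try:
            # Each spilled chunk is already sorted, so a streaming merge works.
            streams = [(int(line) for line in f) for f in files]
            result = list(heapq.merge(buffer, *streams))
        finally:
            for f in files:
                f.close()
        return result, len(spill_paths)
    finally:
        # Temporary files are deleted once processing is complete.
        for path in spill_paths:
            os.remove(path)
```

With a small `max_in_memory`, the same input produces the same sorted output but uses more spill files, which is the trade-off between memory and disk that the paragraph above describes.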
The Alteryx Engine may execute the following tasks depending on the workflow:
- Read or write input/output files and one or more databases
- Process external runtime commands
- Send email to the email server through SMTP
- Upload or download data from the web
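As an illustration of one task from the list above, sending email to a server through SMTP, here is a minimal sketch using Python's standard library. It is not Alteryx code; the function names, the subject wording, and the default port are assumptions made for this example, and the host you pass in would be your own mail server.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender, recipient, workflow_name, status):
    """Compose a plain-text message reporting how a workflow run ended."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Workflow '{workflow_name}' finished: {status}"
    message.set_content(
        f"The workflow {workflow_name} completed with status {status}."
    )
    return message


def send_notification(message, host, port=25):
    """Hand the message to an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(message)
```

Composing the message is kept separate from sending it so that the content can be checked without a mail server being available.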
Let's dive a little deeper into the Alteryx Engine on how it gets deployed across multiple servers. The Alteryx Service, written in C++ and C# wrappers, allows the Alteryx Engine to deploy the execution of workflows, management, and scheduling. This is accomplished by using a Controller-Worker architecture. The server utilizes the Controller to manage the jobs scheduled to run and the Worker performs the work. The Alteryx application files and job queues are stored by the Alteryx Persistence tier to perform the operations of the Alteryx Service.
The Alteryx Service Controller is responsible for the delegation of work and management of the service settings to the Alteryx Service Workers. When jobs are received from the Scheduler, the Controller views them within the persistence layer, where all queued jobs are maintained, and then delegates the jobs to the workers. This is where Alteryx Service Worker comes into action, as the Worker runs the job and produces the output. The system performance determines how many Workers are needed to run the jobs.
If the Worker is not on the same machine as the Controller, the Controller's name or IP address and the security token for that Controller must be specified so that the Controller and Worker can communicate.
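The Controller-Worker arrangement described above can be sketched in a few lines. This is a simplified model, not the Alteryx Service itself: the class and method names are hypothetical, an in-memory queue stands in for the persistence layer, and round-robin delegation is an assumption, since the text does not say how the Controller chooses a Worker.

```python
from collections import deque


class Worker:
    """Runs a job and produces its output."""

    def __init__(self, name):
        self.name = name

    def run(self, job):
        return f"{self.name} ran {job}"


class Controller:
    """Keeps queued jobs and delegates them to registered Workers."""

    def __init__(self, token):
        self._token = token
        self._queue = deque()   # stand-in for the persistence layer
        self._workers = []

    def register(self, worker, token):
        # A Worker must present the Controller's security token to join.
        if token != self._token:
            raise PermissionError("invalid controller token")
        self._workers.append(worker)

    def schedule(self, job):
        self._queue.append(job)

    def dispatch(self):
        """Delegate every queued job, rotating through the Workers."""
        if not self._workers:
            raise RuntimeError("no workers registered")
        results = []
        turn = 0
        while self._queue:
            job = self._queue.popleft()
            worker = self._workers[turn % len(self._workers)]
            results.append(worker.run(job))
            turn += 1
        return results
```

Adding more `Worker` instances spreads the queued jobs across them, which mirrors the point that system performance determines how many Workers are needed.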
The Alteryx architecture process flow diagram begins from the drag-and-drop workflow tools to executing results through the Alteryx Engine:
Alteryx provides the capability to quickly prepare and blend your data in a repeatable workflow, without the need for data science programming skills. Data acquisition is spreading rapidly across organizations, bringing an opportunity to join millions of rows of data from multiple data sources. Traditional platforms, like Microsoft Excel, aren't designed to handle such volumes of data. In addition, the drag-and-drop workflow offers data cleansing techniques that take minutes to produce, whereas traditional tools would take weeks to produce the same output. These traditional tools slow down the turnaround time for an analyst to solve business problems, and in today's market, business leaders demand quicker, deeper insights.
The ETL (Extract, Transform, and Load) blueprint that Alteryx provides is superior to tools like Excel, Access, and SQL. It gives the analysts the foundation to help the business move forward without time lags. Moreover, predictive analytic models built within Alteryx can be quickly expressed with visualization tools such as Tableau, Power BI, and QlikView.
The modern approach to business intelligence is in unlocking the power of data to meet strategic decision-making. The data models Alteryx generates are vital to producing a normalized data structure, as it's not about how much data can be processed but about how much data makes a meaningful impact. The executive layout reports can be easily created in Alteryx, all within the reporting Tool Palette, which we will thoroughly cover in Chapter 7, Creating Data-Driven Custom Reports.
Data can help you drive towards the objective view of how to seize opportunities and meet your data-driven goals. Meeting such goals is important for the mission of an organization. The data analysis possibilities are limitless when the focus is on core analytical outcomes. Now that we understand what Alteryx can produce, let's begin building a culture of self-service analytics by going through what's inside the Alteryx Designer.
The main menu includes the File, Edit, View, Options, and Help dropdowns.
Let's view the available Main Menu drop-down selections:
- File: New Workflow, Open Recent, Open Workflow, Open Autosaved Files, Save, Save As, Print, Print Settings, and Exit
- Edit: Undo, Redo, Cut, Copy, Paste, and Delete
- View: Toolbar, Tool Palette, Overview, Results, Configuration, Interface Designer, and Find Tool
- Options: Run Workflow, Schedule Workflow, View Schedules, Run Analytic Apps, Export Workflow, Activate License Key, Manage Licenses, User Settings, Advanced Options, and Download Predictive Tools
- Help: Alteryx Help, What's New, Getting Started, Sample Workflows, Community, Check for Updates, Alteryx Downloads, and About
The toolbar is where we can open a new or an existing workflow, save a workflow, copy, cut, paste, undo, redo, add a workflow to a schedule, zoom in, zoom out, and run a workflow.
All tools in Alteryx appear at the top within different tool palettes. They are divided into groups based on their function:
Once you open Alteryx Designer, you are presented with a blank canvas. This is where you build your process to transform and analyze your data with a set of tools:
The Canvas section can be used to set the layout direction, either Horizontal or Vertical. We'll be using the horizontal layout throughout this book. The Annotations drop-down can be set to Hide, Show, or Show w/ Tool Names.
The Connection Progress will show the downstream processing size and record count. This can be selected to Hide, Show, or Show Only When Running.
The Workflow section provides engine information and can be used to set the type of workflow: Standard, Analytic App, or Macro. We will cover more details on these types of workflows in the upcoming chapters.
The Runtime section allows for memory usage settings, location of temporary files, limiting conversion errors, and different options that will help in creating an efficient workflow.
The Events section can be used for documenting events and sending notifications by email.
The Meta Info section allows setting custom demographics for your workflow.
The following table lists the shortcuts that can be used to show and hide tools and navigate around the canvas.
The Undo and Redo and Copy and Paste shortcuts:
The Select and Align Tools shortcuts:
The Show and Hide Tools, Windows, Run, Open, Save, and Switch Workflows shortcuts:
The tools are organized into tool categories called Tool Palettes. This is quite helpful when building a workflow, as viewing a category at a glance quickly facilitates suitable workflow development. For instance, to build a workflow that is focused primarily on data cleansing and renaming of fields, swiftly select the Preparation Tool Palette to use the applicable tools for the workflow. We'll explore more on these types of tools and how to best utilize them in Chapter 3, Data Preparation and Blending. In this section, our goal is to add and remove tool palettes and pin them, so you can easily access them, which will help streamline your workflow.
Let's see how to select the tool categories to view the tools available. Select the Add/Remove Tools icon next to categories. The Configure Tool Palette window will appear, allowing you to:
- Select a configured
- Show/hide tool categories by selecting or deselecting various tool categories on the left side
- Select a Tool Palette on the left side, and then on the right side select/deselect tools to show/hide the tools
The following snapshot shows the Configure Tool Palette window:
A tool category can be locked by right-clicking on a tool palette and selecting Pin [Category]. In this case, the Spatial category will be pinned. The unlocked categories will remain to the right of the Favorites tool category, which by default is automatically pinned:
To unpin a tool category and return it to its original position, right-click on the tool category and select Unpin [Category] or Unpin All Groups.
The most frequently used tools in a workflow are in the Favorites category. Assigning a tool as a favorite will be helpful in building your workflow. Furthermore, you can create your own personal tool category consisting of multiple tools from different tool categories. Alteryx has preconfigured 13 tools under the Favorites category that are used most often by many users.
Favorites tools consist of: Browse, Filter, Formula, Input, Join, Output, Sample, Select, Sort, Summarize, Comment, Text Input, and Union. It is vital to get used to using these tools, since they are widely used in workflows.
There are currently over 200 tools available in Alteryx, and becoming familiar with the Favorites tools will be the first building block to building an effective production workflow.
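For readers with a programming background, it may help to picture what a few of the Favorites tools do to a set of records. The sketch below is a loose analogy in plain Python, not Alteryx functionality: the function names, the field names, and the sample rows are all invented for illustration. It mimics a Filter, a Sort, and a Summarize (group and sum) step chained together the way tools are connected on the canvas.

```python
from collections import defaultdict

# Illustrative sample rows; the fields and values are made up.
records = [
    {"region": "East", "sales": 120},
    {"region": "West", "sales": 150},
    {"region": "East", "sales": 80},
    {"region": "West", "sales": 40},
]


def filter_records(rows, predicate):
    """Keep only the rows for which the predicate is true (like Filter)."""
    return [row for row in rows if predicate(row)]


def sort_records(rows, field):
    """Order rows by one field, ascending (like Sort)."""
    return sorted(rows, key=lambda row: row[field])


def summarize(rows, group_by, value):
    """Group rows by one field and sum another (like Summarize)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)


# Chain the steps: drop small sales, order by amount, then total per region.
large_sales = filter_records(records, lambda row: row["sales"] >= 50)
ordered = sort_records(large_sales, "sales")
totals_by_region = summarize(ordered, "region", "sales")
```

Each function takes rows in and hands rows (or a summary) out, which is the same input-to-output pattern the tools follow when they are connected in a workflow.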
The Alteryx drag-and-drop interface allows for a seamless repeatable workflow to rapidly process and analyze data. The chapter kicked off with installing Alteryx Designer to begin building a successful workflow. You've taken the first steps towards understanding how Alteryx works behind the scenes, through the architecture of the Alteryx Engine and Alteryx Service. These two key components provide the horsepower for data processing by managing and running jobs. Along the way, you learned the foundation of Alteryx Designer and its workflow configurations, and how data preparation and data blending can be quickly processed within an analytic workflow to deliver insights in hours, not weeks. You also got acquainted with the Tool Palette, consisting of multiple tools grouped in categories, which can be added or removed based on the workflow design you set out to achieve. Finally, you learned that using the Favorites tools will help you expedite your workflow development.
In the next chapter, we'll explore how to develop an efficient workflow by resource and design. You will learn the best practices around resource optimization, speed processing, and utilizing the performance profiling to identify potential gaps in efficiently processing data. We'll go through how to connect to data and what type of connections can be made, and get familiar with a variety of Alteryx file types. You will be prepared to develop optimal workflows and gain deeper insights within hours, not weeks.