Real Time Analytics with SAP HANA

By Vinay Singh

About this book

SAP HANA is an in-memory database created by SAP. SAP HANA breaks traditional database barriers to simplify IT landscapes, eliminating data preparation, pre-aggregation, and tuning. SAP HANA and in-memory computing allow you to instantly access huge volumes of structured and unstructured data, including text data, from different sources.

Starting with data modeling, this fast-paced guide shows you how to add a system to SAP HANA Studio and create a schema, packages, and a delivery unit. Moving on, you'll get an understanding of real-time replication via SLT and learn how to use SAP HANA Studio to perform it. We'll also have a quick look at SAP BusinessObjects Data Services and the SAP Direct Extractor for data load. After that, you will learn to create HANA artifacts such as Analytical Privileges and Calculation Views. At the end of the book, we will explore the Smart Data Access option and the Application Function Library (AFL), and finally deliver pre-packaged functionality that can be used to build information models faster and more easily.

Publication date: October 2015
Publisher: Packt
Pages: 226
ISBN: 9781782174110

 

Chapter 1. Kickoff – Before We Start

This chapter provides the context and background you need to work with the datasets used for data modeling. It acts as a refresher that should help you pick up the modeling topics in the upcoming chapters faster.

We start the chapter with Structured Query Language (SQL) and how to use it to control and manipulate SAP HANA database objects and data. We then move on to SQLScript and learn how to use it effectively. We also walk through creating and calling a procedure step by step, which is a useful tool for the upcoming topics. We end the chapter with a detailed discussion of JOINS and how they can be used to connect tables in SAP HANA.

After completing this chapter you will be able to:

  • Understand and use SAP HANA SQL statements

  • Create SQLScript and use it

  • Create and call a procedure

  • Connect tables using SAP HANA specific JOINS

 

Introducing SAP HANA SQL


As stated, you will not learn SQL as a whole new concept here. Instead, we will revise the traditional SQL concepts at a glance and focus on a few topics that matter from an SAP HANA perspective. Our key focus will be on SAP HANA SQLScript, creating procedures, and learning to create SAP HANA specific JOINS.

Classical SQL

SQL is used to retrieve, store, and manipulate data in the database. SQL statements can be grouped under three subheads:

  • DDL (Data Definition Language): These statements are used to define the data: create, alter, and drop tables

  • DML (Data Manipulation Language): These statements are used to manipulate the data: select, insert, update, and delete

  • DCL (Data Control Language): These statements are used to control access to the data: grant and revoke
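
As a rough illustration of the three groups, the following sketch runs a few statements of each kind. The table DEMO_NOTES and the user REPORT_USER are hypothetical names chosen only for this example; they are not part of the demo schema used later in the chapter.

```sql
-- DDL: define and change a table.
-- DEMO_NOTES is a hypothetical table used only for this illustration.
CREATE COLUMN TABLE "DEMO_NOTES" (
    "NOTE_ID" INTEGER NOT NULL,
    "NOTE_TEXT" VARCHAR(100),
    PRIMARY KEY ("NOTE_ID") );
ALTER TABLE "DEMO_NOTES" ADD ("CREATED_ON" DATE);

-- DML: add, change, read, and remove rows.
INSERT INTO "DEMO_NOTES" VALUES (1, 'first note', CURRENT_DATE);
UPDATE "DEMO_NOTES" SET "NOTE_TEXT" = 'edited note' WHERE "NOTE_ID" = 1;
SELECT "NOTE_ID", "NOTE_TEXT", "CREATED_ON" FROM "DEMO_NOTES";
DELETE FROM "DEMO_NOTES" WHERE "NOTE_ID" = 1;

-- DCL: control who may read the table.
-- REPORT_USER is assumed to be an already existing database user.
GRANT SELECT ON "DEMO_NOTES" TO REPORT_USER;
REVOKE SELECT ON "DEMO_NOTES" FROM REPORT_USER;

-- DDL again: remove the illustration table.
DROP TABLE "DEMO_NOTES";
```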

The following are the elements of SQL:

  • Identifiers: These are used to represent names in SQL statements including table/view name, column name, username, role name and so on. There are two types of Identifiers: ordinary and delimited.

  • Data types: These define the characteristics of the data and its value. Data types in SQL are as follows:

    Category         | Data types
    Numeric          | float, real, integer, decimal, double, tinyint, smallint, and smalldecimal
    Large object     | blob, clob, nclob, and text
    Binary           | varbinary
    Character string | varchar, nvarchar, alphanum, and shorttext
    Date and time    | time, date, seconddate, and timestamp

  • Expressions: These are clauses that are evaluated to return values. SQL has different types of expressions, for example, if…then…else (case expressions) or nested queries (Select (Select …)).

  • Functions: These are used in expressions to retrieve information from the database. Among others, we have number functions and data type conversion functions. Number functions take numeric values, or strings containing numeric characters, and return numeric values, whereas data type conversion functions convert arguments from one data type to another. Examples of functions are to_alphanum, concat, and current_date.

  • Operators: These are used for comparing values, assigning values, or performing calculations. There are different types of operators, such as unary, binary, arithmetic, and string operators, to name a few. Examples are +, -, =, and, and or.

  • Predicates: A predicate is specified by combining one or more expressions or logical operators, and it returns one of the following logical or truth values: true, false, or unknown. Examples are null, in, and like.
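
To make these elements a little more concrete, here is a small sketch that combines several of them in one statement. It relies only on DUMMY, the built-in one-row table in SAP HANA; the column aliases are made up for the illustration.

```sql
-- DUMMY is SAP HANA's built-in one-row table, handy for trying out expressions.
SELECT
    -- Expression: a case expression (if ... then ... else).
    CASE WHEN 10 > 5 THEN 'greater' ELSE 'not greater' END AS COMPARISON,
    -- Functions: a data type conversion, a string function, and a date function.
    TO_ALPHANUM('42') AS AS_ALPHANUM,
    CONCAT('SAP ', 'HANA') AS JOINED_TEXT,
    CURRENT_DATE AS TODAY,
    -- Operators: arithmetic on numeric literals.
    (7 + 3) - 4 AS CALC,
    -- Identifiers: an ordinary identifier such as CALC is treated as uppercase,
    -- while a delimited identifier in double quotes keeps its exact spelling.
    'x' AS "mixedCaseColumn"
FROM DUMMY
-- Predicates: LIKE and IN each evaluate to true, false, or unknown.
WHERE 'HANA' LIKE 'HA%' AND 3 IN (1, 2, 3);
```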

In the upcoming chapters, we will learn how to work with SAP HANA Studio and its SQL editor, and I will show you how to apply the preceding SQL concepts. For our examples and exercises, we will use the following tables. We will create more tables in later chapters as we progress.

The following table shows the SALES_FACTS data:

PRODUCT_KEY | REGION_KEY | AMOUNT_SOLD | QUANTITY_SOLD
01          | 100        | 50000       | 500
02          | 200        | 60000       | 600
03          | 300        | 20000       | 200

The following table shows the CUSTOMER data:

CUSTOMER_KEY | CUST_LAST_NAME | CUST_FIRST_NAME
C1           | Mehta          | Yatin
C2           | Aguirre        | Tomas
C3           | Huber          | Ralf

The following is the REGION table:

REGION_ID | REGION_NAME | SUB_AREA
100       | Europe      | Germany
200       | Asia        | Japan
300       | US          | Northfields

The following table shows the details of the PRODUCTS table:

PRODUCT_KEY | PRODUCT_NAME
01          | GasKit
02          | RubberWasher

Let's see how we can create the preceding tables in SAP HANA:

  1. In SAP HANA Studio, right-click on your schema (here, HANA_DEMO) and click on Open SQL Console.

  2. Run the following SQL statements to create the tables:

    Create a schema first if one has not already been created for you. We use HANA_DEMO here, but you can choose any name.

    A database schema is the skeleton structure that represents the logical view of the entire database (objects such as tables, views, and stored procedures). It defines how the data is organized, how the objects relate to each other, and which constraints apply to the data. A table is one of the objects contained in a schema. It is a set of data elements (values) organized as vertical columns (which are identified by their names) and horizontal rows:

    CREATE SCHEMA "HANA_DEMO";
    -- If you do not run this GRANT, activating your views later will give you errors.
    GRANT SELECT ON SCHEMA "HANA_DEMO" TO _SYS_REPO WITH GRANT OPTION;
    

    The following command creates the SALES_FACTS table:

    CREATE  COLUMN TABLE "HANA_DEMO"."SALES_FACTS"(
    "PRODUCT_KEY" INTEGER NOT NULL,
    "REGION_KEY" INTEGER NOT NULL,
    "AMOUNT_SOLD" DECIMAL NOT NULL,
    "QUANTITY_SOLD" INTEGER NOT NULL,
    PRIMARY KEY ("PRODUCT_KEY","REGION_KEY") );
    

    The following command creates the CUSTOMER table:

    CREATE  COLUMN TABLE "HANA_DEMO"."CUSTOMER"(
    "CUSTOMER_KEY" VARCHAR(8) NOT NULL,
    "CUST_LAST_NAME" VARCHAR(100) NULL,
    "CUST_FIRST_NAME" VARCHAR(30) NULL,
    PRIMARY KEY ("CUSTOMER_KEY") );
    

    The following command creates the PRODUCTS table:

    CREATE  COLUMN TABLE "HANA_DEMO"."PRODUCTS" (
    "PRODUCT_KEY" INTEGER NOT NULL,
    "PRODUCT_NAME" VARCHAR(50) NULL,
    PRIMARY KEY ("PRODUCT_KEY") );
    

    The following command creates the REGION table:

    CREATE  COLUMN TABLE "HANA_DEMO"."REGION"(
    "REGION_ID" INTEGER NOT NULL,
    "REGION_NAME" VARCHAR(100) NULL,
    "SUB_AREA" VARCHAR(30) NULL,
    PRIMARY KEY ("REGION_ID") );
    

    The following are sample insert statements:

    -- General form: INSERT INTO "<YOUR SCHEMA>"."<TABLE NAME>" VALUES (<value1>, <value2>, ...);
    INSERT INTO "HANA_DEMO"."SALES_FACTS" VALUES (1, 100, 50000, 500);
    INSERT INTO "HANA_DEMO"."PRODUCTS" VALUES (1, 'GasKit');
    -- The key must match REGION_KEY in SALES_FACTS, so it is 100 for Europe.
    INSERT INTO "HANA_DEMO"."REGION" VALUES (100, 'Europe', 'Germany');
    

    Tip

    Only one sample row per table is inserted here. You can re-run the statements with different values, or download the Excel file with demo data from our website.

  3. After executing the scripts, you should have four tables created. If you do not see the tables, right-click on your schema and refresh it.

    In the following screenshot, you can see the tables we just created under the HANA_DEMO schema:

Tip

The _SYS_REPO user needs SELECT rights on your schema.

In the SQL editor, execute the following statement with your schema name in place of <YOUR SCHEMA>. For our schema, this is the second statement:

GRANT SELECT ON SCHEMA <YOUR SCHEMA> TO _SYS_REPO WITH GRANT OPTION;
GRANT SELECT ON SCHEMA HANA_DEMO TO _SYS_REPO WITH GRANT OPTION;

If you miss this step, an error will occur when you activate your views later.
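
The sample rows listed in the tables earlier in this section can be loaded with statements along the following lines. This is only a sketch: it assumes the four tables exist in HANA_DEMO exactly as created above, and it uses UPSERT ... WITH PRIMARY KEY rather than INSERT so that it can be re-run without primary key violations. The keys shown as 01, 02, and 03 are stored as the integers 1, 2, and 3.

```sql
-- SALES_FACTS: one row per product and region.
UPSERT "HANA_DEMO"."SALES_FACTS" VALUES (1, 100, 50000, 500) WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."SALES_FACTS" VALUES (2, 200, 60000, 600) WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."SALES_FACTS" VALUES (3, 300, 20000, 200) WITH PRIMARY KEY;

-- CUSTOMER
UPSERT "HANA_DEMO"."CUSTOMER" VALUES ('C1', 'Mehta', 'Yatin') WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."CUSTOMER" VALUES ('C2', 'Aguirre', 'Tomas') WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."CUSTOMER" VALUES ('C3', 'Huber', 'Ralf') WITH PRIMARY KEY;

-- REGION
UPSERT "HANA_DEMO"."REGION" VALUES (100, 'Europe', 'Germany') WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."REGION" VALUES (200, 'Asia', 'Japan') WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."REGION" VALUES (300, 'US', 'Northfields') WITH PRIMARY KEY;

-- PRODUCTS: the sample data lists only two products.
UPSERT "HANA_DEMO"."PRODUCTS" VALUES (1, 'GasKit') WITH PRIMARY KEY;
UPSERT "HANA_DEMO"."PRODUCTS" VALUES (2, 'RubberWasher') WITH PRIMARY KEY;

-- Quick check of what was loaded.
SELECT * FROM "HANA_DEMO"."SALES_FACTS" ORDER BY "PRODUCT_KEY";
```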

 

The SAP HANA SQLScript


In the following section, we will learn about SAP HANA SQLScript and see the additional capabilities it brings with it.

Why SQLScript?

SQLScript is a collection of extensions to Structured Query Language (SQL). The main motivation for SQLScript is to push data-intensive application logic into the database. This was not done in the classical approach, where the application logic is mostly executed in an application layer.

We have the following extensions for SQLScript:

Extension | Usage
Datatype extension (create/drop type) | This allows the definition of table types without corresponding tables
Procedural extension (create procedure) | This is an imperative construct to push data-intensive logic into the database
Functional extension (create function) | This creates side-effect free scalar or table functions, which can be used to express and encapsulate complex data flows
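
As a minimal sketch of the datatype and functional extensions (the procedural extension is covered in the Procedures section of this chapter), the statements below define a table type and a scalar function. TT_PRODUCT_SALES and FN_NET_AMOUNT are hypothetical names introduced for this illustration, and the HANA_DEMO schema is assumed to exist.

```sql
-- Datatype extension: a table type with no table behind it.
-- TT_PRODUCT_SALES is a hypothetical name.
CREATE TYPE "HANA_DEMO"."TT_PRODUCT_SALES" AS TABLE (
    "PRODUCT_KEY" INTEGER,
    "AMOUNT_SOLD" DECIMAL(15,2) );

-- Functional extension: a side-effect free scalar function.
-- FN_NET_AMOUNT is a hypothetical name; it applies a percentage discount.
CREATE FUNCTION "HANA_DEMO"."FN_NET_AMOUNT" (AMOUNT DECIMAL(15,2), DISCOUNT INTEGER)
RETURNS NET_VALUE DECIMAL(15,2)
LANGUAGE SQLSCRIPT AS
BEGIN
    NET_VALUE = :AMOUNT - (:AMOUNT * :DISCOUNT / 100);
END;

-- The function can then be used like a built-in function in a query.
SELECT "HANA_DEMO"."FN_NET_AMOUNT"(50000, 8) AS NET_AMOUNT FROM DUMMY;

-- Clean-up of the illustration objects.
DROP FUNCTION "HANA_DEMO"."FN_NET_AMOUNT";
DROP TYPE "HANA_DEMO"."TT_PRODUCT_SALES";
```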

How different is SQLScript in SAP HANA from classical SQL queries?

Let's compare SQLScript in SAP HANA with classical SQL queries to find out where they differ, as shown in the following table:

SQLScript in SAP HANA | Classical SQL
Multiple result sets can be returned | A query returns only a single result set
More database intensive: code is executed at the DB layer, which gives better performance | Limited execution at the DB layer, resulting in multiple round trips to and from the database and relatively slow performance
Control logic such as if/else and business logic such as currency conversion can be easily expressed | SQL queries do not have such features
Gives the developer more flexibility to use imperative and declarative logic together | No such flexibility with SQL queries
Supports local variables for intermediate result sets with implicit types | Globally visible views need to be defined even for intermediate result sets or steps
Parameterization of views is possible | Parameterization of views is not possible
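
A few of these points (local table variables with implicit types, and control logic such as if/else) can be sketched with an anonymous block. This assumes an SAP HANA revision that supports DO BEGIN ... END blocks and that the SALES_FACTS table from earlier in the chapter exists; on older revisions, the same body could be placed inside a procedure instead.

```sql
DO BEGIN
    DECLARE total_rows INTEGER;

    -- Local table variable: its column types are derived from the SELECT.
    sales = SELECT "PRODUCT_KEY", "AMOUNT_SOLD"
            FROM "HANA_DEMO"."SALES_FACTS";

    SELECT COUNT(*) INTO total_rows FROM :sales;

    -- Imperative control logic mixed with declarative queries.
    IF :total_rows > 0 THEN
        SELECT * FROM :sales ORDER BY "AMOUNT_SOLD" DESC;
    ELSE
        SELECT 'No sales rows found' AS MESSAGE FROM DUMMY;
    END IF;
END;
```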

The following figure shows you a graphical comparison of the classical approach and the SAP HANA approach:

 

When should we use SQLScript?


SQLScript should be used in cases where the other modeling constructs of SAP HANA, for example, analytic views or attribute views, are not sufficient.

 

Procedures


Procedures are reusable processing blocks implemented in SQLScript. A procedure describes a sequence of operations on data passed as input and on database tables. It can be created as read-only (without side effects) or read-write (with side effects).

Procedures can have multiple input parameters and output parameters (can be scalar or table types).

There are three different ways to create a procedure in HANA:

  • Using the SQL editor (in SAP HANA Studio)

  • Using the Modeler wizard in the modeler perspective (in SAP HANA Studio)

  • Using the SAP HANA XS project in the SAP HANA Development perspective (in SAP HANA Studio), which isn't discussed in this chapter

Creating with the SQL editor (in SAP HANA Studio)

The following syntax is used to create a procedure via the SQL editor:

CREATE PROCEDURE {schema.}name 
            {({IN|OUT|INOUT} 
                        param_name data_type {,...})} 
            {LANGUAGE <LANG>} {SQL SECURITY <MODE>} 
            {READS SQL DATA {WITH RESULT VIEW <view_name>}} AS 
BEGIN 
... 
END

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

The clauses have the following meanings:

  • Reads SQL Data: This defines the procedure as read-only.

  • Language: This specifies the implementation language. SQLScript is the default.

  • With result view: This creates a column view for an output parameter of a table type.
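
As a minimal sketch of how these clauses fit together, the following read-only procedure returns the sales rows at or above a given amount. PROC_MIN_SALES and its parameter names are hypothetical; the sketch assumes the SALES_FACTS table from earlier in the chapter and declares the output table structure inline rather than through a separate table type.

```sql
-- PROC_MIN_SALES is a hypothetical name used only for this illustration.
CREATE PROCEDURE "HANA_DEMO"."PROC_MIN_SALES" (
    IN MIN_AMOUNT DECIMAL,
    OUT OUT_SALES TABLE ("PRODUCT_KEY" INTEGER, "AMOUNT_SOLD" DECIMAL) )
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
READS SQL DATA AS   -- read-only: no INSERT, UPDATE, or DELETE allowed inside
BEGIN
    OUT_SALES = SELECT "PRODUCT_KEY", "AMOUNT_SOLD"
                FROM "HANA_DEMO"."SALES_FACTS"
                WHERE "AMOUNT_SOLD" >= :MIN_AMOUNT;
END;

-- The question mark is a placeholder for the output table parameter.
CALL "HANA_DEMO"."PROC_MIN_SALES" (50000, ?);
```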

Let's create a procedure where we will pass discount as the input parameter and get the sales report as the output parameter. We use the same tables that we created previously:

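-- Assumption: the EU_SALES table type is not defined elsewhere in this chapter,
-- so it is created here with columns matching the final SELECT of the procedure.
CREATE TYPE HANA_DEMO."EU_SALES" AS TABLE (
            AMOUNT_SOLD DECIMAL,
            NET_AMOUNT DECIMAL,
            PRODUCT_NAME VARCHAR(50),
            REGION_NAME VARCHAR(100),
            SUB_AREA VARCHAR(30) );

CREATE PROCEDURE HANA_DEMO."PROC_EU_SALES_REPORT"(
            IN DISCOUNT INTEGER,
            OUT OUTPUT_TABLE HANA_DEMO."EU_SALES" )
LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS
/*********BEGIN PROCEDURE SCRIPT ************/
BEGIN
-- Step 1: attach the region attributes to each sales fact.
Pvar1 = SELECT T1.REGION_NAME, T1.SUB_AREA, T2.PRODUCT_KEY, T2.AMOUNT_SOLD
            FROM HANA_DEMO.REGION AS T1
            INNER JOIN
            HANA_DEMO.SALES_FACTS AS T2
            ON T1.REGION_ID = T2.REGION_KEY;

-- Step 2: attach the product name to the intermediate result.
Pvar2 = SELECT T1.REGION_NAME, T1.SUB_AREA, T1.PRODUCT_KEY, T1.AMOUNT_SOLD, T2.PRODUCT_NAME
            FROM :Pvar1 AS T1
            INNER JOIN
            HANA_DEMO.PRODUCTS AS T2
            ON T1.PRODUCT_KEY = T2.PRODUCT_KEY;

-- Step 3: aggregate and apply the discount (given in percent).
OUTPUT_TABLE = SELECT SUM(AMOUNT_SOLD) AS AMOUNT_SOLD, SUM(AMOUNT_SOLD - (AMOUNT_SOLD * :DISCOUNT/ 100)) AS NET_AMOUNT,
            PRODUCT_NAME, REGION_NAME, SUB_AREA
            FROM :Pvar2 
            GROUP BY PRODUCT_NAME, REGION_NAME, SUB_AREA;
END;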

We can call the previously created procedure with the following CALL statement:

CALL HANA_DEMO."PROC_EU_SALES_REPORT" (8, NULL);

You can see the created procedure below our schema under the Procedure... folder.
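
One way to cross-check the procedure's result is to run the same logic as a single query. The sketch below assumes the table and column names from the CREATE statements earlier in the chapter (SALES_FACTS, REGION with REGION_ID, and PRODUCTS) and fixes the discount at 8 percent. With the sample data, product key 03 has no row in PRODUCTS, so the inner join leaves it out of the report.

```sql
-- Same report as one query, with the discount fixed at 8 percent.
SELECT SUM(S."AMOUNT_SOLD") AS AMOUNT_SOLD,
       SUM(S."AMOUNT_SOLD" - (S."AMOUNT_SOLD" * 8 / 100)) AS NET_AMOUNT,
       P."PRODUCT_NAME",
       R."REGION_NAME",
       R."SUB_AREA"
FROM "HANA_DEMO"."SALES_FACTS" AS S
INNER JOIN "HANA_DEMO"."REGION" AS R
    ON R."REGION_ID" = S."REGION_KEY"
INNER JOIN "HANA_DEMO"."PRODUCTS" AS P
    ON P."PRODUCT_KEY" = S."PRODUCT_KEY"
GROUP BY P."PRODUCT_NAME", R."REGION_NAME", R."SUB_AREA";
```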

Procedure creation using the wizard

Choose the package in which you want to create the procedure and right-click on it.

A new screen will pop up; fill in the details and click on Confirm:

The SQL console opens with default syntax; we need to put our logic in between BEGIN and END.

The following is a sample logic with which I am creating the Procedure:

On the left-hand side of the screen, you can see the output pane:

Click on it and select New…:

Define the columns which we used in the preceding procedure:

Similarly, perform the same steps for input parameters as well:

Now the procedure is ready to be called via the CALL statement.

Once we are familiar with the different views, one question will certainly come to mind: should we use calculation views (not yet discussed) or procedures? We will return to this once we have covered calculation views in Chapter 5, Creating SAP HANA Artifacts – Analytical Privileges and Calculation Views.

 

JOINS in SAP HANA


To address specific business cases and improve execution, SAP HANA introduces some additional JOINS on top of the existing SQL JOINS. These SAP HANA specific JOINS are as follows:

  • Referential JOIN

  • Text JOIN

  • Temporal JOIN

  • Star JOIN

  • Spatial JOIN

Let's see the scenarios when we should consider using these SAP HANA specific JOINS :

Type | Scenario / use case | Remarks
Referential JOIN | Facts with matching dimensions only where referential integrity is ensured. | It's the default join type in SAP HANA. Facts returned are dependent on queried attributes.
Text JOIN | Multi-language table. | Needs a language column. Behaves as the left outer join.
Temporal JOIN | A key date within a validity period. | Acts as a referential join.
Star JOIN | Star schema scenarios. | Needs data organized in a star schema. All attributes and hierarchies are included.
Spatial JOIN | Geospatial data. | Only available in calculation views.
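
These JOINS are configured in the SAP HANA modeling views rather than written as SQL keywords, but two of them can be approximated in plain SQL to show what they do. The sketch below is only an approximation under stated assumptions: PRODUCT_TEXTS, SALES_ORDERS, and PRODUCT_PRICES are hypothetical tables with the columns named in the comments, and the validity period is assumed to include both boundary dates.

```sql
-- Text JOIN, approximated: a left outer join restricted by a language column.
-- Hypothetical table: PRODUCT_TEXTS (PRODUCT_KEY, LANGU, PRODUCT_DESC).
SELECT P."PRODUCT_KEY", T."PRODUCT_DESC"
FROM "HANA_DEMO"."PRODUCTS" AS P
LEFT OUTER JOIN "HANA_DEMO"."PRODUCT_TEXTS" AS T
    ON T."PRODUCT_KEY" = P."PRODUCT_KEY"
   AND T."LANGU" = 'EN';

-- Temporal JOIN, approximated: the key date must fall within the validity period.
-- Hypothetical tables: SALES_ORDERS (PRODUCT_KEY, ORDER_DATE) and
-- PRODUCT_PRICES (PRODUCT_KEY, VALID_FROM, VALID_TO, PRICE).
SELECT O."PRODUCT_KEY", O."ORDER_DATE", PR."PRICE"
FROM "HANA_DEMO"."SALES_ORDERS" AS O
INNER JOIN "HANA_DEMO"."PRODUCT_PRICES" AS PR
    ON PR."PRODUCT_KEY" = O."PRODUCT_KEY"
   AND O."ORDER_DATE" BETWEEN PR."VALID_FROM" AND PR."VALID_TO";
```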

 

Unions versus JOINS


Unions are used to combine the result set of two or more SELECT statements. It's always tempting to JOIN two analytic views when measures from more than one table are required. This should, however, be avoided for performance reasons. It is more beneficial to use a Union in a calculation view. Technically, a Union is not a JOIN type.

Points to remember:

  • Union is not supported in attribute or analytic views; it can only be used in calculation views.

  • Union with constant values is supported within graphical calculation views, and the Union operator can accept 1..N input sources.

  • Script-based calculation views can only accept two input sources at a given time.

  • Do not JOIN analytic views (to be discussed later), as you might have performance issues. Instead, use Union with constant values when working with multiple fact tables.
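
The idea behind a Union with constant values can be sketched in plain SQL: each source supplies its own measure and a constant for the measure it does not have, and the combined rows are then aggregated. PLAN_FACTS (PRODUCT_KEY, AMOUNT_PLANNED) is a hypothetical second fact table introduced only for this illustration; in a graphical calculation view, the same mapping would be done in the Union node.

```sql
-- Combine two fact tables without joining them.
-- PLAN_FACTS is a hypothetical table with columns PRODUCT_KEY and AMOUNT_PLANNED.
SELECT "PRODUCT_KEY",
       SUM("AMOUNT_SOLD") AS AMOUNT_SOLD,
       SUM("AMOUNT_PLANNED") AS AMOUNT_PLANNED
FROM (
    -- Actuals: the planned amount is filled with the constant 0.
    SELECT "PRODUCT_KEY", "AMOUNT_SOLD", 0 AS "AMOUNT_PLANNED"
    FROM "HANA_DEMO"."SALES_FACTS"
    UNION ALL
    -- Plan: the sold amount is filled with the constant 0.
    SELECT "PRODUCT_KEY", 0 AS "AMOUNT_SOLD", "AMOUNT_PLANNED"
    FROM "HANA_DEMO"."PLAN_FACTS"
) AS COMBINED
GROUP BY "PRODUCT_KEY";
```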

 

Self-study questions


  1. What are the other JOINS used in classic SQL that are not mentioned in the preceding discussion, and how are they different?

  2. Can you think of use cases where you should use a procedure?

 

Summary


With this chapter, we set the base for the book. It was expected that you already knew these topics, and the chapter refreshed them for you. We started with the basics of SQL and how to use SAP HANA SQL statements. We progressed to creating SQLScript and procedures. Towards the end of the chapter, you learned about the additional JOINS that SAP HANA offers to address specific business scenarios, and we closed the chapter with a discussion of Unions versus JOINS.

In the next chapter, we will cover the approach to SAP HANA data modeling and the dos and don'ts while creating data models. You will also learn which kind of view should be created for different types of information.

About the Author

  • Vinay Singh

    Vinay Singh is a data science manager at BASF, Germany. He has over 12 years' experience in data warehousing and BI. Before joining BASF, he worked with multiple companies/customers, including SAP, Adobe Systems, Freudenberg, and T-Systems, which provided him with a good mix of product development and consulting experience. His other publications include Real-Time Analytics with SAP HANA, published by Packt, Manage Your SAP Projects with SAP Activate, also published by Packt, and Creating and Using Advanced DSOs in SAP BW on SAP HANA, by SAP PRESS. He is a visiting research scholar at the National Central University of Taiwan, and a distinguished speaker at various forums.
