Reader small image

You're reading from  Microsoft SQL Server 2012 with Hadoop

Product typeBook
Published inAug 2013
Reading LevelIntermediate
PublisherPackt
ISBN-139781782177982
Edition1st Edition
Languages
Right arrow
Author (1)
Debarchan Sarkar
Debarchan Sarkar
author image
Debarchan Sarkar

Debarchan Sarkar is a Microsoft Data Platform engineer. He specializes in the Microsoft SQL Server Business Intelligence stack. Debarchan is a subject matter expert in SQL Server Integration Services and delves deep into the open source world, specifically the Apache Hadoop framework. He is currently working on a technology called HDInsight, which is Microsoft's distribution of Hadoop on Windows. He has authored various books on SQL Server and Big Data, including Microsoft SQL Server 2012 with Hadoop, Packt Publishing, and Pro Microsoft HDInsight: Hadoop on Windows, Apress. His Twitter handle is @debarchans.
Read more about Debarchan Sarkar

Right arrow

Chapter 4. Creating a Data Model with SQL Server Analysis Services

Apache Hadoop brings flexibility and scale to an extent which is not possible in the traditional RDBMS-based data warehousing systems for example, SQL Server. On top of the core Hadoop, Hive acts as a data warehouse to provide a logical schema over your HDFS data, which allows ad hoc query mechanisms to work with large datasets. However, query execution, performance, and turnaround times are often decisive factors for most common BI implementation scenarios. Query responses from Hadoop and Hive are predominantly batch operations, which are designed and expected to be time consuming and no-way close to real time. But there are scenarios that businesses demand, where they need real-time answers to their queries.

You could import data from Hadoop to SQL Server Analysis Services (SSAS) by using Hive Query Language (HQL). Analysis Services can then take it up from there and provide real-time insights and powerful reporting on the...

Configuring the SQL Linked Server to Hive


Since Hive tables are exposed only through ODBC, there is no direct way to connect an Analysis Services database to Hadoop as Analysis Services can only connect to OLE DB compliant data sources. To address this, the only way is to create a Linked Server in a SQL Server instance using Hive ODBC driver and consume it through OLE DB for ODBC. We will reuse the DSN HadoopOnLinux that we created during our earlier chapters to connect to Hive.

To create the Linked Server, perform the following steps:

  1. Connect to the SQL Server instance using SQL Server Management Studio and expand the Server Objects as shown in the following screenshot:

  2. Right-click on Linked Servers and choose New Linked Server. This should bring up the New Linked Server window as shown in the following screenshot:

  3. You have to fill in the details of the Hive Data source that you would like connect to. In this case, I have named the Linked Server as LINKTOHIVE. You have to choose Microsoft OLE...

Creating an SSAS data model


Once the Linked Server is created on the computer running SQL Server, it is pretty simple to connect Analysis Services to Hive in SQL Server Data Tools. The rest of this chapter will describe the way to import data from a Hive table into a SSAS Multidimensional and Data Mining Project using the view we created by following the steps as mentioned in the previous section.

Perform the following steps to create a SSAS data model:

  1. In SQL Server Data Tools, on the File menu, click on New, and then click on Project. In the New Project dialog box, under Installed Templates, navigate to Business Intelligence | Analysis Services, and then select Analysis Services Multidimensional and Data Mining Project as shown in the following screenshot:

  2. Here, you would need to specify the Name and the Location for your project files. By default, Solution name will be the same as the project name; however, if you want, you can type a different Solution name. I have named my project as Demo...

Summary


In this chapter, you learned how to leverage the Hive ODBC driver to consume Hadoop/Hive from Analysis Services. Since Analysis Services directly do not support ODBC, you have to create a Linked Server to Hive using the OLE DB to ODBC Bridge and in turn consume the Linked Server queries as data source from your Analysis Services Tabular project model. The Hive ODBC driver makes it easy to import data from your Hadoop Hive table into SQL Server Analysis Services multidimensional data models where Business Intelligence tools may be used to view, shape, and analyze the data further.

The next chapter will focus on the self-service BI tools available from Microsoft and how to generate powerful and interactive visualization using them with just a few clicks.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Microsoft SQL Server 2012 with Hadoop
Published in: Aug 2013Publisher: PacktISBN-13: 9781782177982
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Debarchan Sarkar

Debarchan Sarkar is a Microsoft Data Platform engineer. He specializes in the Microsoft SQL Server Business Intelligence stack. Debarchan is a subject matter expert in SQL Server Integration Services and delves deep into the open source world, specifically the Apache Hadoop framework. He is currently working on a technology called HDInsight, which is Microsoft's distribution of Hadoop on Windows. He has authored various books on SQL Server and Big Data, including Microsoft SQL Server 2012 with Hadoop, Packt Publishing, and Pro Microsoft HDInsight: Hadoop on Windows, Apress. His Twitter handle is @debarchans.
Read more about Debarchan Sarkar