Microsoft SQL Server 2012 with Hadoop

eBook: $23.99
Formats: PDF, PacktLib, ePub, and Mobi
save 15%!
Print + free eBook + free PacktLib access to the book: $63.98    Print cover: $39.99
save 6%!
Free Shipping!
UK, US, Europe and selected countries in Asia.
  • Integrate data from unstructured (Hadoop) and structured (SQL Server 2012) sources
  • Configure and install connectors for a bi-directional transfer of data
  • Full of illustrations, diagrams, and tips with clear, step-by-step instructions and practical examples

Book Details

Language : English
Paperback : 96 pages [ 235mm x 191mm ]
Release Date : August 2013
ISBN : 1782177981
ISBN 13 : 9781782177982
Author(s) : Debarchan Sarkar
Topics and Technologies : All Books, Big Data and Business Intelligence, Enterprise Products and Platforms, Enterprise

Table of Contents

Chapter 1: Introduction to Big Data and Hadoop
Chapter 2: Using Sqoop – The SQL Server Hadoop Connector
Chapter 3: Using the Hive ODBC Driver
Chapter 4: Creating a Data Model with SQL Server Analysis Services
Chapter 5: Using Microsoft's Self-Service Business Intelligence Tools
  • Chapter 1: Introduction to Big Data and Hadoop
    • Big Data – what's the big deal?
    • The Apache Hadoop framework
      • HDFS
      • MapReduce
        • NameNode
        • Secondary NameNode
        • DataNode
        • JobTracker
        • TaskTracker
      • Hive
      • Pig
      • Flume
      • Sqoop
      • Oozie
      • HBase
      • Mahout
    • Summary
  • Chapter 2: Using Sqoop – The SQL Server Hadoop Connector
    • The SQL Server-Hadoop Connector
      • Installation prerequisites
        • A Hadoop cluster on Linux
        • Installing and configuring Sqoop
        • Setting up the Microsoft JDBC driver
    • Downloading the SQL Server-Hadoop Connector
    • Installing the SQL Server-Hadoop Connector
    • The Sqoop import tool
      • Importing the tables in Hive
    • The Sqoop export tool
      • Data types
    • Summary
  • Chapter 3: Using the Hive ODBC Driver
    • The Hive ODBC Driver
    • SQL Server Integration Services (SSIS)
      • SSIS as an ETL – extract, transform, and load tool
    • Developing the package
      • Creating the project
      • Creating the Data Flow
      • Creating the source Hive connection
      • Creating the destination SQL connection
      • Creating the Hive source component
      • Creating the SQL destination component
      • Mapping the columns
      • Running the package
    • Summary

            Debarchan Sarkar

            Debarchan Sarkar (@debarchans) is working with Microsoft Escalation services and has written books on SQL Server BI and big data. His total tenure at Microsoft is 6 years, and he was with the SQL Server BI team before diving deep into big data and the Hadoop world. He is an SME in SQL Server Integration Services and is passionate about the present-day Microsoft self-service BI tools and data analysis, especially social-media brand sentiment analysis. Debarchan is from Calcutta, India, and is presently located in Bangalore, India, working in Microsoft's Global Technical Support Center. He owns and maintains his big data learning group on Facebook and has been a speaker at several Microsoft internal and external community events. Apart from his passion for technology, he is interested in visiting new places, listening to music, meeting new people, and learning new things because he is a firm believer that "Known is a drop; the unknown is an ocean".



            What you will learn from this book

            • Use the native Sqoop connector for data movement between SQL Server 2012 and Hadoop
            • Configure and use the Hive ODBC driver to enable any ODBC compliant client to consume Hadoop data
            • Create ETL solutions and automate data movement jobs between SQL Server 2012 and Hadoop using SQL Server Integration Services
            • Provide powerful reporting on the integrated data with just a few clicks using Microsoft self-service BI tools
            • Merge structured and unstructured data in a common warehouse for analysis

            In Detail

            With the explosion of data, the open source Apache Hadoop project is gaining traction, thanks to the huge ecosystem that has arisen around its core components, the Hadoop Distributed File System (HDFS) and MapReduce. Having SQL Server talk to Hadoop has become increasingly important because the two are complementary: petabytes of unstructured data can be stored in Hadoop but may take hours to query, while terabytes of structured data can be stored in SQL Server 2012 and queried in seconds. This creates the need to transfer and integrate data between Hadoop and SQL Server.

            Microsoft SQL Server 2012 with Hadoop is aimed at SQL Server developers. It will quickly show you how to get Hadoop working with SQL Server 2012. Once this is done, the book will focus on how to manage big data with Hadoop and use Hive to query the data. It will also cover topics such as using SQL Server's in-memory features and BI tools with big data.

            Microsoft SQL Server 2012 with Hadoop focuses on data integration techniques between relational (SQL Server 2012) and non-relational (Hadoop) worlds. It will walk you through different tools for the bi-directional movement of data with practical examples.

            You will learn to use open source connectors like Sqoop to import and export data between SQL Server 2012 and Hadoop, and to create ETL solutions with SQL Server Integration Services and the Hive ODBC driver for your data movement projects. Finally, this book will give you a glimpse of present-day self-service BI tools such as Excel and Power View, which can consume Hadoop data and provide powerful insights into it.
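
To give a feel for the import and export described here, the following is a minimal sketch (not taken from the book) that builds Sqoop command lines in Python. The host, database, table, user, and HDFS directory names are hypothetical placeholders, and port 1433 is assumed because it is the SQL Server default. The flags shown (`--connect`, `--username`, `-P`, `--table`, `--target-dir`, `--export-dir`, `--hive-import`) are standard Sqoop options; the exact options the book uses may differ.

```python
import shlex
from typing import List


def _connect_string(host: str, database: str, port: int = 1433) -> str:
    # JDBC URL format used by the Microsoft JDBC driver for SQL Server.
    # Port 1433 is an assumption (the SQL Server default).
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def sqoop_import_args(host: str, database: str, table: str, username: str,
                      target_dir: str, hive_import: bool = False) -> List[str]:
    """Build arguments for copying a SQL Server table into HDFS (or Hive)."""
    args = [
        "sqoop", "import",
        "--connect", _connect_string(host, database),
        "--username", username,
        "-P",                        # prompt for the password at run time
        "--table", table,
        "--target-dir", target_dir,  # HDFS directory that receives the rows
    ]
    if hive_import:
        args.append("--hive-import")  # also load the data into a Hive table
    return args


def sqoop_export_args(host: str, database: str, table: str, username: str,
                      export_dir: str) -> List[str]:
    """Build arguments for copying HDFS files back into a SQL Server table."""
    return [
        "sqoop", "export",
        "--connect", _connect_string(host, database),
        "--username", username,
        "-P",
        "--table", table,            # destination table, which must already exist
        "--export-dir", export_dir,  # HDFS directory holding the data to export
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cmd = sqoop_import_args("sqlhost", "Sales", "Orders", "etl_user",
                            "/data/orders", hive_import=True)
    print(shlex.join(cmd))
```

The sketch only assembles the commands; running them requires a Hadoop cluster with Sqoop and the Microsoft JDBC driver installed, for example by passing the list to `subprocess.run`.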


            This book is a step-by-step tutorial that teaches you to work with big data on SQL Server through practical examples of increasing complexity.

            Who this book is for

            Microsoft SQL Server 2012 with Hadoop is specifically targeted at readers who want to cross-pollinate their Hadoop skills with SQL Server 2012 business intelligence and data analytics. A basic understanding of traditional RDBMS technologies and query processing techniques is essential.
