Reader small image

You're reading from  SQL Server 2017 Machine Learning Services with R.

Product typeBook
Published inFeb 2018
Reading LevelIntermediate
PublisherPackt
ISBN-139781787283572
Edition1st Edition
Languages
Right arrow
Authors (2):
Julie Koesmarno
Julie Koesmarno
author image
Julie Koesmarno

Julie Koesmarno is a senior program manager in the Database Systems Business Analytics team, at Microsoft. Currently, she leads big data analytics initiatives, driving business growth and customer success for SQL Server and Azure Data businesses. She has over 10 years of experience in data management, data warehousing, and analytics for multimillion-dollar businesses as a SQL Server developer, a system analyst, and a consultant prior to joining Microsoft. She is passionate about empowering data professionals to drive impacts for customer success and business through insights.
Read more about Julie Koesmarno

Tomaž Kaštrun
Tomaž Kaštrun
author image
Tomaž Kaštrun

Toma Katrun is a SQL Server developer and data scientist with more than 15 years of experience in the fields of business warehousing, development, ETL, database administration, and query tuning. He holds over 15 years of experience in data analysis, data mining, statistical research, and machine learning. He is a Microsoft SQL Server MVP for data platform and has been working with Microsoft SQL Server since version 2000. He is a blogger, author of many articles, a frequent speaker at the community and Microsoft events. He is an avid coffee drinker who is passionate about fixed gear bikes.
Read more about Tomaž Kaštrun

View More author details
Right arrow

R and SQL Server 2016/2017 Features Extended

SQL Server 2016 and 2017 provide a lot of new and improved query performance capabilities, extensibility features, security features, and built-in/native capabilities that are useful for developers, DBAs, and data scientists. These new features and capabilities can be used together with machine learning services in SQL, bringing a powerful data science solution as well as making the life of the developer/data scientist much easier.

This chapter will walk you through a few unique scenarios to show the combined power of R and other built-in capabilities in SQL Server. These scenarios include JSON built-in capabilities to show how we work with IoT data, PolyBase to illustrate beyond relational data sources, and a large amount of data with the ColumnStore index. We will dive into these scenarios and produce data visualization and predictive...

Built-in JSON capabilities

In this scenario, we will use the EMS incidents by month from the official city of Austin open data portal (https://data.austintexas.gov/Public-Safety/EMS-Incidents-by-Month/gjtj-jt2d). The data essentially contains incident counts, broken down by location and priorities for the city of Austin and Travis County incidents, and the percentage of on-time compliance.

The following are the prerequisites to get started:

  1. Download the data from https://data.austintexas.gov/resource/bpws-iwvb.json to a local path, such as C:\Temp\bpws-iwvb.json.
  2. Grant read access to the directory; for example:
Figure 10.1 Granting access to C:\Temp for MS SQL Server
  1. For ease of R visualization, we will use SQL Operations Studio. You can download SQL Ops Studio from: https://docs.microsoft.com/en-us/sql/sql-operations-studio/download.

The following is an excerpt of a JSON...

Accessing external data sources using PolyBase

PolyBase allows your SQL Server instance to access data outside of the server/database using T-SQL. In SQL Server 2016, you can run queries on external data in Hadoop or import data from Azure Blob Storage:

Figure 10.8: PolyBase concept (source: https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-guide)

In this section, we'll use a similar dataset as in the previous section, represented as CSV files in Azure Blob Storage. These CSV files represent the EMS incidents, which will be exposed as an external table in SQL Server. The goal for this walk-through is to understand seasonality and the trending of EMS incidents. We will use R in the SQL Server to do this and view the chart in SQL Operations Studio.

The following are the prerequisites to get started:

  1. The SQL Server instance installed with PolyBase...

High performance using ColumnStore and in memory OLTP

SQL Server 2016 R Services and SQL Server 2017 ML Services provide advanced analytical capabilities that can also be applied to optimized tables. In this section, we will walk you through comparisons on how to use R Services with:

  • A table with a primary key
  • A table with a clustered ColumnStore index
  • A memory-optimized table
  • A memory-optimized table with a clustered ColumnStore index
For more optimization tips and tricks with SQL Server and machine learning, visit:
https://azure.microsoft.com/en-us/blog/optimization-tips-and-tricks-on-azure-sql-server-for-machine-learning-services/

Testing rxLinMod performance on a table with a primary key

The following steps will test...

Summary

SQL Server 2016 and SQL Server 2017 are packed with useful features, from JSON, PolyBase, to high performance features, such as the ColumnStore index and memory-optimized tables. Most of these features are compatible and can enhance your advanced analytics experience. In this chapter, you have learned the steps for creating forecast and seasonal trends of EMS incidents in Austin and Travis County. We used both JSON data source and CSV files in Azure Blob Storage. Finally, you also have learned how to make use of ColumnStore and memory-optimized tables to improve performance of R Services.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
SQL Server 2017 Machine Learning Services with R.
Published in: Feb 2018Publisher: PacktISBN-13: 9781787283572
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (2)

author image
Julie Koesmarno

Julie Koesmarno is a senior program manager in the Database Systems Business Analytics team, at Microsoft. Currently, she leads big data analytics initiatives, driving business growth and customer success for SQL Server and Azure Data businesses. She has over 10 years of experience in data management, data warehousing, and analytics for multimillion-dollar businesses as a SQL Server developer, a system analyst, and a consultant prior to joining Microsoft. She is passionate about empowering data professionals to drive impacts for customer success and business through insights.
Read more about Julie Koesmarno

author image
Tomaž Kaštrun

Toma Katrun is a SQL Server developer and data scientist with more than 15 years of experience in the fields of business warehousing, development, ETL, database administration, and query tuning. He holds over 15 years of experience in data analysis, data mining, statistical research, and machine learning. He is a Microsoft SQL Server MVP for data platform and has been working with Microsoft SQL Server since version 2000. He is a blogger, author of many articles, a frequent speaker at the community and Microsoft events. He is an avid coffee drinker who is passionate about fixed gear bikes.
Read more about Tomaž Kaštrun