You're reading from SQL Server 2017 Machine Learning Services with R.

Product type: Book

Published in: Feb 2018

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781787283572

Edition: 1st Edition

Languages: SQL

Tools: R Services, SQL Server

Concepts: Data Analysis

Authors (2):

Julie Koesmarno

Tomaž Kaštrun

Machine Learning Services with R for DBAs

R integration (along with Python integration in SQL Server 2017) offers a wide range of possibilities. It has also widened the group of people (job roles or departments) who use R Services. DBAs (and SysAdmins) stand to gain a lot from this. R and statistics not only give them additional impetus for discovering and gaining insights into their captured data, but might also help them find hidden nuggets that they have missed before. The mixture of different languages (and I am not talking solely about R, but also other languages) brings new abilities to track, capture, and analyze captured data.

One thing is clear: with R (and Python) so close to the database, several people can switch from monitoring tasks to predicting tasks. This literally means...

Gathering relevant data

Gathering data, simple as it might be, is a task that needs to be well crafted. There are a few reasons for that. The first and most important is that we want to gather data in a way that has minimal or zero impact on the production environment. This means that the process of collecting and storing data should not disturb any ongoing process. The second important thing is storage: where and how do you want to store the data, and what is the retention policy for the stored data? At the beginning, this might seem trivial, but over time, storage itself will play an important role. The third, and also very important, thing is which data you want to gather. Of course, we all want to have smart data present, that is, all the data relevant for solving or improving our business processes. But in reality, gathering smart data is neither that difficult...

Exploring and analyzing data

In a similar way, gathering data using event features can give you rich access to a lot of system information. Building on the previous sample, the following demo shows how server measures can be used for advanced statistical analyses, and how they help reduce the amount of different information and pinpoint the relevant measures. A specific database and a stage table will be created:

CREATE DATABASE ServerInfo;
GO
    
USE [ServerInfo]
GO
    
DROP TABLE IF EXISTS server_info;
GO
    
CREATE TABLE [dbo].[server_info]([XE01] [tinyint] NULL, [XE02] [tinyint] NULL,
      [XE03] [tinyint] NULL, [XE04] [tinyint] NULL, [XE05] [tinyint] NULL,
      [XE06] [tinyint] NULL, [XE07] [tinyint] NULL, [XE08] [tinyint] NULL,
      [XE09] [tinyint] NULL, [XE10] [tinyint] NULL, [XE11] [tinyint] NULL,
      [XE12] [tinyint] NULL, [XE13] [tinyint...

Creating a baseline and workloads, and replaying

Given the ability to reduce and create new measures that are tailored and adapted to your particular server or environment, we now want to understand how the system behaves with all the other parameters unchanged (in Latin, ceteris paribus). This is the baseline. With the baseline, we establish what is normal, or in other words, what the performance is under normal conditions. A baseline is used for comparing what might be, or seem, abnormal or out of the ordinary. It can also serve as a control group for any future tests (this works especially well when new patches are rolled out or an upgrade of a particular environment/server needs to be performed).
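As a minimal sketch of how such a baseline could be derived, assume a hypothetical table dbo.request_log with one row per database request and a request_time column (neither name comes from the chapter). The query averages the number of requests for each hour of the day over all logged days, which is one possible definition of "normal" to compare later measurements against.

```sql
-- Hypothetical log table (an assumption, not from the chapter):
-- one row per database request.
DROP TABLE IF EXISTS dbo.request_log;
CREATE TABLE dbo.request_log (
    request_id   int IDENTITY(1,1) PRIMARY KEY,
    request_time datetime2(0) NOT NULL
);
GO

-- Baseline: typical number of requests for each hour of the day.
WITH per_day_hour AS (
    -- Count requests per calendar day and hour.
    SELECT CAST(request_time AS date)   AS request_date,
           DATEPART(HOUR, request_time) AS hour_of_day,
           COUNT(*)                     AS requests
    FROM dbo.request_log
    GROUP BY CAST(request_time AS date), DATEPART(HOUR, request_time)
)
SELECT hour_of_day,
       AVG(CAST(requests AS float)) AS avg_requests,  -- the "normal" level
       STDEV(requests)              AS sd_requests    -- spread around it
FROM per_day_hour
GROUP BY hour_of_day
ORDER BY hour_of_day;
```

Note that an hour with no requests on a given day produces no row in per_day_hour, so it does not pull the average down; whether that is acceptable depends on how quiet periods should be treated.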

Over a period of one day (24 hours), a typical corporate baseline would be described as follows, in the form of the number of database requests from users or machines:

When all...

Creating predictions with R - disk usage

Predictions involve spotting any unplanned and unwanted activities or unusual system behavior, especially when compared to the baseline. In this manner, raising a red flag results in fewer false positives.

In addition, we always come across disk-size problems. To address them, we will demo database growth, store the data, and then run predictions against the collected data so that, in the end, we can predict when a DBA can expect disk space problems.
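One simple way to turn collected size measurements into such a prediction is a straight-line trend. The sketch below is only an illustration under stated assumptions: the table dbo.disk_usage_log, its columns, and the capacity value are hypothetical, and an ordinary least-squares line computed in plain T-SQL stands in for a proper statistical model built in R.

```sql
-- Hypothetical table (an assumption): periodic samples of used space.
DROP TABLE IF EXISTS dbo.disk_usage_log;
CREATE TABLE dbo.disk_usage_log (
    sample_time datetime2(0)  NOT NULL,
    used_mb     decimal(10,2) NOT NULL
);
GO

DECLARE @capacity_mb float = 8.0;  -- hypothetical size limit in MB

WITH obs AS (
    -- x = minutes since the first sample, y = used space in MB.
    SELECT CAST(DATEDIFF(MINUTE, MIN(sample_time) OVER (), sample_time) AS float) AS x,
           CAST(used_mb AS float) AS y
    FROM dbo.disk_usage_log
),
sums AS (
    SELECT COUNT(*)   AS n,
           SUM(x)     AS sx,
           SUM(y)     AS sy,
           SUM(x * y) AS sxy,
           SUM(x * x) AS sxx
    FROM obs
),
slope_calc AS (
    -- Ordinary least-squares slope; NULLIF yields NULL instead of a
    -- divide-by-zero error when there are fewer than two distinct sample times.
    SELECT n, sx, sy,
           (n * sxy - sx * sy) / NULLIF(n * sxx - sx * sx, 0) AS slope
    FROM sums
),
fit AS (
    SELECT slope,
           (sy - slope * sx) / NULLIF(n, 0) AS intercept
    FROM slope_calc
)
SELECT slope     AS mb_per_minute,
       intercept AS intercept_mb,
       -- Minutes after the first sample at which the trend line reaches capacity.
       (@capacity_mb - intercept) / NULLIF(slope, 0) AS full_at_minute
FROM fit;
```

A NULL or negative full_at_minute means the trend is flat, shrinking, or based on too few samples, so no exhaustion point can be estimated from the line. Real growth is rarely linear, which is one reason to move this kind of calculation into R, where richer models are available.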

To illustrate this scenario, I will create a small database of 8 MB with no possibility of growth, along with two tables. One, DataPack_Info_SMALL, will serve as a baseline; the other, DataPack_Info_LARGE, will serve as a so-called everyday log, where everything will be stored for unexpected cases or undesired behavior.

First, create a database...

Summary

Using R in SQL Server for DBA tasks, as we have seen here, is not always about hardcore statistics or predictive analytics; it might also be about some simple statistical understanding of the connections and relationships between the attributes of queries, gathered statistics, and indexes. Prognosing and predicting, for example, from execution-plan information in order to better understand a query or to cover a missing index, is a crucial point. Parameter sniffing or the cardinality estimator would also be great topics to tackle alongside the usual statistics.

But we have seen that predicting events that are usually only monitored can be a huge advantage for a DBA and a very welcome feature for core systems.

With R integrated into SQL Server, such daily, weekly, or monthly tasks can be automated to an extent not used before. And as such,...

Authors (2)

Julie Koesmarno

Julie Koesmarno is a senior program manager in the Database Systems Business Analytics team, at Microsoft. Currently, she leads big data analytics initiatives, driving business growth and customer success for SQL Server and Azure Data businesses. She has over 10 years of experience in data management, data warehousing, and analytics for multimillion-dollar businesses as a SQL Server developer, a system analyst, and a consultant prior to joining Microsoft. She is passionate about empowering data professionals to drive impacts for customer success and business through insights.

Tomaž Kaštrun

Tomaž Kaštrun is a SQL Server developer and data scientist with more than 15 years of experience in the fields of business warehousing, development, ETL, database administration, and query tuning. He also has over 15 years of experience in data analysis, data mining, statistical research, and machine learning. He is a Microsoft SQL Server MVP for data platform and has been working with Microsoft SQL Server since version 2000. He is a blogger, the author of many articles, and a frequent speaker at community and Microsoft events. He is an avid coffee drinker who is passionate about fixed gear bikes.
