Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
SQL Server 2017 Machine Learning Services with R.

You're reading from  SQL Server 2017 Machine Learning Services with R.

Product type Book
Published in Feb 2018
Publisher Packt
ISBN-13 9781787283572
Pages 338 pages
Edition 1st Edition
Languages
Authors (2):
Julie Koesmarno Julie Koesmarno
Profile icon Julie Koesmarno
Tomaž Kaštrun Tomaž Kaštrun
Profile icon Tomaž Kaštrun
View More author details

Table of Contents (12) Chapters

Preface Introduction to R and SQL Server Overview of Microsoft Machine Learning Server and SQL Server Managing Machine Learning Services for SQL Server 2017 and R Data Exploration and Data Visualization RevoScaleR Package Predictive Modeling Operationalizing R Code Deploying, Managing, and Monitoring Database Solutions containing R Code Machine Learning Services with R for DBAs R and SQL Server 2016/2017 Features Extended Other Books You May Enjoy

Deploying, Managing, and Monitoring Database Solutions containing R Code

Operationalizing R code in a SQL Server database means that data scientists/database developers can also leverage productionizing data science solutions as part of Database Lifecycle Management (DLM). This includes the following:

  • Checking in R code as part of a SQL Server database project into a version control
  • Adding the stored procedures for the data science solution as part of SQL Server unit tests
  • Integrating the data science solution into the Continuous Integration/Continuous Delivery (CI/CD) process
  • Monitoring performance of the data science solution in the production on a regular basis

In this chapter, we will be using SQL Server Data Tools (SSDT) in Visual Studio 2017 and Visual Studio Team Services to perform this DLM workflow. However, the underlying concept can be applied to any other CI/CD platform...

Integrating R into the SQL Server Database lifecycle workflow

Earlier in Chapter 7, Operationalizing R Prediction Models, we discussed how to create an R project in Visual Studio 2017. We also talked about integrating R code as part of sp_execute_external_script in SQL Server. Here, we will revisit Visual Studio 2017, specifically in the context of integrating R code in sp_execute_external_script as part of a SQL Server Database Project, and holistically as part of the database lifecycle workflow.

Preparing your environment for the database lifecycle workflow

In this section, we will discuss the stages of the database lifecycle workflow and the tools that we will be using. For each of the stages in the workflow, there will...

Prerequisites for this chapter

Creating the SQL Server database project

In this section, we will walk you through how to create a database project in VS2017.

  1. In VS2017, click on File | New Project.
  2. Choose SQL Server from Installed on the left pane and click on the SQL Server Database Project template...

Using version control

From Visual Studio, we can check in the solution and manage changes in version control. In this specific instance, we are using VSTS to check in. It is assumed that you have created a project in VSTS.

Here are the prerequisites for the rest of this section:

  1. A VSTS project: To set up a new VSTS project, simply go to: https://www.visualstudio.com/team-services/.

The URL for a VSTS project should follow this format:
https://<your account>.visualstudio.com/<VSTS Project>

The VSTS project referred to in this chapter is named SQL Server R Services Book. So, the URL in my case is https://mssqlgirl.visualstudio.com/SQL%20Server%20R%20Services%20Book

  1. The VSTS project is mapped to a local folder.

The local folder here mapped to the project is C:\VSTS\SQL Server R Services Book. Earlier in this chapter, we created the SQL Server database solution...

Setting up continuous integration

The main idea of continuous integration (CI) is to perform builds that are automated based on one or more triggers. One of the triggers to perform a build is a check-in event. Another one could be a scheduled build. Choosing which trigger is appropriate depends on various factors, such as the complexity of the project and the culture of the team. In this section, because the project is small, we are going to automate the build triggered by check-ins. We will also add tests as part of the build.

VSTS is a good platform to automate builds, deployments for testing, and monitoring. In this section, we will configure a build definition and schedule a continuous integration in VSTS.

Ensure that the Visual Studio solution, including the SQL Server database project and the SQL Server Unit Test project, are built successfully.

Figure 8.16 shows the SQL...

Setting up continuous delivery

Continuous delivery aims to ensure that we can deploy good builds to the desired environment. This could mean the UAT environment or the Production environment. In this section, we will use VSTS to implement continuous delivery:

  1. In VSTS, go to the SQL Server R Services Book project.
  2. Navigate to Build and Release | Release from the top menu.
  3. Click on + | New Definition.
  4. Review the Select Template pane. From here, you can choose from many options, including Run Automated Tests from Test Manager. This option is highly recommended for regularly checking the accuracy of your existing model, which will be discussed in the next step of what the manual process entails. For now, let's choose Empty and click on Add.
  5. On the top title where it says All definitions | New Release Definition, click on the pencil icon to edit the name to UAT Release.
  6. Let&apos...

Monitoring the accuracy of the productionized model

In Chapter 6, Predictive Modeling, we discussed a number of predictive modeling examples. The model(s) created is/are based on trained data. In a real-world scenario, new data keeps coming in, for example, online transactions, taxi cab transactions (remember the earlier NYC taxi example), and air flight delay predictions. Therefore, the data model should be checked regularly to ensure that it is still satisfactory and that there is no other better model that could be generated for it. With the latter, a good data scientist would continuously be asking at least four of these questions:

  1. Is there a different algorithm to consider due to changes of the data?

For example, if the current model is using logistic regression (rxLogit), would the decision tree algorithm more accurate (rxDTree) either due to the size or due to changes...

Summary

Visual Studio 2017 is a powerful IDE for data scientists/developers to manage their code, unit testing, and version control. Combined with Visual Studio Team Services, they form a complete toolkit to execute Database Lifecycle Management, which can also be easily adapted to DevOps practices. This chapter describes in detail how you can integrate SQL Server Machine Learning Services with R in SQL Server Database projects, DevOps practices, and CI/CD workflows. Finally, you have also learned how to monitor a Prediction Model accuracy over time.

In the next chapter, we'll discuss how DBAs can also take advantage of Machine Learning Services with R.

lock icon The rest of the chapter is locked
You have been reading a chapter from
SQL Server 2017 Machine Learning Services with R.
Published in: Feb 2018 Publisher: Packt ISBN-13: 9781787283572
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}