In this chapter, we introduce the relational SQL Server products that you can use in Azure to store and process transactional data in a relational format, which is to say data stored in tables and columns. To get started on this discussion, there are a few terms and concepts that are important to understand.
The concepts of platform as a service (PaaS) and infrastructure as a service (IaaS) can be confusing and are used constantly when discussing services that allow you to build software on a cloud service. The fundamental distinction between the two lies in how managed the offering is. A service being managed means that the provider handles some amount of the operation (or management) of the service. When you install an operating system and SQL Server on your on-premises computer, you manage the entire hardware and software infrastructure yourself. This starts with making sure the server is plugged in and everything else moves from there. PaaS and IaaS both indicate managed services because the resources you use will be managed to some extent.
PaaS indicates that you are getting a platform to work with, and more of the management tasks such as software patching, performance tuning, backups, and fault tolerance will be handled by Azure. The goal is to let you focus on providing business value and leave the day-to-day operations to Microsoft. How much of the management is done for you is based on the features of the offering, but as an example, each of the PaaS offerings we will introduce will handle backups of your data without you or your customer thinking too much about it, until you find the need to restore your data.
The IaaS model primarily provides management of the hardware and network. You never need to, and never can, touch any of the physical resources or even access the location of the server. Just like when you get a new computer, an IaaS server may have software pre-installed for you, but once you take over the computer, managing and configuring the software and hardware is up to you. The Azure IaaS platform for SQL Server does include tools to help you automate the management of the software, giving you some of the characteristics of the PaaS model, but these tools will not be as controlled in the IaaS model as they will in a PaaS-model server.
A managed database service does not indicate that the Azure platform will change the meaning of any data or code you have written; in fact, it's quite the opposite. This is still part of what your organization needs to do. No changes will be made to your system that change the meaning of the structures you create. You will simply need to be less concerned with day-to-day processes that are common to pretty much every organization.
This book largely focuses on the IaaS offering using SQL Server on Azure Virtual Machines, though we will introduce the PaaS offerings for contrast. The rest of this chapter will introduce the Azure SQL offerings, the newest features in SQL Server 2019, and the value of using SQL Server in its IaaS configuration.
The Azure SQL portfolio
Microsoft Azure SQL is a modern SQL portfolio of offerings for storing relational data as a service. It is powered by the industry-leading SQL Server engine, which has evolved greatly over the years, retaining backward compatibility at the code level and continuing to provide monumental leaps in performance and storage capacity version over version. Some of the Azure SQL offerings are evergreen, meaning the offering is always up to date with the latest updates and patches. Because each offering is essentially based on the same SQL Server code, database administrators (DBAs) and developers can often use the tools and resources they are already familiar with from their past SQL Server experience, including graphical, command-line, and T-SQL–based tools, for much of the work they need to do.
For many organizations, having to build, house, and manage hardware and software is a large burden for a variety of reasons, but in most cases, cost and security are the most important. Beyond the easily quantified cost of purchasing hardware and software, there are costs in finding qualified persons to manage the hardware, operating systems (OSes), and database platform, all before considering the day-to-day operations such as tuning queries and executing backups.
Using the Azure platform, upgrades to VM type and size can be done by a simple UI operation rather than buying new hardware, configuring it, and migrating all of your data onto it (that process may still occur, but your experience is often checking a box or sliding a slider on a web page and letting the automation do the work for you.)
The second reason is the most important: security. What all the data breaches in recent history have demonstrated is that most databases are accessible from the internet in some manner. Having the management and security of your infrastructure in the hands of a company such as Microsoft pushes the technology burdens of a very large part of securing your data onto them. You can take comfort in the fact that the entire Azure business model rests on the security of all its customers' data, including yours. It will still be your responsibility to build proper security principals with adequate passwords and two-factor authentication, as no security will stop a user with proper credentials from accessing your online resources.
Azure has multiple other database management systems in the Data Platform portfolio for different types of database needs, including Cosmos DB1, and Synapse Analytics2, to name a few. Here's a full list of current products in the Azure family of services3. Note that Synapse Analytics uses relational tables, but it is focused on large-scale, specialized analytics. This chapter focuses solely on the relational SQL Server–based offerings (Azure also has relational database offerings such as MySQL4, PostgreSQL5, and MariaDB6).
The Azure SQL portfolio provides a consistent and unified management experience spanning three SQL Server offerings in Azure, each with its own targeted use cases. Almost any of the offerings will be perfectly acceptable to provide support for transaction processing (commonly referred to as online transaction processing (OLTP)) as well as most analytics (reporting) scenarios. Each offering is positioned to provide different levels of service, as we will discuss. The three offerings we will discuss specifically are:
- SQL Server on Azure Virtual Machines.
- Azure SQL Managed Instance.
- Azure SQL Database.
There is one additional method of deployment using containers that we will not be specifically covering in this book. The container method is very similar in functionality to the VM deployment, except the VM is replaced by a lightweight, GUI-free container running Linux or Windows using Kubernetes or Docker.
In the following sections, we will introduce each of these offerings to provide you with an overview of their strengths and ideal usages. Each of the offerings provides you with the same SQL Server relational engine internals for storing and querying data using T-SQL. Each will have the same data manipulation language (DML), with only minor differences in data definition language (DDL) due to physical implementation differences. While some of the management tools and methods supported by each platform are different, the primary difference is based on how managed the service is.
SQL Server on Azure Virtual Machines
SQL Server on Azure Virtual Machines7 (or Azure SQL VMs for short) and indeed any of the Azure Virtual Machines offerings are considered IaaS. This is because Microsoft manages the hardware infrastructure, but you manage the software. As the DBA managing the server, it is generally no different than managing SQL Server on a computer that resides in your own server room.
When you create an Azure SQL VM, you are given the opportunity to use a pre-built VM image that has a supported version of SQL Server pre-installed, or you can choose to bring your own media to install from. There are licensing differences and benefits to both models, but we will not even begin to try to cover licensing in this book. Here are some more insights offered by Microsoft on Azure VM licensing8.
Whether you use a pre-built image or bring your own software, the VM can take advantage of some automation by using the SQL Server IaaS Agent Extension (not currently available on an Azure VM running Linux at the time of this book's publishing), which provides automated backup and patching capabilities, as well as configuration assistance with Azure Key Vault integration to store encryption keys outside of SQL Server. You are also fully able to use any method you wish for these tasks, including SQL Server Maintenance Plans or even third-party backup scripts and tools. Some additional tooling may be necessary in any case, because backups are just part of the regular upkeep needed for a healthy database that is even lightly used. The Agent Extension (along with several other features) is enabled automatically when using a pre-built image, or by registering9 your VM with the SQL VM resource provider.
It is important to catch the distinction between automation and a managed service. Automation provides tools that you can use to make managing your server easier. With the PaaS model of the next two Azure offerings, you don't need to monitor to see whether backups have failed, nor do you even need to do anything to ensure that your server is backed up. The Azure platform management system backs up your server based on the settings you choose (and you can't even accidentally choose not to back up at all either). With IaaS, only the hardware is truly managed by Microsoft. It is your responsibility to back up your databases and make sure those backups can be restored, even when using the SQL Server IaaS Agent Extension.
Managing and supporting are two different concepts. A supported service means the host will help you if the software is not working properly. A managed service will have the host in charge of making sure things work properly based on your configuration.
An SQL Server VM gives you a highly compatible method to lift and shift many workload types to the cloud. This includes transactional workloads capturing customer orders or business intelligence workloads using analytics features such as Machine Learning Services, Reporting Services, Analysis Services, and so on. This is because the Azure VM presents itself as very much the same as your on-premises hardware, the only real difference being how you configure networking and security over the internet to work with your local security infrastructure. For SQL servers that use SQL authentication, the application will require little, if any, change, but using Active Directory will require some configuration. (Using SQL-based authentication is not considered as good as using Active Directory integration for many reasons. Chapter 3 will cover security, including integrating with your existing Active Directory.)
One major choice you have, beginning with SQL Server 2017 and continuing into SQL Server 2019, is which OS to choose. Beyond Windows Server, SQL Server will run on Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server (SLES), and Ubuntu. This allows SQL Server to be run on any OS used by an organization that's heavily invested in open-source software (OSS), while still providing virtually all the database features of the Windows version.
Before we dig deeper into the topic surrounding OS choice for virtual machines, there are still two configurations of SQL in Azure that we want to cover, because they offer specific benefits that the IaaS platform may not.
Azure SQL Managed Instance
The Azure SQL Managed Instance deployment option is a PaaS database offering that sits right in the middle of the Azure SQL offerings in terms of management and is one of two Azure SQL PaaS offerings. It targets scenarios where the customer needs much of the rich functionality of the full SQL Server product but desires the value of the platform-managed services model as well.
One of the biggest differences between the managed instance model and the Azure SQL VM model is that a managed instance will always be on the most up-to-date version of SQL Server (a database can be set to an earlier compatibility level if needed). This means, at the time of this book's writing in early 2020, managed instance–hosted databases provide the user with at least everything that SQL Server 2019 has to offer, up to the latest public cumulative update (CU), including all the new and improved features in SQL Server 2019 (we will provide an overview later in this chapter). Soon after the next version of SQL Server is released, managed instances will be upgraded as part of the offering.
While it is extremely rare that Microsoft removes features from SQL Server (in fact, backward compatibility is an important part of their story), there are frequent changes in how the engine optimizes queries, so be sure to keep up with performance changes in your application. This process of determining and mitigating performance issues after releases is made all the easier by the Query Store feature10.
The managed instance model will automatically maintain backups11 of your data, restorable to a point in time, based on the level of retention you configure. It safeguards your backups from disaster by using read-access geo-redundant storage (RA-GRS).
While backups are part of the managed instance package, geo-replication (that is, maintaining active copies of your data in multiple regions in case of catastrophic failure) is not. The managed instance option provides you with Auto-Failover Groups12 you can configure like in SQL Server 2019. This lets you configure your server to fail over to a different Azure datacenter if desired.
In many ways, a managed instance looks and behaves just like a typical SQL Server, particularly when dealing with DML code, instance security, and tools such as SQL Agent. (SQL Agent is limited in the types of jobs it can schedule, due to a lack of OS and file system access.) There are important considerations, such as the inability to restore a database from a managed instance to a local or VM SQL Server instance or to use SQL Server Integration Services (SSIS) in the same way as you do with a typical instance. (SSIS projects will work using Azure Data Factory13, but the infrastructure of the SSIS DB does not exist natively in the managed instance itself. For more details, you should refer to Microsoft's guide to migrating SSIS packages to Azure SQL Managed Instance14. Any code that needs access to the file system, such as for processing files for import or export, will need to be modified to work with Azure Blob storage15 instead.
Managed instances provide a lot of parity with an up-to-date version of SQL Server, providing a way to lift and shift many workloads to the cloud with a reasonable amount of work. However, there are limitations, such as always being on the most recent version of SQL Server, a lack of access to the file system, and not being able to restore to a local SQL Server instance that may make this untenable for many scenarios, leading to using the IaaS VM offering.
In the next section, we will look at one more Azure SQL offering that goes deeper into the managed aspect of PaaS offerings, even beyond what you have in a managed instance. It may require more adjustment from an existing on-premises environment but offers conveniences in management that leave you to mostly design and develop tables and code.
Azure SQL Database
Azure SQL Database16 (or SQL Database for short) is the fully managed version of SQL Server, falling deeper into the category of PaaS than the managed instance model. Using SQL Database, you are provided with a database container (or containers, in more complex configurations) in which to create tables and coded objects. The T-SQL features of SQL Database may actually be more advanced than what you get with the other offerings, as they often release features to SQL Database first.
There are many configurations to choose from in terms of performance and size, starting small and scaling up to very large database sizes. In fact, using Hyperscale17 for SQL Database, this platform's offerings can currently support up to 100 TB of data, so the amount of space and computing power is not generally a limiting factor.
The SQL Database offering is tailored to cloud applications with the least amount of legacy management dependencies. This does not mean that your configuration must be one simple database, however, and at the time of writing this book, there are three options for how you can set up your SQL Database environment:
- Single database: A single database that can be used to store data. This will feel like an SQL Server instance to the user/administrator, with access to a TempDB and master database. If you have connected to a contained database in SQL Server, it is conceptually similar except the boundaries are far more firm. Compute resources are available via pre-provisioned or serverless options18, ensuring sufficient resources for both consistent and highly unpredictable workloads.
- Elastic Pool: Multiple independent single-database configurations that share the same set of computing resources. This allows you to have low-use and high-use pattern databases in the same pool, where both databases can use up to the maximum resources when needed, but not pay for large amounts of resources dedicated to one database when it is only rarely needed. This is particularly effective when the databases see heavy utilization at different times of the day.
- Database server: A group of single databases and elastic pools, banded together for administrative purposes, for things such as networking, security principals (logins), policies, and so on.
As the consumer, you are getting the SQL Server database as a platform to store data.
Some aspects of the table structures you create on Azure SQL Database may need to be different from the other offerings. One key difference is that every table will require a clustered index to enable the use of replication of your data to redundant storage. (Using clustered indexes on all tables is considered a best practice in most cases anyhow.) You have no control over where your data is located (outside of the region of the world), or the computer the data is located on. This is all done for you.
Just like managed instances, backups are part of the managed package, but additionally, high availability via geo-replication is also supported as part of the management of your database, with very little configuration. Microsoft's guide offers more insights on active geo-replication19.
A feature that is specific to SQL Database that is particularly useful is automated indexing20. Using the automatic index management feature, SQL Database can apply a
DROP INDEX automatically based on what the optimizer has recognized as a given need, then monitor that change to see whether it has helped or harmed performance and adjust accordingly.
The SQL Database offering provides the most management for you but may not exactly match the needs of an organization with a well-established working system in an SQL Server instance.
Now that we have covered these three offerings, in the next section, we will outline some of the important differences between them.
SQL Server in Azure comparisons
All three of the Azure SQL offerings that we have introduced in this section are based on the same SQL Server 2019 database engine, but there are key differences between them. In the following table, we list some of the key differences to remember when considering which option to choose:
Figure 1.1: Azure SQL offerings
In the next section, we will highlight the key new features that were included in SQL Server 2019 that make it a worthwhile upgrade from earlier versions (including from the most recent version before 2019: SQL Server 2017).
SQL Server 2019 highlights
SQL Server 2019 has a lot of new and enhanced features that make it not only a world-class relational database engine, but a world-class data platform. In this section, we will be taking a brief look at what SQL Server 2019 adds on from previous versions, as well as the features that have been added to running SQL Server on a Linux-based platform in SQL Server 2019.
Intelligence over all of your data
Intelligence over all of your data is a phrase that you can find on multiple Microsoft websites describing an important SQL Server feature for integrating your disparate data sources: PolyBase. PolyBase23 is a feature that can be used to virtualize data from external sources including Azure Cosmos DB, Azure Blob Storage, and starting in SQL Server 2019, data in SQL Server, Oracle, Teradata, and MongoDB. Once connected and configured, you can query this data just like any normal relational table (as well as joined with your local relational data).
Additionally, SQL Server 2019 introduces Big Data Clusters24, providing scale-out capabilities with clusters of SQL Server, Apache Spark, and Hadoop Distributed File System (HDFS) data, allowing reading and writing of large quantities of data stored in SQL Server or big data sources.
Both features allow you to bring relational data and big data together to provide a uniform platform for data processing. This allows you to use tools such as Machine Learning Language Extensions, AI with Machine Learning Services, Reporting Services, and even SQL Server's primary language T-SQL with data from very differently formatted data structures, without doing any copying or transformation of data.
Enhancements in developer experience
SQL Server 2019 includes several important improvements to the developer experience that help to make developing data-based solutions easier:
- UTF-8 support: Allows you to store Unicode data in the
varchardata types via new collation support. This allows you to store data in the very popular Unicode encoding format natively, not requiring translation into the UTF-16 standard that is used in
nvarcharcolumns. For more information, refer to the Microsoft documentation25 that covers collation and Unicode support in SQL Server. For an in-depth commentary on UTF-8 in SQL Server, refer to the enlightening blog26 by Pedro Lopez.
- Machine Learning Services: Machine Learning Services27 enables R and Python support, allowing T-SQL to employ machine learning models where your data lives. It allows processing data at the partition level, rather than only at the object level, enabling parallel processing. Additionally, Machine Learning Services can now be used with Failover Clusters.
- SQL Server Language Extensions and the Extensibility Framework: Improvements made to allow additional languages to be run in a very similar manner to R and Python, which are used by Machine Learning Services. This gives developers more choices for running established code right in the SQL Server engine. Currently, only Java is supported, but more languages will follow. For more details, check out Microsoft's overview of language extensions28.
- Lightweight query profiling: Improvements to live query plan gathering29 to make it cheaper to get statistics and progress on currently executing SQL statements.
- SQL Graph enhancements: SQL Server graph database capabilities30 allow for the creation of nodes and edges (many-to-many relationships), which are often needed for applications where a more traditional relational schema is too rigid and complex to query. Improvements now allow edge constraints (foreign keys on edges) and query improvements to query for nodes that are multiple edges away from one another.
SQL Server has been a leader in performance for years, with great backward compatibility at the code level, and tremendous leaps in performance in every new version. SQL Server 2019 brings with it even more improvements to performance, many built on enhancements from recent releases. For example, memory-optimized tables can bring tremendous performance improvements when storing data, and in SQL Server 2019 that feature has been leveraged to enhance the metadata for TempDB.
In this section, we will briefly look at some of the performance enhancements in SQL Server 2019:
- Enhancements to Intelligent Query Processing (IQP): IQP31 is a family of loosely connected technologies designed to improve your SQL Server, many without any changes to your code. Several new methods of improving performance, such as expanding batch mode to include rowstore structures and inlining scalar functions, have been added.
- Accelerated Database Recovery (ADR): ADR32 dramatically reduces the time required to return control to the user in a rollback of a large data change/recovery on restart, doing much of the work asynchronously.
- Hybrid Buffer Pool: Hybrid Buffer Pool33 provides support for persistent memory modules (PMEMs), which allows the engine to use PMEMs in the classic roles typically played by RAM and the data file, eliminating the need to checkpoint data from memory, greatly enhancing performance for scenarios where very high performance is required.
- Memory-Optimized TempDB Metadata: The metadata for the TempDB34 database can be altered to use in-memory data structures, removing bottlenecks that occur when rapidly and concurrently creating a large number of temporary tables.
Security is (or at least should be) one of the most important concerns for any data engineer/administrator. SQL Server 2019 introduces several important security improvements, building on improvements in recent versions of SQL Server, such as Always Encrypted35, row-level security36, dynamic data masking37, transparent data encryption38, and far more39 than we will cover in this book.
Changes in SQL Server 2019 to security include:
- Always Encrypted with secure enclaves: An enhanced form of Always Encrypted40 that allows computations/searches to occur on the server side on encrypted string data, but still never shares the plaintext with the user or administrator without the required key.
- Data classification and auditing: This starts with an SQL Server Management Studio (SSMS) tool to help locate, classify, and tag sensitive data that may need to be handled specially. Next, to know when sensitive data is being accessed, SQL Server Audit41 includes a new field in its output that indicates the data sensitivity of data that is included in the audit output. For more details, refer to this Microsoft guide42 on data discovery and classification.
- Simplified certificate management: Certificate management43 is integrated with SQL Server Configuration Manager.
High Availability/Disaster Recovery (HADR)
As your demand for around-the-clock access to data increases, the following features are designed to make usage of SQL Server even more possible during maintenance and during a failover to a different server:
- Index maintenance enhancements: SQL Server 2019 adds to the types of indexes that can be rebuilt online to include clustered columnstore indexes, as well as allowing you to pause and resume rowstore index rebuilds. For more details, check the information44 provided by Microsoft.
- Availability group enhancements: Availability groups45 are an HADR feature that was first introduced in SQL Server 2012. They allow you to maintain copies of your database in a different location to fail over to when there is a failure in your primary database/server (as well as other uses). SQL Server 2019 increases the number of synchronous secondary replicas from three to five. Automatic client redirection46 has been added, so clients can fail over without changing the connection string. Additionally, there have been licensing improvements47 for Software Assurance customers pertaining to HADR scenarios.
Platform of choice
SQL Server has been around for over 20 years on Windows, but in SQL Server 2017, the platform choices grew to include Linux. SQL Server 2017 on Linux included most of the primary features of a relational engine that customers needed, but not all of them. SQL Server 2019 adds most of the features that were missing in SQL Server 2017.
Features added for SQL Server 2019 on Linux include:
- Replication: Data is allowed to be copied automatically between databases on the same or different instance (including Windows instances). Transaction and snapshot replication is now in Linux.
- Distributed Transactions: Enables transactions that extend beyond the confines of the instance.
- Change Data Capture: Maintains a history of changes to data in a database, commonly for processes duplicating data where replication doesn't make sense.
- Extended Active Directory Support: Adds support for third-party Active Directory integrations.
- Machine Learning/Language Extensions: Adds the ability to run R, Python, and Java inside the SQL Server engine.
- PolyBase: Ability to query external data and leverage data virtualization using T-SQL, as described earlier.
For a complete list of improvements to the Linux version of SQL Server with links to more details, check out the information at the Microsoft documentation48.
Beyond Linux on a VM or on-premises server, improvements have been made when installing SQL Server on a container. Azure Container Registry49 provides a location to manage containers for Docker and Open Container Initiative images.
SQL Server is also now available50 on Red Hat Enterprise Linux 8, as well as using Red Hat Universal Image Containers.
Finally, when using containers, SQL Server 2019 does not need to be started as a root container51 in Linux, providing a more secure experience.
SQL Server IaaS scenarios and use cases
With all the choices for how to deploy and implement SQL Server for your organization, both on-premises as well as through the Azure SQL offerings, why choose SQL Server on an Azure VM? While the more managed versions of SQL Server provide a lot of benefits, they have downsides if you are heavily invested in SQL Server on-premises because they can require you to change your infrastructure considerably. (Even the managed instance offering may be too restrictive for many organization's needs.) The IaaS model allows you to use the entire SQL Server 2019 feature set in a way that will work mostly as your on-premises model has for years.
The IaaS model is natural to existing DBAs while reducing (or even eliminating) the need to house and manage server hardware in your organization's premises. There is also licensing value as well, because SQL Server VMs can be adjusted in power and storage relatively easily, as well as licensed to pay as you go, allowing you to start and stop as you desire and incurring far smaller costs when VMs are not running.
In this section, we will look at three use cases for SQL Server 2019 in an Azure VM that will immediately benefit your organization, with the least friction with what your staff already know:
- Lift and Shift: Keeping your applications pretty much as is, moving to Azure.
- Extending On-premises Environment to the Cloud: When you need to add a server to your environment and do not wish to add more hardware to your estate.
- Development and Test Environments: Developing new code and testing it, perhaps in the latest SQL Server version or a different edition (for example, if you are on standard edition and want to see what the effect of using Enterprise Edition would be).
These scenarios are ones that you are likely to use IaaS for. Not for radical change to what you currently do, but rather to make use of the knowledge and skill set of existing database human resources, all while reducing the need to manage and maintain hardware on your premises, either for long-term or short-term utilization.
Lift and Shift
For many organizations, keeping hardware up to date is a daunting task. You purchase a server one year, and it becomes outdated very quickly. Three or four years later, when the hardware is very outdated, you finally have the budget to upgrade the hardware. Then, you can perform the upgrade, including creating servers, moving data around, and so on. All of this is a time-consuming, costly process, even if you are already virtualizing your server resources on-premises. At the same time, moving your environment to a fully managed service, even a managed server, can be complex or even not possible, depending on your requirements. (For example, if you need to use the file system or other executables on the same server, it is not possible in a managed server or SQL Database.)
In a lift-and-shift operation, you get one or more servers that you can configure exactly as your on-premises server were configured. The same drives, directories, services, and so on, even including third-party executables.
What you don't have to worry about is the scalability of your server or how to keep up with the latest hardware. An Azure SQL VM can be sized as needed, to any level of hardware you need, and resized52 if needs change.
Now the burden of managing the hardware lies with the Microsoft Azure management team, leaving your organization with more time to build and test quality software.
Extending your on-premises environment to the cloud
If moving your entire datacenter to Azure is not something your organization is ready for, that doesn't mean that SQL Server on an Azure VM is not useful. There are a few ways you might extend your environment using Azure VMs.
A primary way would be as a disaster recovery site. You can have a reasonably low power server that you copy your data to, perhaps utilizing Always On availability groups53 using a hybrid on-premises and cloud configuration. If the server needs to recover from a disaster, you ramp up the resources, point clients to the Azure VM instead of the local server, and you are ready to go.
Another common use is when you need an extra server to meet a need (possibly even a short-term need), but do not have the inclination to procure and configure a server and permanent SQL Server licenses. Once you have the VPN gateway54 configured, adding a VM-based server in Azure is very much like adding one locally. The main difference is that Microsoft does the heavy lifting of setup for you; you just determine how much power you need (and adjust as necessary). And since SQL Server licensing can be built into charges, if the need is temporary, you do not need to buy a full-price license for SQL Server. Getting rid of a server you no longer need is now simply a matter of clicking delete on the portal (and verifying that you do actually want to delete the server, naturally).
Development and test environments
The last use case we will present is about using Azure SQL VMs for developing and testing new software. Creating a new SQL Server VM can be done in a very short amount of time, and subsequent spin-up time can be further shortened by using templates to configure everything as you need it.
Developers can get a new VM with an SQL Server Developer Edition license, then quickly load executables and test data on it. Once done with testing, they can shut down the server and only pay for storage costs. If you need to start over from scratch, it is easy to delete the server and start the process over with a fresh install.
In the next section, we will look at how we can choose the best OS for your SQL Server VM.
Choosing an OS for SQL Server in Azure VMs
When you create a new SQL Server on an Azure VM, among the first choices you have is which OS to choose. In this section, we will look at some of the thought processes to go through when deciding whether to choose Windows Server or one of the Linux distributions (also referred to as distros) for your SQL Server instance.
SQL Server can run on Windows or Linux, and what might be surprising is that it is the exact same SQL Server engine code base. Of course, Windows and Linux cannot run the same binaries natively, so Microsoft built the SQL Platform Abstraction Layer (SQLPAL), which makes this a reality. For more details on SQLPAL and how it is used, the blog55 by the SQL Team offers interesting insights.
While SQL Server on Windows and Linux have the same code base, there are differences to be understood. In this section, we will look at the reasons for choosing either Windows or Linux, and then discuss the differences between the two implementations.
Reasons to choose either Windows or Linux for SQL Server
This book's focus is on using Microsoft SQL Server on an IaaS platform, and the fact that SQL Server runs on either OS may sound fishy to you. You may be suspicious that they want to get you on Linux and then lure you to Windows over time. It is important to realize that this is not the case: they are both the same SQL Server product and are equally supported.
The primary deciding factor on which OS to use is actually based on the comfort level of the company that will be using it. While there are a few differences in feature sets, there is no compromise in quality or performance with either choice. Much like the decision process in choosing between the IaaS model or the PaaS model, the choice to use Linux for the server OS is more a matter of your organization's needs, rather than being based upon any objective benefit.
If an organization is not comfortable with Windows, it is going to be harder to implement SQL Server on Windows, even using a pre-built image. If you are comfortable with Linux, the process to install and configure SQL Server is more like what you are used to when installing Linux software. As we will see in the following section, while the functionality to the user is going to be almost exactly the same on either OS, the differences in how you configure SQL Server are significant.
One thing to note is that Open-source Software (OSS) developers often prefer Linux because they can install and run additional OSS packages in the same environment with their database server as part of a solution. These packages may or may not run on the Windows platform. SQL Server on Linux lets them reap the benefits of SQL Server's mature data storage engine to build applications that run faster, are more secure, have tight integration with machine learning, and so on.
The goal of SQL Server on Linux is not to kill the Windows version, but to broaden the audience, making the product speak the implementation language of a new audience while providing the same SQL Server T-SQL language to all.
The only reason that you may not be able to feasibly choose Linux for your OS is that some features are not yet supported (for example, Merge Replication, or having multiple instances on a single server), some that will likely never be (such as
FileTable, both of which interface tightly with the Windows OS), and others that will require new binaries to be created (or another abstraction layer), such as Analysis Services. Any need for these features, even something like Reporting Services, which runs as a separate executable, would currently require an entire extra license for SQL Server to run on a separate Windows Server, which could be cost-prohibitive.
The final issue to bear in mind when choosing an OS for SQL Server is cost. Cost is a very complicated discussion because costs come from many different places, some more obvious than others. There are obvious cost differences you can compare empirically, such as the hourly rate of running a VM on Windows versus one with Linux. However, if you must hire new staff and train them on how Windows works, or pay consultants every time something isn't working, the costs may be prohibitive. The same can be said about Linux.
In the end, the most compelling reasons to choose between Linux and Windows are to do with your comfort level with each OS and whether you need specific features that may not be available on Linux (a list that we noted earlier shrank considerably with SQL Server 2019.)
Differences between SQL Server on Linux and Windows
Beyond the obvious deep differences in how the different OSes behave (even the different versions of Linux have their own ways of doing things), there are some differences between the Windows and Linux versions of the product.
In this section, we are going to highlight some of the differences between SQL Server on Linux and Windows, whether you use a pre-built SQL Server VM image from Azure or install SQL Server on your own on-premises computers. The following table contains a list of key differences between SQL Server on Windows and Linux:
Figure 1.2: Differences between SQL Server on Linux and Windows
A few of these differences warrant a little bit of discussion; most importantly, we'll consider the way you install and configure SQL Server and some feature-set limitations of SQL Server on Linux.
While the experience of the typical user employing T-SQL or using an application will be very nearly 100% the same, there are some major differences between SQL Server 2019 on Linux compared to running it on Windows. Obviously, the biggest difference is that in Windows, most SQL Server DBAs will be used to working with a GUI rather than the CLI. Hence, changing from Windows to Linux can be very a large paradigm shift, particularly during installation but also even when choosing where to locate database files. However, if you are used to Linux, the method of server and instance installation/configuration should generally be obvious to you; the same goes for how the file system works.
While it is true that there is a version of Windows that you manage mainly from the command line (Windows Core) that SQL Server can execute on, it is not typically used because Windows administrators are generally used to managing the server via the GUI (in the same manner as their Windows and even Macintosh computers that they regularly use). Even then, however, the commands to install on either platform differ in that the Core SQL Server installation is one executable with many parameters, rather than requiring multiple commands to install different features.
When you install the Linux image with SQL Server the first time, it will be necessary to access the server via what is basically a command terminal. Features are added and configured individually using command-line tools instead of
setup.exe. Once you have the server installed, you can access the SQL Server instance on the Linux computer using SSMS, Azure Data Studio, SQL Server Configuration Manager, and the other GUI tools on your Windows computer. If you do wish to run tools on a Linux GUI, SSMS will not work, but Azure Data Studio56 will.
Probably the most compelling reason for choosing Windows over Linux is if you need some of the external services that are not on Linux. While the engine is the same, there are several services that have their own binaries and are not a part of the core SQL Server engine. For SQL Server 2019, this list includes:
- Reporting Services
- Analysis Services
- Data Quality Services
- Master Data Services
You would need another SQL Server license to run these services on a different computer, which might increase your costs greatly. None of these services are needed for a typical OLTP database, but if you are looking to implement a business intelligence (BI) solution together with SQL Server on Linux in Azure VMs, cloud-based options to augment your VM include Power BI and Azure Analysis Services.
In this chapter, we looked at SQL Server 2019 and the various Azure SQL offerings that are available to deploy your data. There are multiple models available, from a very managed PaaS service such as Azure SQL Database, to the manual-lite managed instance, to the manual and customizable IaaS-based Azure SQL VM. The rest of this book will focus on SQL Server on an Azure VM, whether on Windows or Linux.
SQL Server 2019 on Linux provides customers with a choice of platform, allowing them to tailor SQL Server to the needs of their personnel where much of their software is run on the Linux open-source OS. SQL Server is not itself open source, but Microsoft is embracing the OSS community like never before.
In the next chapter, we will look at the different options for deploying a VM on Azure for SQL Server. Choices such as series of VM, storage types, and installation will be covered, taking into consideration the different workload types that may need to be supported.