ETL with Azure Cookbook

By Christian Cote , Matija Lah , Madina Saitakhmetova
  • Instant online access to over 8,000+ books and videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Chapter 1: Getting Started with Azure and SSIS 2019

About this book

ETL is one of the most common and tedious procedures for moving and processing data from one database to another. With the help of this book, you will be able to speed up the process by designing effective ETL solutions using the Azure services available for handling and transforming any data to suit your requirements.

With this cookbook, you’ll become well versed in all the features of SQL Server Integration Services (SSIS) to perform data migration and ETL tasks that integrate with Azure. You’ll learn how to transform data in Azure and understand how legacy systems perform ETL on-premises using SSIS. Later chapters will get you up to speed with connecting and retrieving data from SQL Server 2019 Big Data Clusters, and even show you how to extend and customize the SSIS toolbox using custom-developed tasks and transforms. This ETL book also contains practical recipes for moving and transforming data with Azure services, such as Data Factory and Azure Databricks, and lets you explore various options for migrating SSIS packages to Azure. Toward the end, you’ll find out how to profile data in the cloud and automate service creation with Business Intelligence Markup Language (BIML).

By the end of this book, you’ll have developed the skills you need to create and automate ETL solutions on-premises as well as in Azure.

Publication date:
September 2020
Publisher
Packt
Pages
446
ISBN
9781800203310

 

Chapter 1: Getting Started with Azure and SSIS 2019

In this cookbook, we are going to cover Extract, Transform, Load (ETL) development with Microsoft Azure. We will start with Microsoft SQL Server 2019 Integration Services (SSIS) and then gradually move on to data management capabilities in Azure. Throughout this cookbook we will use hands-on examples, which will not only provide you with genuine first-hand experience in SSIS development but also demonstrate how to design a working ETL solution or a module that could be built into an ETL solution.

Let's start by establishing your development environment. We are going to guide you through the installation of a new SQL Server instance, including all the components needed for on-premises as well as cloud-based SSIS solution development. You are also going to install the tools used in administering and maintaining SQL Server instances – either on-premises or in the cloud. You are also going to install the tools used in developing, testing, and deploying your SSIS solutions. You will finish this chapter by creating an Azure subscription, which will allow your on-premises development environment to extend into the cloud.

This chapter covers the following recipes that will help you get started with SSIS development on-premises as well as in the cloud:

  • Installing Microsoft SQL Server 2019 Integration Services
  • Installing SQL Server Management Studio
  • Installing SQL Server Data Tools
  • Installing Azure Data Studio
  • Creating an Azure subscription
 

Technical requirements

In order to install the necessary components, you will need a workstation – either a physical personal computer or a virtual one – running a compatible Microsoft Windows operating system. SQL Server 2019 can be installed on any edition of the Microsoft Windows Server operating system, starting with Windows Server 2016, or even on its desktop edition, Windows 10, starting with version TH1, build 1507.

Depending on the features installed, SQL Server is going to require between 6 and roughly 8 GB of free space on the hard drive, and at least 1 GB of available system memory, though around 4 GB is the recommended minimum. Practically any 64-bit CPU is supported.

Important note

You can find more detailed information about SQL Server 2019 hardware and software requirements in the online vendor documentation (SQL docs) article entitled SQL Server 2019: Hardware and software requirements, at https://docs.microsoft.com/en-us/sql/sql-server/install/hardware-and-software-requirements-for-installing-sql-server-ver15.

Speaking of online documentation, your workstation is going to need internet access – even to perform the installations.

 

Installing Microsoft SQL Server 2019 Integration Services

You are going to install a new instance of SQL Server on your workstation; if you prefer, you can also upgrade an existing SQL Server instance, as noted in the recipe. The installation will include all the features necessary to design SSIS solutions generally, not just to perform the work presented in this book.

The Client Tools Software Development Kit (SDK), an otherwise optional SQL Server component, must be installed on the workstation used in SSIS development. It contains the .NET managed assemblies with design-time access to the SSIS runtime, without which SSIS development simply cannot take place.

The installation of the SQL Client Connectivity SDK, another optional component, is recommended, though not necessary. It contains SQL Server Native Client connectivity resources that you might need in database application development.

Important note

None of the SDKs mentioned would ever need to be installed on a server used for hosting data, as such a server would normally not be used for SSIS development.

Getting ready

You need access to the SQL Server 2019 installation media. Since this is going to be a typical development workstation installation, you can use the free, specialized SQL Server 2019 Developer Edition, available for download at https://www.microsoft.com/en-us/sql-server/sql-server-downloads. Alternatively, you can also use a free trial edition of SQL Server 2019 Standard, or the Data Center editions, but please note that their use is limited to 180 days.

Important note

The Developer edition of SQL Server 2019 is a full-featured free edition, licensed for use as a development and test database in a non-production environment.

Either download the installation media from the website or use the installation media provided by the administrator in your organization. If you decide to use the Developer edition of SQL Server 2019, download the executable from the website, and save it to the local drive on the workstation that you will be using for the recipes in this book. If you prefer to use the Evaluation edition of SQL Server 2019, you will have to sign up on the website by providing some of your personal information before being able to access the installation media.

How to do it…

Start your workstation and log in with an account with administrative operating system privileges. Make sure the workstation has access to the SQL Server 2019 installation media – for instance, make sure that the Developer edition executable file is available on the local drive:

  1. Locate the installation media, and double-click the installation executable (for instance, SQL2019-SSEI-Dev.exe for SQL Server 2019 Developer Edition) to start the installation.

    When prompted by the operating system, which will ask whether you want to allow the application to make changes to your device, as shown in the following screenshot, click Yes to continue:

    Figure 1.1 – The SQL Server installation User Account Control dialog

    Figure 1.1 – The SQL Server installation User Account Control dialog

  2. At the beginning of the installation, you are asked whether you would like to use the basic settings, customize them, or simply download the rest of the installation media.

    Tip

    Generally, you will only download the installation media if you plan to install SQL Server on additional devices. On the other hand, a very good reason for having the media available locally would be in case you need to repair the installation later.

    Select Custom to continue.

  3. You now have to select the location of the installation files. Use the Default folder unless you prefer to use an alternative location.

    When ready, click Install to continue. Depending on your internet connection, it should take just a few minutes to download and extract the files.

  4. After the installation files have been prepared, the SQL Server Installation Center window will open, as shown in the following screenshot:
    Figure 1.2 – SQL Server Installation Center

    Figure 1.2 – SQL Server Installation Center

    On the Installation page, select the New SQL Server stand-alone installation or add features to an existing installation option to start the setup wizard.

  5. On the Product Key page, either specify your free edition (Developer or Evaluation) or provide the product key of a licensed edition that you are allowed to install.

    Important note

    Do not install the Express edition of SQL Server 2019, as it does not include SSIS, nor does it come with many of the other features that you will need for the recipes in this cookbook.

  6. Specify the appropriate edition, as shown in the following screenshot:
    Figure 1.3 – Specifying the SQL Server edition

    Figure 1.3 – Specifying the SQL Server edition

    Click Next to continue.

  7. On the License Terms page, read and accept the license terms, and then click Next to continue.
  8. On the Microsoft Update page, shown in the following screenshot, check Use Microsoft Update to check for updates to include the latest updates for the product, unless you prefer to install them later:
    Figure 1.4 – Adding available updates to the installation

    Figure 1.4 – Adding available updates to the installation

    Click Next to continue.

    The setup program should now perform a few checks of your system to verify whether the installation can proceed.

  9. On the Install Rules page, you should see the system validation results.

    If there are errors, click on the link in the Status column to access each error message. Depending on the error, the installation might not proceed until you correct the cause or might even have to be aborted and restarted after the problem has been resolved.

    If there are warnings, the installation should allow you to continue; however, you should inspect the warning messages anyway, as shown in the following screenshot, as additional activities might be needed during or after the installation in order for the SQL Server instance or one or more of the shared features to work as expected:

    Figure 1.5 – Install Rules

    Figure 1.5 – Install Rules

    Tip

    If SQL Server has not been installed on this workstation before or has not yet been configured for external access, you might see the Windows Firewall warning, as shown in the screenshot. In this particular case, the warning points to information on configuring the Windows firewall in order to allow external access to the SQL Server instance.

    You do not have to make any changes to your workstation's firewall at this time.

    When ready, click Next to continue.

    If the installation detects an existing SQL Server instance, an additional step is added, shown in the following screenshot, asking you to either create a new SQL Server instance or add features to an existing one:

    Figure 1.6 – Installation Type

    Figure 1.6 – Installation Type

    This cookbook assumes that you will be using a workstation where SQL Server has not been installed before, but you can also use an existing SQL Server instance if you prefer.

  10. On the Feature Selection page, select the following features to be installed:

    a) Database Engine Services

    b) PolyBase Query Service for External Data

    c) Java connector for HDFS data sources

    d) Client Tools Connectivity

    e) Integration Services

    f) Client Tools SDK

    g) SQL Client Connectivity SDK

    You can see part of the selection in the following screenshot:

    Figure 1.7 – SQL Server 2019 Feature Selection

    Figure 1.7 – SQL Server 2019 Feature Selection

    When ready, click Next to continue.

  11. If you are installing this SQL Server instance on a workstation where SQL Server has not been installed before, you can create either a new default instance or a named instance. Otherwise, your options will depend on what parts of SQL Server are already installed on the system:
    Figure 1.8 – SQL Server 2019 Instance Configuration

    Figure 1.8 – SQL Server 2019 Instance Configuration

    If possible, use the default instance, as the preceding screenshot shows; however, if you decide on using a named instance instead, we recommend that you use MSSQL2019 as its name. Throughout this book, we will refer to this SQL Server instance either as localhost (the default instance) or MSSQL2019 – in both cases, this will mean the same SQL Server instance.

    When ready, click Next to continue.

  12. On the PolyBase Configuration page, leave the default settings unchanged, as shown in the following screenshot, and click Next to continue:
    Figure 1.9 – PolyBase Configuration

    Figure 1.9 – PolyBase Configuration

  13. On the Java Install Location page, leave the default option selected to install the version of the Java runtime environment included in the installation, as shown in the following screenshot, unless you are installing to an environment where a different version is required by your administrators:
    Figure 1.10 – Java Install Location

    Figure 1.10 – Java Install Location

    If you have to install a different version, select the second option, and use Browse to locate the installation files.

    When ready, click Next to continue.

  14. On the Server Configuration page, on the Service Accounts tab, change the Startup Type setting of the SQL Server Agent feature to Automatic.

    Leave all other settings on this tab unchanged, as displayed in the screenshot that follows:

    Figure 1.11 – SQL Server service accounts

    Figure 1.11 – SQL Server service accounts

  15. On the Collation tab of the same page, you can leave the default collation for the database engine; however, it is recommended to use a Windows collation instead.

    To change the collation, click Customize next to the Database Engine Collation setting, and then in the Customize the SQL Server 2019 Database Engine Collation dialog, select the Windows collation designator and sort order option with the following settings:

    a) Collation designator: Latin1_General_100

    b) Accent-sensitive: Checked

    c) Char/Varchar Storage Options: UTF-8

    The recommended settings are shown in the following screenshot:

    Figure 1.12 – Setting the default collation for the SQL Server instance

    Figure 1.12 – Setting the default collation for the SQL Server instance

    Click OK to confirm the settings.

    Review the settings on the Collation tab; the full name of the collation should be Latin1_General_100_CI_AS_SC_UTF8, as shown in the following screenshot:

    Figure 1.13 – The default SQL Server collation

    Figure 1.13 – The default SQL Server collation

    When ready, click Next to continue.

  16. On the Database Engine Configuration page, on the Server Configuration tab, shown in the following screenshot, leave Windows authentication mode selected for Authentication Mode:
    Figure 1.14 – Database Engine Configuration

    Figure 1.14 – Database Engine Configuration

    Add the account you are currently using to the SQL Server administrators role by clicking the Add Current User button.

    After a few moments, your username should appear in the SQL Server administrators list box.

    You do not have to make any changes to the rest of the settings on this page unless you want to change the location of the database files, which you can do on the Data Directories tab.

    Important note

    For the purposes of this cookbook, you can use the default locations on the C: drive for the database files; however, in a real-life environment – even if solely for testing purposes – it is recommended to host the database files on a different drive, not the one used for the operating system and other installed features files.

    When ready, click Next to continue.

  17. On the Ready to Install page, review the installation options listed in Steps 5 through 16.

    When you are sure that everything has been configured in accordance with the instructions in this recipe, click Install to start the automated part of the installation. On modern hardware, the installation should complete within a few minutes.

  18. Once the installation has completed, inspect the installation status of each installed feature. If any errors are encountered during the installation, they will be marked with a Failed installation status. When you select each feature reported as failed, additional information about the failure should be displayed in the Details section:
    Figure 1.15 – The final step of the SQL Server 2019 installation

    Figure 1.15 – The final step of the SQL Server 2019 installation

    Inspect any error messages and address the cause of each problem based on the information provided. You can repeat the installation after you have resolved the problems.

    Click Close to complete the installation.

    One more task awaits you, and then the installation can be considered finished (for the time being, at least). In order to allow the PolyBase services access to the SQL Server instance, you must allow connections to the instance through the TCP/IP protocol.

  19. In the Windows Start menu, in the Microsoft SQL Server 2019 folder, locate SQL Server 2019 Configuration Manager and start it:
    Figure 1.16 – SQL Server Configuration Manager

    Figure 1.16 – SQL Server Configuration Manager

    In Configuration Manager, shown in the preceding screenshot, under SQL Server Services, you should see that the SQL Server instance that you just installed is running; however, both SQL Server PolyBase services will either be Stopped or in the Change pending… state.

    The reason for this is that the services are unable to connect to the SQL Server instance. They use TCP/IP to connect, and on a newly installed SQL Server instance, this protocol is not enabled.

  20. Under SQL Server Network Configuration | Protocols for MSSQLSERVER, shown in the following screenshot, right-click the disabled TCP/IP protocol, and select Enable from the shortcut menu:
    Figure 1.17 – Enabling the TCP/IP protocol

    Figure 1.17 – Enabling the TCP/IP protocol

    A warning is displayed, telling you that in order for the changes to take effect, the affected service must be restarted. Click OK to close the warning.

  21. Back under SQL Server Services, right-click the SQL Server (MSSQLSERVER) service, and select Restart from the shortcut menu:
    Figure 1.18 – Restarting the SQL Server instance

    Figure 1.18 – Restarting the SQL Server instance

  22. Wait for the service to restart. When prompted to stop or restart any other dependent service, confirm those as well. After the SQL Server instance has restarted, the PolyBase services should start as well. However, if they stay in the Change Pending… state for an extended period of time, try starting them manually. If even that fails, you might have to restart the workstation.

Close SQL Server 2019 Configuration Manager.

How it works…

You have just successfully installed a new SQL Server instance. You will use it to complete the rest of the recipes in this cookbook; of course, it can also be used for other development work involving the SQL Server platform. If you later determine that you need additional features, you can add them by starting the installation again from SQL Server Installation Center.

Important note

If you have installed the Developer edition of the product, please remember that it has not been licensed for production use. You are only allowed to use it for application development and testing.

If you have installed an Evaluation edition of the product, remember that the license expires 180 days after the installation.

 

Installing SQL Server Management Studio

SQL Server Management Studio (or SSMS, for short) is the principal tool used in SQL Server administration, maintenance, and development. It can be used to administer relational databases hosted on the Database Engine, or analytical databases and solutions hosted in Analysis Services. It can be used to administer the SQL Server Integration Services and Database (SSISDB) catalog, used for SSIS project deployments. You can even use it to administer the legacy SSIS service, used for legacy SSIS package deployments.

Important note

While most SSIS development activities will take place in SQL Server Data Tools (SSDT), you should still install SSMS on the workstation. Its programming capabilities complement SSDT, whereas its administrative and maintenance capabilities far exceed those available in SSDT.

Of course, SSMS supports numerous programming languages and data management standards typically associated with data management: Transact-SQL (T-SQL, for short) to manipulate data in relational databases, Multidimensional Expressions (MDX) and Data Mining Extensions (DMX) for data analysis and knowledge extraction, and XML for Analysis (XMLA) to administer SQL Server's Analysis Services (SSAS) databases. You can even use SSMS to create and edit XML documents.

Getting ready

For several years now, SSMS has been delivered as a standalone tool, no longer closely associated with a particular version of SQL Server. This decision allows Microsoft to develop the tool in its own development cycle, pretty much completely independently of the development cycle they use for the SQL Server platform. It also grants the development team behind the tool more flexibility in terms of the features being deployed or upgraded in each version.

SSMS is, therefore, no longer included as part of the SQL Server installation, as it used to be in the past, and must be installed separately. It is available for download at https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms.

The download site can also be accessed from SQL Server Installation Center.

Tip

It is recommended to always use the latest version of SSMS, as it typically contains the latest security, stability, and user-experience fixes. The latest features and capabilities are, naturally, also only available in the most recent version of SSMS.

In fact, this is true for every tool that you will be installing in this and later chapters.

How to do it…

Use your favorite internet browser to navigate to the SSMS download site listed previously, or start SQL Server Installation Center – it is located in the Start menu, in the Microsoft SQL Server 2019 folder – and then, on the Installation page, select the Install SQL Server Management Tools option.

You will start this recipe by installing SSMS and complete it by downloading a sample database from GitHub and deploying it to the SQL Server instance you installed in the previous recipe, Installing Microsoft SQL Server 2019 Integration Services:

  1. In the Microsoft docs article entitled Download SQL Server Management Studio (SSMS), inspect the latest information about the download. Pay close attention to any warnings or considerations that the vendor might have provided.

    When ready, click the Download SQL Server Management Studio (SSMS) HTML link to initiate the download.

  2. If prompted to save the executable to your drive, click Save, wait for the download to complete, and then click Run to start the installation.

    If the download is performed in the background, use Windows Explorer to locate the file in your downloads folder, and start the execution from there – for instance, by executing the SSMS-Setup-ENU.exe file.

  3. On the first screen of the installation dialog, leave the default location unchanged, and click Install to continue.
  4. The installation should complete within a few minutes.

    If the installation is interrupted by an error, investigate the cause based on the information provided in the installation dialog. Otherwise, click Close to finish the installation.

  5. In the Windows Start menu, locate the Microsoft SQL Server Tools 18 folder, and in it the Microsoft SQL Server Management Studio 18 shortcut. Open it.
  6. In the Connect to Server dialog, select the SQL Server instance you created in the previous recipe, Installing Microsoft SQL Server 2019 Integration Services.

    The instance should now be displayed in the SSMS Object Explorer.

  7. Open SQL Server Installation Center, and on the Resources page, locate the SQL Samples Web Site link. Follow the link to navigate to the Microsoft SQL Server Samples site.

    In your internet browser, scroll down to the SQL Server Samples Repository section, and click on the link to the Wide World Importers sample database web page.

  8. On the new page, locate the WideWorldImporters-Full.bak file and click the link to start the download.

    Depending on your internet connection, the download should complete within a few minutes.

  9. In SSMS, in Object Explorer, right-click the Databases node, and then select Restore database… from the shortcut menu, as shown in the following screenshot:
    Figure 1.19 – Restoring a SQL Server database

    Figure 1.19 – Restoring a SQL Server database

  10. In the Restore Database dialog, on the General page, select Device as the source, and open the Select backup devices dialog by clicking the ellipsis icon to the right of the Device text box, as shown in the following screenshot:
    Figure 1.20 – Selecting the backup device

    Figure 1.20 – Selecting the backup device

  11. In the Select backup devices dialog, click Add to open the Locate Backup File dialog, and then navigate the filesystem on the left to locate the file you downloaded in Step 8.

    Select the file and click OK to confirm the selection.

  12. In the Select backup devices dialog, click OK to confirm the selection.
  13. In the Restore Database dialog, click OK again to start the restore operation.

    It should take about a minute to restore the sample database.

    Once the message box opens with the Database 'WideWorldImporters' restored successfully message, the database should be available on the selected instance.

    If there are any errors, inspect the error messages, make the necessary corrections, and repeat the process accordingly.

  14. After the database has been restored successfully, click OK to close the message box, which also closes the Restore Database dialog.

    Now that you have your first tool installed, you can change two more settings that you might need later.

  15. In SSMS, open a new query window, connect to the master database, and then use the following T-SQL command to inspect the Hadoop connectivity instance settings:
    EXEC sp_configure
         @configname = 'hadoop connectivity';

    In the execution result, observe the run_value column; if its value is 0, change it to 7 by executing the following T-SQL commands:

    EXEC sp_configure
         @configname = 'hadoop connectivity',
         @configvalue = 7;
    reconfigure;

    By turning this configuration setting on, you allow your SQL Server instance to connect to remote Hadoop or Azure Blob storage locations. By using the configuration value of 7, you allow connections to all versions and both editions of Hortonworks, as well as to the Azure Blob storage.

    Tip

    More information about this setting is available in the vendor documentation, at https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/polybase-connectivity-configuration-transact-sql. You should be familiar with the setting and its possible values before using the feature in a production environment.

  16. Use the following T-SQL command to check whether the PolyBase service is enabled on your SQL Server instance:
    EXEC sp_configure
         @configname = 'polybase enabled';

    If the value of the run_value column is 0, that means that PolyBase is not enabled; enable it by changing the configuration value to 1 by executing the following T-SQL commands:

    EXEC sp_configure
         @configname = 'polybase enabled',
         @configvalue = 1;
    RECONFIGURE;

    Tip

    More information about this setting, and the PolyBase services in general, is available in the vendor documentation, at https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-installation. You should be familiar with the setting and its possible states before using the feature in a production environment.

  17. If all the commands complete successfully, close SSMS. This time, you do not have to save the script or the solution.
  18. Open SQL Server 2019 Configuration Manager, and restart the MSSQLSERVER instance, as you did in Step 20 of the previous recipe, Installing Microsoft SQL Server 2019 Integration Services.

    Important note

    The SQL Server instance must be restarted in order for these two configuration settings to take effect, as they affect the dependent PolyBase services. Also, you cannot restart a SQL Server instance running PolyBase from within SSMS when Object Explorer is connected to the service and prevents the service from being stopped.

After the instance is restarted, you can close Configuration Manager.

How it works…

You have installed the principal development and administration tool for SQL Server. You will be using SSMS throughout this cookbook. Of course, if you have worked with SQL Server before, you should already be familiar with SSMS and its capabilities. If this is the first time you have used SSMS, do not worry – every feature used in the recipes in this book will be explained when needed.

You finished the recipe by deploying a sample database to the newly installed SQL Server instance. You will use this database in other recipes of this cookbook, and you are free to use it whenever you are exploring SQL Server and its capabilities.

 

Installing SQL Server Data Tools

SSDT is a specialized edition of the Microsoft Visual Studio integrated development environment (IDE). It can be used to develop a variety of projects related to SQL Server, such as relational and Azure databases, Analysis Services data models, Reporting Services reports, or Integration Services projects, to name just a few.

In the past, SSDT used to be a standalone tool, requiring only a subset of Visual Studio features to be installed. However, starting with Visual Studio 2019 (or SQL Server 2019), SSDT represents an extension (or better: a set of extensions) of the principal Visual Studio IDE.

To use the 2019 version of SSDT, you first need to install Visual Studio 2019, and then add the SSDT extensions, followed by the extensions needed to develop Analysis Services, Integration Services, and/or Reporting Services projects. This does increase the complexity of the installation procedure, but it also more closely follows the standard approach of adding features to the core Visual Studio installation.

Getting ready

If you do not have a valid Visual Studio 2019 Professional or Enterprise license at your disposal, you can also use the free Community edition of Visual Studio 2019. All editions are available to download at https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt. This location can also be accessed from SQL Server Installation Center.

How to do it…

Use your favorite internet browser to navigate to the Visual Studio 2019 download site listed previously, or start SQL Server Installation Center, and then on the Installation page, select the Install SQL Server Data Tools option.

This recipe assumes that Visual Studio 2019 has not yet been installed on the workstation, and the instructions will show you how to install Visual Studio 2019 Community Edition:

  1. On the Download SQL Server Data Tools (SSDT) for Visual Studio web page, in the Install SSDT with Visual Studio 2019 section, locate the link to Visual Studio 2019 Community, and follow that link to the Visual Studio 2019 Downloads page. Click FREE DOWNLOAD in the Community column to start the download operation and save the file to the local drive on your workstation.
  2. When prompted, click Run to start the installation; otherwise, use Windows Explorer to navigate to your folder, and start the installation by opening the vs_Community.exe file.
  3. The installation begins with the setup of a few prerequisites. You can take this moment to inspect the Microsoft Privacy Statement and the Microsoft Software License Terms by following the links provided in the message box:
    Figure 1.21 – Visual Studio Installer

    Figure 1.21 – Visual Studio Installer

    When ready, click Continue.

    The installation now downloads and extracts the files, after which the Installing — Visual Studio 2019 Community dialog should open.

  4. On the Workloads page, in the Desktop & Mobile section, check .NET desktop development, and in the Other Toolsets section, check Data storage and processing:
    Figure 1.22 – The Visual Studio 2019 installation details

    Figure 1.22 – The Visual Studio 2019 installation details

    Check the list of items in the right column – the same items should be listed as the ones shown in the screenshot.

  5. Leave the Install while downloading option selected in the combo box next to the Install button.

    When ready, click Install to continue.

    The installation should take a few minutes, depending on your internet connection, after which the Visual Studio will greet you with the Sign in screen.

  6. Click Not now, maybe later to skip the sign-in process for the time being.
  7. In the next dialog, select your development settings and your color theme, and click Start Visual Studio.

    Of course, you can also change these settings in SSDT later.

  8. After a few minutes, Visual Studio 2019 should open, with the Open recent/Get Started dialog.

    At this time, you only need to access the Visual Studio 2019 IDE to install additional SQL Server extensions.

    Click Continue without code.

  9. Visual Studio opens without any solutions or projects loaded.

    In the Extensions menu, select the Manage Extensions command.

  10. In the Manage Extensions dialog, on the Online | Visual Studio Marketplace page, use the search text box located at the top right to search for integration services, as shown in the following screenshot:
    Figure 1.23 – The Visual Studio 2019 extensions

    Figure 1.23 – The Visual Studio 2019 extensions

    In the search result list, locate SQL Server Integration Services Projects and click Download.

  11. An internet browser window should open, prompting you to save the file to your local drive. Save the file to the local drive on your workstation.
  12. If prompted by the browser to run the file, click Run to start the installation; otherwise, use Windows Explorer to navigate to your downloads folder, and open the Microsoft.DataTools.IntegrationServices.exe file to start the installation.
  13. In the first dialog, select the language to be used during the installation. You can use any of the supported languages; however, note that these instructions assume that you selected the English language. When ready, click OK to continue.
  14. In the next dialog, simply click Next to continue.
  15. In the next dialog, make sure that your edition of Visual Studio 2019 is selected as the installation target, but do not click Install yet.
  16. Switch to Visual Studio 2019, close the Manage Extensions dialog, and then close Visual Studio as well.

    Important note

    During the installation of a Visual Studio extension, changes might have to be made to Visual Studio files, which cannot be done if these files are open. Therefore, this is the perfect moment during the installation to close Visual Studio. If files are open, an error message might be displayed, prompting you to close them before being able to continue.

  17. Switch back to the Visual Studio extensions installer, and now do click Install to continue.
  18. If prompted by the operating system whether to allow SQL Server Integration Services Projects to make changes to your device, click Yes to continue. The installation should complete within a few minutes.
  19. In the final dialog, click Close to finish the installation.

Close the internet browser window that was used to download the SQL Server Integration Services Projects installer, and Visual Studio Installer, if they are still open.

How it works…

You have installed Visual Studio 2019 with the features needed to complete the rest of the recipes in this book. These recipes will cover typical SSIS development, but also control flow and data flow customization.

You can use the same setup to complete many other development tasks, and if needed, you can install any additional features at any time, by running Visual Studio Installer – available from the Windows Start menu.

Important note

In order to continue using the Community edition of Visual Studio 2019 beyond the 30-day evaluation period, you will have to sign in with a Microsoft Live account. You can use the same account that you will be using later in this chapter to create the Azure subscription.

A few days before the evaluation period expires, if you try to open SSDT, you will be prompted to sign in, but you can also sign in at any time.

 

Installing Azure Data Studio

Azure Data Studio (ADS) represents a new database administration, maintenance, and development tool to be used with Microsoft data platforms on-premises or in the cloud. In contrast to SSMS, not only can ADS be installed on Windows, but also on macOS or Linux.

ADS offers a new approach to development with a visually and functionally rich T-SQL editor, IntelliSense, and code snippets. It visually integrates execution results with the T-SQL queries that were executed to generate them and allows you to export query results in several standard formats.

The aim of ADS is not to completely replace SSMS as the principal data administration, maintenance, and development tool, but rather to complement it, especially as a cross-platform tool.

If you are new to ADS, or need help in determining how to balance your work between ADS and SSMS, we recommend the What is Azure Data Studio? introductory article, available online at https://docs.microsoft.com/en-us/sql/azure-data-studio/what-is.

Getting ready

The ADS installation is available online, at https://docs.microsoft.com/en-us/sql/azure-data-studio/download-azure-data-studio.

How to do it…

Use your favorite internet browser to navigate to the ADS download site listed previously:

  1. On the Download and install Azure Data Studio web page, locate the recommended User Installer link, and follow it to start the download. When prompted, save the file to the local drive on your workstation.

    Depending on your internet connection, the download should complete within a minute.

  2. When prompted by the browser, click Run to start the installation; otherwise, use Windows Explorer to navigate to your downloads folder, and initiate the installation there.
  3. Review the license agreement, and then click Next to continue.
  4. On the Select Destination Location page, leave the default folder selected, and click Next to continue.
  5. On the Select Start Menu Folder page, leave the default folder name, and click Next to continue.
  6. On the Select Additional Tasks page, you can check the Create a desktop icon option if you want to place the ADS shortcut on the desktop. You can also check the Register Azure Data Studio as an editor for supported file types option if you want all files normally associated with SSMS to be opened in ADS by default:
    Figure 1.24 – Selecting additional tasks in the ADS setup

    Figure 1.24 – Selecting additional tasks in the ADS setup

    If you are not planning to replace SSMS with ADS at this time, leave the default settings unchanged, as shown in the screenshot.

    When ready, click Next to continue.

  7. On the Ready to Install page, review the setup options. When ready, click Next to start the automated part of the installation. The installation should complete within a few minutes.
  8. On the final page, uncheck the Launch Azure Data Studio option, and click Finish to close the installer.

You do not have to start ADS at this time.

How it works…

You have now installed ADS, the newest addition to the Microsoft data platform development and administration toolset. Some of the recipes in this cookbook will require the use of features that are available in ADS, but not in SSMS.

 

Creating an Azure subscription

One of the objectives of this cookbook is to show you how to integrate your on-premises data warehousing solutions with cloud-based data storage, maintenance, and consumption capabilities of the Microsoft data platform. One of the prerequisites needed to follow this objective is an active Azure subscription.

Azure provides a variety of services and features that you can use to host your data off-premises, extend your own on-premises and cloud-based data management solutions with cloud-based services provided by Microsoft and Microsoft Partner Network, and even host your own solutions as cloud-based services to be used by your clients.

If you already have access to a valid Azure subscription, you are welcome to use it for all the relevant recipes in this book; otherwise, you can use this recipe to create a new one.

An Azure free subscription will provide you with access to the services for 12 months – free of charge. A number of services are even provided at no cost independently of the subscription period.

The free subscription includes a credit of 200 USD to be used for paid services for a period of 30 days from the subscription activation. This period should be enough for you to run all of the cloud-oriented recipes provided in this cookbook.

Important note

Certain paid services in Azure incur costs while online, even if they are not actually being used. This will be pointed out in each recipe dealing with such services, and the instructions on how to either turn a feature off or remove it completely will be provided.

However, by using Azure, you take full responsibility for all the costs involved in using the services provided therein.

Getting ready

Azure is an online service, located at https://portal.azure.com/. Generally, you would use an internet browser to access it; however, on the latest versions of the Windows operating system, you can also use the Azure portal application. The instructions in this cookbook use an internet browser to access the Azure portal.

How to do it…

Use your favorite internet browser and navigate to the Azure portal website:

  1. On the Azure Portal app landing page, click Continue to Azure Portal website.
  2. On the Sign in page, do not sign in with your existing Microsoft Azure credentials, but instead, click Create one! to create a new account.

    On the next screen, provide a valid email address for the account. Use the address of an existing email account that you have access to and is not currently associated with an existing Azure subscription.

    Alternatively, you can use a phone number instead of an email. If you prefer to use a phone number, click Use a phone number instead, and the form will allow you to enter it. Again, use an existing phone number of a smart device that you have access to and that is not currently associated with an Azure account.

    You can even create a new email account. If you prefer to use a new email account, click Get a new email address, and the form will allow you to create a new outlook.com or hotmail.com email account. When ready, click Next to continue.

  3. On the next screen, create a password for the account. It is recommended that you follow the standard recommendations on how to create a safe password.
  4. On the next screen, provide your country/region information and the date of your birth. When ready, click Next to continue.

    An automated confirmation email will be sent to the email address you used earlier, or a message will be sent to your phone if you used a phone number.

  5. Copy the code you were sent in the message from the Microsoft account team into the text box of the Verify email form. When ready, click Next to continue.
  6. On the next screen, you will be required to prove that you are a real person by entering the characters you recognize in the screenshot.

    If the characters are difficult to read, click New for a new image, or click Audio for the images to be read out through your audio device.

    When you have entered the correct characters, click Next to continue.

  7. After the account has been created, the Welcome to Azure web page will open, prompting you to start the tour. Click Maybe later to close the dialog.

    In a new free Azure account, none of the paid services can be accessed until the trial is activated. You do not have to activate it at this time; this can be done when you go through the recipes in the following chapters in which paid Azure services are required.

    You can close the internet browser. However, if you do want to activate the trial, continue on to Step 8.

  8. To start the free trial, click Start:
    Figure 1.25 – Activating the Microsoft Azure Free Trial

    Figure 1.25 – Activating the Microsoft Azure Free Trial

  9. Follow the instructions on the web page. You will have to provide some of your personal information, including an email address and a valid phone number.

    Important note

    If Azure is not available in your region, please contact Microsoft, and they will be able to assist you further.

    To complete the activation, your identity will be verified by phone using the method you select – via a text message or through a call from the service center.

    As part of the identity verification, you will also have to provide valid credit card details.

    Important note

    Credit card information will not be used unless you upgrade your free subscription to a pay-as-you-go subscription.

    Before you can complete the activation procedure, you also have to read and accept the subscription agreement.

    Important note

    We recommend that you familiarize yourself with the subscription agreement, the details of the offer, and the privacy statement before continuing.

  10. When ready, click Sign up to activate your Azure free trial.

The activation should complete within a few minutes.

How it works…

You have just created a free Azure subscription; it will not only allow you to complete most of the recipes in this cookbook but will also allow you to explore additional services that you can use for your personal or professional needs.

Important note

One of the purposes of this cookbook is to introduce SSIS development in the cloud, using more than just the basic features of SQL Server or Azure. Unfortunately, the free Azure subscription does not provide access to more advanced features (such as the Azure Kubernetes Service or Azure Databricks); therefore, in order to complete specific recipes, you will have to upgrade your account.

Each of those recipes will contain a special warning and provide the appropriate instructions on how to proceed.

There's more…

If you decide on using Azure for more than just learning about SSIS development, you will eventually have to upgrade your account. If you design applications or provide services that can also be hosted in Azure, you might even consider joining the Microsoft Partner Network yourself.

About the Authors

  • Christian Cote

    Christian Cote is an IT professional with more than 15 years of experience working in a data warehouse, Big Data, and business intelligence projects. Christian developed expertise in data warehousing and data lakes over the years and designed many ETL/BI processes using a range of tools on multiple platforms. He's been presenting at several conferences and code camps. He currently co-leads the SQL Server PASS chapter. He is also a Microsoft Data Platform Most Valuable Professional (MVP).

    Browse publications by this author
  • Matija Lah

    Matija Lah has more than eighteen years of experience working with Microsoft SQL Server, mostly from architecting data-centric solutions in the legal domain. His contributions to the SQL Server community have led to the Microsoft Most Valuable Professional award (Data Platform) between 2007 and 2017/2018. He spends most of his time on projects involving advanced information management, and natural language processing, but often finds time to spea

    Browse publications by this author
  • Madina Saitakhmetova

    Madina Saitakhmetova is a developer specializing in BI. She has been in IT for 15 years, working first with Microsoft SQL, .Net Framework, and then Microsoft BI, BIML and Azure. Her adventure with Microsoft BI began with Analysis Services and SSIS, but she is leaning towards ETL development, both on premises and in the cloud, in later years. Finding patterns, automating processes and making BI team work more efficient are challenges that drive her. During past few years, BIML has become an important part of her work, increasing efficiency and quality

    Browse publications by this author
Book Title
Access this book, plus 8,000 other titles for FREE
Access now