Reader small image

You're reading from  SQL Query Design Patterns and Best Practices

Product typeBook
Published inMar 2023
Reading LevelIntermediate
PublisherPackt
ISBN-139781837633289
Edition1st Edition
Languages
Right arrow
Authors (6):
Steve Hughes
Steve Hughes
author image
Steve Hughes

Steve Hughes is a senior director of data and analytics at 3Cloud. In this role, he focuses on growing the teams' skills and capabilities to deliver data projects with Azure. In his 20 years of experience in technology, much of the time was spent on creating business intelligence (BI) solutions and helping customers implement BI and cloud solutions. He is passionate about helping customers understand that data is valuable and profitable. Steve has recently been diagnosed with ALS but continues to work and share with others what he has learned. Steve is also the founder of Data on Wheels where he blogs with one of his daughters on topics such as data, analytics, and work enablement
Read more about Steve Hughes

Dennis Neer
Dennis Neer
author image
Dennis Neer

Dennis Neer is a senior architect of data and analytics at 3Cloud. In this role, he focuses on working with clients to design Azure solutions for their data and analytic needs to improve their business decision-making process. This includes using tools such as SQL Server databases, Synapse, data lakes, and Power BI. In his 30 years of experience in technology, much of the time was spent on designing and building database and visualization solutions. He is passionate about helping businesses to understand data and use it to their advantage during their decision-making process
Read more about Dennis Neer

Dr. Ram Babu Singh
Dr. Ram Babu Singh
author image
Dr. Ram Babu Singh

Dr. Ram Babu Singh is a Microsoft Certified Professional with a Ph.D. in computer science. He is a lead architect of data and analytics at 3Cloud, using complex data analytics and data science solutions. In his 20 years of career, over a decade was spent in leadership positions providing data science and data platform solutions to the world's top IT consulting companies. He has been published in international journals and has a patent in his name
Read more about Dr. Ram Babu Singh

Shabbir H. Mala
Shabbir H. Mala
author image
Shabbir H. Mala

Shabbir H. Mala is a director of data and analytics at 3Cloud. He has over 23 years of experience in thought leadership and consulting, developing complex data solutions, business systems, and processes using Azure Data Services and Power BI. He currently manages a team of over 40 principals and senior architects focusing on business growth, successful project delivery, and client experience. He has done talks at Power BI conferences as well as user groups. He has been awarded Microsoft FastTrack Solution Architect in 2021 and 2022. He is currently living in Chicago, married, and has three beautiful kids
Read more about Shabbir H. Mala

Leslie Andrews
Leslie Andrews
author image
Leslie Andrews

Andrews is a lead data architect of data and analytics at 3Cloud. Working in the IT sector for over 20 years in local government, electrical supply, law enforcement, and healthcare, she has broad experience to draw from to assist clients in making data more accessible for end users. Leslie's expertise includes SQL development, ETL, data warehousing, data modeling, and analytics. Leslie currently holds several Microsoft certifications related to Azure technologies for data engineering. She approaches each day as an opportunity to use data and technology to make it easier for others to do their daily work. Leslie has been doing public speaking since 2015, was an IDERA ACE in 2019, and is a supporter of and contributor to the SQL community
Read more about Leslie Andrews

Chi Zhang
Chi Zhang
author image
Chi Zhang

Chi Zhang is a lead data architect at 3Cloud. After completing her master's degree at Carnegie Mellon University, she worked in data consulting for the past 5 years. She has helped clients from various industries to deliver customized data solutions within Azure. Focusing on ETL, data warehousing, data modeling, and BI reporting, Chi solves data problems for clients and builds data solutions that provide business users with better operational insights and a smoother experience. Recently, she has continued to grow her footprint in the Azure community: earning certifications in Azure data engineering and data science, giving her first public tech talk, co-authoring a technical book, and becoming an active contributor to the SQL community
Read more about Chi Zhang

View More author details
Right arrow

Exploring the impact of row and column reductions on aggregations

Now you know how to reduce the number of rows and specify the columns that you need in your result set, let’s talk about what the impact will be on any aggregations that you may be interested in.

First of all, based on this result set, you can view the number of backorders by any combination of columns. For example, to see the number of backorders based on year and month, you could use the following query:

SELECT Year([Order Date Key]) as "Order Year",
       Month([Order Date Key]) as "Order Month",
  COUNT([WWI Backorder ID]) as "Number of backorders",
  COUNT(distinct [Customer Key]) as "Impacted Customers",
  COUNT([Order Key]) as "Number of orders"
FROM [WideWorldImportersDW].[Fact].[Order]
WHERE [WWI Backorder ID] IS NOT NULL
GROUP BY Year([Order Date Key]),
         Month([Order Date Key])
ORDER BY Year([Order Date Key]),
         Month([Order Date Key]);

You could also run the following query using the view that you created, and you will get the same results:

SELECT [Order Year],
       [Order Month],
   COUNT([WWI Backorder]) as "number of backorders",
   COUNT([customer]) as "number of impacted customers",
   COUNT([Order]) as "number of orders"
FROM v_backorders
GROUP BY [Order Year],
     [Order Month];

Figure 1.6 shows a subset of the results from the query:

Figure 1.6 – Subset of results from the earlier queries

Figure 1.6 – Subset of results from the earlier queries

On closer investigation, you will notice that the values in the number of backorders and number of orders columns are the same. Why is this? The answer is in the filter; it only returns records that have an associated backorder, so you will not get the actual number of orders that have been placed, and any calculations will not be accurate. So, to get an accurate number of orders, you will need to get all the orders in the table, not just the orders associated with a backorder. You will also see that the order of the result set is different for the query that uses the view and the query that does not use the view. This is due to the query that uses the view being stored in a memory heap from the view and accessed on demand, whereas the query that does not use the view is stored on the disk in the primary key order of the table.

This is the impact of using a filter; to get around this, you can add a subquery to remove the filter. So here is how you can update your query to get that additional information:

SELECT Year([Order Date Key]) as [Order Year],
       Month([Order Date Key]) as [Order Month],
   COUNT(distinct [Customer Key]) as [Impacted Customers],
   COUNT(distinct [Stock Item Key]) as [Backorder Items],
   COUNT([WWI Backorder ID]) as [Number of backorders],
   fo.orders,
       fo.Customers
FROM [WideWorldImportersDW].[Fact].[Order] o,
     (select  Year([Order Date Key]) as [Order Year],
              Month([Order Date Key]) as [Order Month],
              COUNT (distinct [Order Key]) as [orders],
              COUNT (distinct [customer Key]) as [Customers]
      FROM [WideWorldImportersDW].[Fact].[Order]
  GROUP BY Year([Order Date Key]),
               Month([Order Date Key])) as fo
WHERE [WWI Backorder ID] IS NOT NULL
AND year(o.[Order Date Key]) = fo.[Order Year]
AND month(o.[Order Date Key]) = fo.[Order Month]
GROUP BY Year([Order Date Key]),
         Month([Order Date Key]),
    fo.orders,
    fo.Customers
ORDER BY Year([Order Date Key]),
         Month([Order Date Key]);

This will give you the following result set, as seen in Figure 1.7. Notice you now see values in the orders, Impacted Customers, Number of backorders, and Customers columns for each Order Month:

Figure 1.7 – Results of using filters in a query

Figure 1.7 – Results of using filters in a query

This can be done for any number of columns as long as the subquery is grouped by the same fields as the main query. To simplify this query, you could very easily create a view of all the orders and then use the views to get the same results with less query development.

Here is a sample of the query to create the order view that you can use in future queries:

CREATE VIEW [dbo].[v_orders] AS
SELECT Year([Order Date Key]) as [Order Year],
      MONTH([Order Date Key]) as [Order Month],
      COUNT(distinct [Order Key]) as [orders],
      COUNT(distinct [customer Key]) as [Customers]
FROM [WideWorldImportersDW].[Fact].[Order]
GROUP BY Year([Order Date Key]),
         Month([Order Date Key])

Here is a sample of the query that uses the two views (v_Backorders and v_orders) that have been created in this chapter:

  SELECT o.[Order Year],
         o.[Order Month],
         o.Customers,
         o.orders,
     COUNT(b.[WWI Backorder]) as [total backorders],
     COUNT(distinct b.[customer]) as [impacted customers]
  FROM [WideWorldImportersDW].[dbo].[v_Backorders] b,
  [WideWorldImportersDW].[dbo].[v_orders] o
  WHERE b.[Order Year] = o.[Order Year]
  AND b.[Order Month] = o.[Order Month]
  GROUP BY o.[Order Year],
           o.[Order Month],
  o.Customers,
  o.orders
  ORDER BY o.[Order Year],
           o.[Order Month];

The following is the result of the query:

   Figure 1.8 – Sample results of using multiple views in a query

Figure 1.8 – Sample results of using multiple views in a query

So as you can see, it is easy to reduce the amount of data that you bring into your result, but sometimes, you will need to include other data that may have been filtered out. This is where views come in handy and allow you to include data that may have been filtered in previous activities. You will need to use caution when deciding between using a view over a subquery due to the performance implications. In this example, the data size is small enough that the performance implications are negligible.

As you can see from this section, there are multiple ways in which you will be able to create result sets and get the same results.

Previous PageNext Page
You have been reading a chapter from
SQL Query Design Patterns and Best Practices
Published in: Mar 2023Publisher: PacktISBN-13: 9781837633289
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (6)

author image
Steve Hughes

Steve Hughes is a senior director of data and analytics at 3Cloud. In this role, he focuses on growing the teams' skills and capabilities to deliver data projects with Azure. In his 20 years of experience in technology, much of the time was spent on creating business intelligence (BI) solutions and helping customers implement BI and cloud solutions. He is passionate about helping customers understand that data is valuable and profitable. Steve has recently been diagnosed with ALS but continues to work and share with others what he has learned. Steve is also the founder of Data on Wheels where he blogs with one of his daughters on topics such as data, analytics, and work enablement
Read more about Steve Hughes

author image
Dennis Neer

Dennis Neer is a senior architect of data and analytics at 3Cloud. In this role, he focuses on working with clients to design Azure solutions for their data and analytic needs to improve their business decision-making process. This includes using tools such as SQL Server databases, Synapse, data lakes, and Power BI. In his 30 years of experience in technology, much of the time was spent on designing and building database and visualization solutions. He is passionate about helping businesses to understand data and use it to their advantage during their decision-making process
Read more about Dennis Neer

author image
Dr. Ram Babu Singh

Dr. Ram Babu Singh is a Microsoft Certified Professional with a Ph.D. in computer science. He is a lead architect of data and analytics at 3Cloud, using complex data analytics and data science solutions. In his 20 years of career, over a decade was spent in leadership positions providing data science and data platform solutions to the world's top IT consulting companies. He has been published in international journals and has a patent in his name
Read more about Dr. Ram Babu Singh

author image
Shabbir H. Mala

Shabbir H. Mala is a director of data and analytics at 3Cloud. He has over 23 years of experience in thought leadership and consulting, developing complex data solutions, business systems, and processes using Azure Data Services and Power BI. He currently manages a team of over 40 principals and senior architects focusing on business growth, successful project delivery, and client experience. He has done talks at Power BI conferences as well as user groups. He has been awarded Microsoft FastTrack Solution Architect in 2021 and 2022. He is currently living in Chicago, married, and has three beautiful kids
Read more about Shabbir H. Mala

author image
Leslie Andrews

Andrews is a lead data architect of data and analytics at 3Cloud. Working in the IT sector for over 20 years in local government, electrical supply, law enforcement, and healthcare, she has broad experience to draw from to assist clients in making data more accessible for end users. Leslie's expertise includes SQL development, ETL, data warehousing, data modeling, and analytics. Leslie currently holds several Microsoft certifications related to Azure technologies for data engineering. She approaches each day as an opportunity to use data and technology to make it easier for others to do their daily work. Leslie has been doing public speaking since 2015, was an IDERA ACE in 2019, and is a supporter of and contributor to the SQL community
Read more about Leslie Andrews

author image
Chi Zhang

Chi Zhang is a lead data architect at 3Cloud. After completing her master's degree at Carnegie Mellon University, she worked in data consulting for the past 5 years. She has helped clients from various industries to deliver customized data solutions within Azure. Focusing on ETL, data warehousing, data modeling, and BI reporting, Chi solves data problems for clients and builds data solutions that provide business users with better operational insights and a smoother experience. Recently, she has continued to grow her footprint in the Azure community: earning certifications in Azure data engineering and data science, giving her first public tech talk, co-authoring a technical book, and becoming an active contributor to the SQL community
Read more about Chi Zhang