Search icon
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Azure Data and AI Architect Handbook

You're reading from  Azure Data and AI Architect Handbook

Product type Book
Published in Jul 2023
Publisher Packt
ISBN-13 9781803234861
Pages 284 pages
Edition 1st Edition
Languages
Authors (2):
Olivier Mertens Olivier Mertens
Profile icon Olivier Mertens
Breght Van Baelen Breght Van Baelen
Profile icon Breght Van Baelen
View More author details

Table of Contents (18) Chapters

Preface Part 1: Introduction to Azure Data Architect
Chapter 1: Introduction to Data Architectures Chapter 2: Preparing for Cloud Adoption Part 2: Data Engineering on Azure
Chapter 3: Ingesting Data into the Cloud Chapter 4: Transforming Data on Azure Chapter 5: Storing Data for Consumption Part 3: Data Warehousing and Analytics
Chapter 6: Data Warehousing Chapter 7: The Semantic Layer Chapter 8: Visualizing Data Using Power BI Chapter 9: Advanced Analytics Using AI Part 4: Data Security, Governance, and Compliance
Chapter 10: Enterprise-Level Data Governance and Compliance Chapter 11: Introduction to Data Security Index Other Books You May Enjoy

The Semantic Layer

Positioned after the data has been modeled in a data warehouse or lakehouse, the semantic layer acts as a performance-enhancing data querying layer for reporting, offering a more efficient and business-friendly view of the data. By employing aggregations, calculations, and business-oriented terms and concepts, the semantic layer simplifies the complexity of technical data structures and caters to the needs of business users. The end result is a well-defined data model that effectively translates the underlying data structures of the warehouse or lakehouse into business-oriented terms. The value of this is to allow reports to more quickly query and visualize data, as well as allow business people to understand the data structures and extract insights from them.

In this chapter, we’re going to cover the following main topics:

  • Multidimensional versus tabular models
  • The VertiPaq engine for tabular models
  • Modes in tabular models
  • Tools for...

Multidimensional versus tabular models

The semantic layer can be constructed using two different approaches: multi-dimensional models and tabular models. Multi-dimensional models, which represent an earlier technique, involve the creation of what are known as OLAP cubes. These cubes are designed to capture and express the multidimensional nature of the underlying data. To better understand this concept, one can envision an extension of a two-dimensional table in a spreadsheet, where data can be aggregated across not just two axes but multiple dimensions.

To illustrate the practical application of multi-dimensional models, let’s consider a scenario where a data warehouse contains detailed purchase records at the lowest level of granularity. The CEO of the organization wants to translate this data into a structure that allows them to report on revenue trends per country per product. In this case, a multidimensional cube would be created, aggregating the purchase data across...

The VertiPaq engine for tabular models

The VertiPaq engine is an in-memory database engine that is responsible for storing the data model in memory. This engine leverages advanced compression algorithms and memory management techniques to fit the data model in memory:

  • Columnar compression: VertiPaq stores data in a columnar format, where each column is compressed independently. This allows for efficient compression based on the characteristics of the data in each column. Columnar compression eliminates redundant values and exploits data patterns within a column, resulting in significant space savings.
  • Hash or dictionary encoding: VertiPaq uses dictionary encoding to compress repetitive values within a column. It creates a dictionary that maps unique values to numeric identifiers, then replaces the actual values with these compact identifiers. As a result, the storage required for repeated values is greatly reduced.
  • Run-length encoding: This technique is used to compress...

Modes in tabular models

By default, tabular models use an Import mode to load data into memory. An ETL tool such as Power Query extracts the data from data sources, transforms it, and loads it into memory. Afterward, DAX queries can be performed against the in-memory database to calculate and aggregate the data. When tabular models query data residing in memory, processing can be very fast, but the data also needs to be refreshed with the ETL tool every now and then to reflect the most recent changes.

DirectQuery mode is very different from Import mode. Queries are run against the underlying data sources instead of the in-memory database. This means the data is always up-to-date and no refreshes of the in-memory database need to be scheduled, but latencies are higher, meaning performance is worse. Another benefit of DirectQuery mode is that the data model can grow beyond the memory size limits as no copy of the data is kept in memory.

When performing a DAX query against a tabular...

Tools for the semantic layer

There are several different tools for implementing the semantic layer, as follows:

  • SQL Server Analysis Services (SSAS): SSAS is part of the SQL Server services. It is primarily deployed on-premises, meaning it is hosted within an organization’s local infrastructure. SSAS offers a robust and scalable platform for building multidimensional and tabular models. Initially, only multidimensional models were supported, but due to popular demand, tabular model support was added later. Multidimensional models are designed for complex analytical scenarios and support features such as hierarchies, calculated members, and advanced aggregation capabilities. Tabular models, on the other hand, provide a simpler and more intuitive approach to data modeling, allowing users to create analytical models using tables and relationships. SSAS also includes advanced features such as data mining and data visualization.

The following architecture diagram (Figure...

Summary

In this chapter, we discussed how the semantic layer is used to create a more performant layer for reporting. We learned the semantic layer can be implemented using either multidimensional or tabular models. Multidimensional models are based on the concept of cubes and provide analytical capabilities by aggregating the data. Tabular models, on the other hand, utilize a columnar in-memory technology known as the VertiPaq engine, which enables faster processing and compression. Tabular models are well suited for scenarios where fast query performance and self-service analytics are paramount.

The VertiPaq engine was explained as a key component of tabular models and powers their impressive performance capabilities. By leveraging in-memory storage and columnar data structures, the VertiPaq engine optimizes data compression and enables efficient query execution. This engine plays a significant role in the success of tabular models, allowing for interactive and near-real-time...

lock icon The rest of the chapter is locked
You have been reading a chapter from
Azure Data and AI Architect Handbook
Published in: Jul 2023 Publisher: Packt ISBN-13: 9781803234861
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime}