The Definitive Guide to Power Query (M): Mastering complex data transformation with Power Query

Name: The Definitive Guide to Power Query (M)
Brand: Packt
SKU: 9781835089729
Rating: 5 (1 reviews)

By Gregory Deckler , Rick de Groot , Melissa de Korte

~~$43.99~~ $29.99

Book Mar 2024 758 pages 1st Edition

What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Buy Now

Product Details

Publication date : Mar 29, 2024

Length 758 pages

Edition : 1st Edition

Language : English

ISBN-13 : 9781835089729

Vendor :

Microsoft

Category :

Data

Languages :

Power Query (Beginner)

Concepts :

Data Analysis

Tools :

Power Query (Beginner)

View table of contents

Preview Book

Introducing M

M is a powerful and versatile formula language specifically designed for data manipulation and transformation. The term M is an informal designation. M’s official name is the Power Query Formula Language. For an explanation of this designation, see the History of M section later in this chapter. M is the language at the heart of Power Query, which is used in numerous applications like Microsoft Excel, Power BI, Power Platform, and Microsoft Fabric for data transformation and preparation.

The popularity of the M language has continuously grown over the last decade, and the language has been integrated into an impressive array of Microsoft tools and platforms. Today, M and Power Query are indispensable tools for modern data professionals such as business analysts, data scientists, and data enthusiasts.

This chapter is the beginning of your exciting journey that culminates in the mastery of the M language. We start with a brief history of M and then cover the basics of who, where, why, and how. Next, we introduce the absolute basics of the M language and finish with the formal and informal characteristics of M (in effect, what is M?). Overall, this chapter provides a firm foundation for the more in-depth exploration of the M language found throughout the rest of this book. Specifically, this chapter covers the following topics:

The history of M
Who should learn M?
Where and how Is M used?
Why learn M?
M language basics
The characteristics of M

The history of M

The process of extracting, transforming, and loading data is a challenge as old as information technology itself. Both business users and IT professionals have historically struggled with the challenge, and numerous software tools have been developed over the years to help deal with the challenge such as SQL Server Integration Services (SSIS) and Alteryx.

However, many of these tools were complex and not easily portable. The M language and Power Query were created to help solve these issues.

While there might be some more speculative history regarding the origins of M, we can at least definitively trace M back to a project originally code-named Data Explorer. Data Explorer was an Azure SQL Labs project circa 2011 that aimed to simplify the process of accessing, cleaning, and preparing data from various sources. The query language was thought of as a mashup language (hence the M for mashup).

In 2013, Microsoft released Power Query as an add-in for Excel. Power Query introduced a user-friendly interface, allowing business users to perform data transformations via a visual editor. Behind the scenes, Power Query utilized the M language as the underlying formula language to drive the data transformations, and as such, these data transformations became repeatable. Instead of, for example, business users continually performing the same manual data transformations on source data received as comma-delimited files, that process could now be effectively automated.

Following the success of Power Query in Excel, Microsoft included Power Query as part of its new product, Power BI Designer, which eventually became Power BI Desktop. As Power Query gained popularity, there was a need to standardize the underlying formula language. In 2016, Microsoft submitted the Power Query Formula Language specification to the European Computer Manufacturers Association (ECMA), an international standards organization. This effort established a formal specification for the language, ensuring compatibility and interoperability between different implementations.

While the language was formally referred to as the Power Query Formula Language, it became commonly known as M among the user community. The informal name M gained widespread acceptance and is now widely used to refer to the language.

Microsoft continues to enhance and refine the M language as part of its ongoing investment in data integration and transformation technologies. New functions, features, and improvements are periodically introduced to provide users with more powerful and efficient ways to manipulate and prepare their data. In addition, Microsoft continues to introduce M within additional software tools and platforms, such as data integration within Microsoft Power Platform and dataflows within Power BI and Fabric.

Today, the M language is a key component of Microsoft’s data transformation and integration toolset. The proliferation of M, as well as its versatility and extensibility, make it an invaluable language for today’s modern data professionals.

Let’s now turn our attention to who should learn M.

Who should learn M?

M is a powerful tool for data professionals and individuals who work with data on a regular basis. The versatility and capabilities of M make it a valuable language to learn for various roles, including the following:

Data analysts: Data analysts who deal with data extraction, transformation, and preparation tasks can greatly benefit from learning M. It provides a comprehensive set of functions and operators that enable data analysts to efficiently shape and manipulate data from diverse sources. By mastering M, data analysts can automate repetitive tasks, handle complex data transformations, and ensure data quality, leading to more accurate and reliable data analysis.
Business intelligence professionals: Professionals in the business intelligence (BI) space can greatly enhance their skills by learning M. It is a core component of Power BI. By understanding M, BI professionals gain the ability to connect to various data sources, perform complex data transformations, and create reusable data preparation workflows, enabling them to provide actionable insights and drive informed decision-making.
Data engineers: Data engineers involved in the design and implementation of data pipelines and data integration processes can really benefit from learning M. It allows data engineers to efficiently extract, transform, and load (ETL) data from different sources into data warehouses or data lakes, specifically within Power BI and Microsoft Fabric. M also provides the flexibility and power to handle complex data formats, define custom transformations, and create efficient data processing workflows. By mastering M, data engineers can streamline data integration processes and ensure data consistency and quality.
Data scientists: Data scientists who perform exploratory data analysis, model development, and advanced analytics can leverage the capabilities of M to efficiently prepare their data. M provides a robust set of functions for cleaning, shaping, and aggregating data, allowing data scientists to focus on the analytical aspects of their work. By incorporating M into their data preparation workflows, data scientists can streamline the pipeline of turning raw data into insights, spending less time on data cleansing and preparation and more time on data modeling and analysis.
Power users: Power users in Excel and Microsoft’s Power Platform who work extensively with data and perform complex data manipulations can benefit from learning M. It is integrated into Excel through Power Query, empowering users to perform advanced data transformations within the familiar Excel interface. In addition, M is integrated into the Microsoft Power Platform via data integration, allowing data to be transformed and mapped between systems. By mastering M, Power users can expand their data manipulation capabilities, automate repetitive tasks, and enhance the accuracy and reliability of their analyses.
Individuals in data-driven roles: Beyond the specific roles mentioned above, individuals in various data-driven roles, such as project managers, consultants, researchers, and domain experts, can benefit from learning M. Mastery of M provides the ability for individuals in these roles to independently handle data-related tasks, extract meaningful insights, and make informed decisions based on reliable data.

M is a valuable language for a wide range of data professionals and other individuals who work with data. Whether you are a data analyst, BI professional, data engineer, data scientist, power user, or someone in a data-driven role, learning M empowers you to efficiently ingest, transform, and prepare data for analysis.

Now that we understand what types of individuals would want to learn M, let’s next explore where those individuals can leverage their mastery of M.

Where and how is M used?

M is a versatile language included in various tools and platforms where data transformation and manipulation are essential. Its integration within the Power Query ecosystem enables users to leverage M’s capabilities in different environments. In this section, we explore some key areas where M is widely used.

Experiences

Before discussing specific products where M is used, it is important to understand the different experiences available for authoring M. There are two experiences available for authoring M, one intended for on-premises use and the other for cloud-based applications. These experiences are the following:

Power Query Desktop: Power Query Desktop is the experience for Power Query found in desktop applications such as Power BI Desktop and Microsoft Excel. While the experiences are similar, there are differences. For example, the artificial intelligence (AI) and machine learning (ML) integrations as well as the integrations with R and Python present in Power BI Desktop are not present in Microsoft Excel. Conversely, the Structured Column options available in Excel are not available in Power BI Desktop.
Power Query Online: Power Query Online, a cloud-based service, allows users to create and manage data transformations within a web browser. M is used extensively in Power Query Online to define data transformations, connect to data sources, and perform complex data manipulations. Users can access and edit M queries directly within the browser interface, making it convenient to collaborate and work on data transformation tasks from anywhere with an internet connection. Power Query Online is integrated into a variety of Microsoft products, including the Power BI service, Power Apps, Power Automate, etc.

It is important to note that while two different experiences for authoring M exist, both provide nearly the exact same user experience. Even better, both provide the ability to edit the underlying M code, which is the primary focus of this book. Thus, the skills learned here apply equally to either experience used within any product or service.

Products and services

M is ubiquitous within the Microsoft ecosystem, including the following software and services:

Dataflows: Dataflows are product-agnostic, cloud-based M queries that can be reused across multiple different products. Dataflows enable users to build and manage reusable data preparation and transformation processes. Dataflows leverage the Power Query Online experience.
Power BI Desktop: M is a fundamental component of Power BI Desktop, a leading BI tool. M allows users to connect to different data sources, perform data transformations, and create interactive visualizations and reports.

M enables users to extract, clean, and shape data from diverse sources, such as databases, Excel files, web services, and more. With M, users can define data transformation steps and create reusable queries that refresh and update data automatically when the underlying source changes.

Within Power BI Desktop, M is used within the Power Query editor, a sub-program launched from within Power BI Desktop. The Power Query editor provides a powerful graphical user interface (GUI) for working with the M formula language, as shown in the following screenshot:

A screenshot of a computer

Description automatically generated

Figure 1.1: Power Query editor in Power BI Desktop

The Power Query editor is covered in greater detail in Chapter 2, Working With Power Query/M.

Power BI Desktop also supports the use of dataflows.

Power BI/Fabric service: The Power BI/Fabric service (powerbi.com) is the cloud-based component of Power BI that enables you to share reports, dashboards, and other content. The service supports the use of M code via the creation of dataflows, using the Power Query Online experience.
To create a dataflow in the Power BI service, navigate to any workspace other than My Workspace and choose New and then Dataflow, as shown in Figure 1.2:

Figure 1.2: Create a dataflow in the Power BI service

Power BI Report Server: Power BI Report Server (PBRS) supports the Power Query Desktop experience, allowing users to create rich data transformations via M.
Excel (Windows and Macintosh): M is seamlessly integrated into Excel, empowering users to perform advanced data transformations within the familiar Excel interface. Power Query, the engine behind Excel’s data transformation capabilities, is powered by M. Users can access the Power Query editor in Excel to apply M transformations, filter and sort data, remove duplicates, merge and append tables, and perform other data preparation tasks. M allows users to clean, reshape, and enrich data in Excel, enhancing the accuracy and reliability of their analyses.

In Excel, the Power Query editor interface can be accessed by using the Data tab of the ribbon and choosing Get Data:

Figure 1.3: Get Data in Microsoft Excel

Once the data source is chosen, the Power Query Editor interface can be accessed by choosing the Transform Data button:

Figure 1.4: Transform Data option in Microsoft Excel

Both the Windows and Macintosh versions of Excel also support accessing and using dataflows.

Power Apps: Power Apps is Microsoft’s low-code platform for creating applications. Both the Power Query Online experience as well as the use of dataflows are supported. A common use case is to leverage M either via the Power Query Online experience or dataflows, allowing users to seamlessly bring their data into the Dataverse (formally Common Data Service).
Power Automate: Power Automate is Microsoft’s low code platform for automating workflows. Power Automate allows users to automate repetitive workflows and processes that may involve data manipulation and integration tasks. M can be employed within Power Automate to perform data transformations and handle complex data scenarios as part of the automated workflows, via the Power Query Online experience. By incorporating M into Power Automate, users can build sophisticated data integration and automation solutions that streamline their business processes. In addition, dataflows can be leveraged in Power Automate via Power Query Dataflows connector. This allows actions to occur once a dataflow completes and also provides the ability for a dataflow to be initiated as an action within a Power Automate flow.
Data Factory: Data Factory is a managed cloud service specifically built for complex extract-transform-load (ETL) and extract-load-transform (ELT) integration projects. Data Factory allows the creation and orchestration of data-driven workflows, data movement, and transformation at scale. Both Azure Data Factory and Data Factory in Microsoft Fabric support M code, via both the Power Query Online experience as well as dataflows.
SQL Server: SSIS supports the core M engine while SQL Server Analysis Services (SSAS) supports the Power Query Desktop experience.
Dynamics 365 Customer Insights: Customer Insights within Dynamics 365 is Microsoft’s customer data platform (CDP) that provides a holistic view of customers, enabling personalized customer experiences. Customer Insights supports both dataflows as well as the Power Query Online experience.
Visual Studio: Visual Studio allows M to be integrated as a language. This is done via the Power Query Language Service for Visual Studio Code and is available in the Visual Studio Code Marketplace. This language service provides fuzzy autocomplete, hover, function hints, and other functionality for writing M code within Visual Studio.
There is also the Visual Studio Power Query Software Development Kit (SDK). This SDK consists of a set of tools designed to help create custom Power Query data source connectors. The Visual Studio Power Query SDK is covered in greater detail in Chapter 16, Enabling Extensions.

Other data integration scenarios: M is not limited to the aforementioned software and services. M can also be leveraged in custom applications and programming environments that utilize Power Query libraries.

As you can see, M is widely used in different tools and platforms within the Microsoft ecosystem, such as Power BI Desktop, Excel, the Power BI and Fabric service, Power Platform, SQL Server, and Dynamics. M enables users to connect to various data sources, perform advanced data transformations, and automate data integration workflows. The skills learned in this book deal with the M language itself and, thus, transcend both the experience as well as the specific product or service. Thus, by mastering M, users gain the ability to create reusable data transformation processes and enhance their data manipulation capabilities across a wide range of data-related scenarios, as well as across any experience, product, or service that uses M as its underlying data transformation layer.

Let’s now turn our attention to why data professionals and other individuals might want to add M to their repertoire of language.

Why learn M?

In today’s data-driven world, efficiently and effectively transforming and analyzing data is a valuable skill. Power Query, a powerful data transformation and preparation tool, gained immense popularity due to its seamless integration with many popular software systems as well as its ease of use. At the heart of Power Query lies M, the Power Query Formula Language. But you may be asking yourself, why should you invest the time in order to learn M?

Here are seven reasons why we believe data professionals and other individuals should learn M:

Tapping into the full power of Power Query: In Gil Raviv’s book, Collect, Combine, and Transform Data Using Power Query in Excel and Power BI, Mr. Raviv estimates that the GUI for authoring M queries (see Where and how Is M used? in this chapter) allows you to solve only 40% of challenges related to data transformation, but mastery of M allows you to bring that figure closer to 99.99%. Later chapters demonstrate specific examples of solving data transformation challenges that cannot be done in the GUI. Since M serves as the backbone of Power Query’s data transformation capabilities, by mastering M, you gain full control over the data transformation process, allowing you to extract, clean, transform, and reshape data from diverse sources.
Automation of repetitive tasks: One of the primary reasons you should learn M is to automate repetitive data transformation tasks. Business and IT professionals are often tasked with receiving data on a recurring basis and then generating reports based on this data. Instead of manually transforming this data each time (often in Excel) to prepare it for reporting purposes, leveraging M for this data transformation allows the data transformation logic to be implemented once and then automatically run each time new data is received.
Flexibility and customization: While Power Query provides a user-friendly interface for data transformation tasks, it does have its limitations. By learning M, you can extend the capabilities of Power Query and overcome these limitations. M allows you to write custom functions, perform advanced transformations, and apply complex logic that goes beyond the built-in capabilities of the Power Query interface. This flexibility empowers you to tailor your data transformations precisely to meet the unique requirements of your data sources and analysis.
Efficiency and performance optimization: M is a highly efficient and optimized language for data transformations. The Power Query engine intelligently processes M expressions, optimizing performance by reducing unnecessary data loads and transformations. When working with large datasets or complex transformations, knowing M enables you to write efficient code that significantly speeds up your data processing, as demonstrated in Chapter 15, Optimizing Performance. By understanding the underlying principles of M and its performance considerations, you can optimize your data workflows and save valuable time. Finally, leveraging M can greatly reduce and simplify the formula and Data Analysis Expressions (DAX) code in downstream applications like Excel and Power BI Desktop.
Advanced data cleaning and transformation: M provides a comprehensive set of data cleaning and transformation functions that go far beyond the basic operations available in traditional spreadsheet applications. With M, you can easily handle data quality issues, such as removing duplicates, handling missing values, splitting columns, merging sets of data, and performing advanced calculations. Learning M enables you to tackle complex data cleaning and transformation tasks efficiently, leading to accurate and reliable data analysis.
Integration with other programming languages: M is not only a standalone language but also integrates well with other programming languages such as SQL, R, and Python. This integration allows you to leverage the capabilities of these languages within your Power Query workflows. You can combine M code with native SQL queries, call R or Python scripts, and seamlessly incorporate external libraries and functions into your data transformation process. By expanding your knowledge to include M, you unlock the potential to leverage the best features of various programming languages for data manipulation.
Career advancement: Proficiency in M and Power Query has become a sought-after skill in the data industry. As organizations increasingly rely on data for decision-making, individuals who possess the ability to efficiently transform, clean, and analyze data are in high demand. By investing time and effort in learning M, you position yourself as a valuable asset to organizations that rely on data-driven insights. The knowledge of M can open up new career opportunities, enhance your job prospects, and enable you to take on challenging data-related projects.

In summary, learning M enables you to efficiently ingest, transform, and analyze data from diverse sources. It provides flexibility, customization, and performance optimization capabilities that extend the functionality of Power Query itself. By mastering M, you gain a competitive edge in the data industry and open doors to new career possibilities.

We hope that you are now excited about learning M! Let’s turn our attention to the basics of the M language.

M language basics

As previously noted, M is a powerful language designed for data ingest and transformation within a variety of Microsoft software and services. Understanding the basics of the M language is essential for effectively leveraging its capabilities.

Here are some important fundamentals regarding the M language:

Expressions and functions: In M, expressions form the building blocks of data transformations. An expression represents a computation or operation that evaluates to a value. M provides a wide range of built-in functions that can be used to perform operations on data. Functions in M are called using a syntax where the function name is followed by arguments within parentheses. For example, the function Text.Start("Hello, World!", 5) returns the substring Hello from the input text. More about expressions and functions are covered in Chapter 4, Understanding Values and Expressions, as well as Chapter 9, Parameters and Custom Functions.
Data types: M supports various data types, including text, numbers, dates, times, lists, tables, and records. Understanding the data types in M is crucial for performing accurate transformations. M provides functions to convert between different data types and manipulate data, based on their inherent characteristics. For example, the Text.From function converts a value to text, while the Date.Year function extracts the year component from a date or datetime value. Data types are covered in Chapter 5, Understanding Data Types.
Variables and constants: M allows you to define variables and constants to store and reuse values during data transformations. Variables are created within a let expression, followed by a comma-separated list of variable assignments. Constants, on the other hand, are fixed values that remain constant throughout the execution. Variables and constants help improve code readability, enable reuse, and make complex transformations more manageable. More about variables and constants can be found throughout this book.
Operators: M supports a variety of operators to perform mathematical calculations, logical comparisons, and text manipulations. Arithmetic operators (+, -, *, /, and so on) are used for numeric calculations, while comparison operators (>, <, =, and so on) evaluate logical conditions. The combination operator ‘&' is used for concatenating text values, appending lists and tables, or merging records. Operators are covered in Chapter 4, Understanding Values and Expressions.
Step-by-step transformation process: M follows a step-by-step transformation process where each step defines a data transformation operation. The Power Query editor provides a visual interface to define these steps and generates the corresponding M code. Steps can include operations such as filtering rows, removing duplicates, splitting columns, merging tables, and aggregating data. Chapter 2, Working with Power Query/M, covers this topic in more detail.
Query folding: Query folding is an optimization technique in Power Query that pushes data transformations to the data source whenever possible. When using M, it is important to be aware of query folding to ensure efficient data processing. Query folding can improve performance by reducing data transfer between the data source and Power Query. However, not all transformations can be folded, so it is essential to understand which operations can be folded and which cannot. For example, when using Direct Query or Dual storage mode for tables, all M queries must fold, which can limit certain transformation operations. Query folding is discussed in Chapter 7, Conceptualizing M, and in Chapter 15, Optimizing Performance.
Error handling and debugging: M provides error handling mechanisms to catch and handle exceptions during data transformations. By using functions like try, otherwise, and error, you can control the flow of execution and handle potential errors gracefully. Additionally, M supports debugging capabilities, such as the ability to step through the code to identify and resolve issues in complex transformations. Error handling and debugging are covered in Chapter 12, Handling Errors and Debugging.
Case sensitivity: M is case-sensitive. This applies to all functions, expressions, variables, constants, and other aspects of the M language.
Commenting: Comments in M follow the C language commenting style. Inline comments are proceeded by double slashes ( // ) while block comments use the slash-asterisk/asterisk-slash pattern ( /* and */ ).

Now that we have a good understanding of the core components of the M language, let’s next explore the most fundamental component of the M language, the let expression.

The let expression

At the core of the M language is the let expression, which must be paired with an in expression. In simple terms, the let expression contains the input and transformations, while the in expression contains the output. A simple Hello World for M looks like the following:

let
    Hello = "Hello World"
in
    Hello

This code would return the ubiquitous Hello World text.

It is important to note that every expression within a let statement must be followed by a comma ( , ) except the last expression prior to the in expression. Thus, if the let expression consists of multiple sub-expressions, then the code might look like the following:

let
    Hello = "Hello",
    World = "World",
    Return = Hello & " " & World
in
    Return

This code also returns Hello World as output.

Understanding the basics of M, including expressions, functions, data types, variables, operators, and the step-by-step transformation process, is vital for effectively manipulating and preparing data. By mastering these foundational concepts, you gain the ability to perform complex transformations, optimize data workflows, and unlock the full potential of the M language. The rest of this book is devoted to helping you master all of these foundational concepts and how to apply them to complex data transformations.

The characteristics of M

M is a programming language that serves as the backbone of Power Query, enabling users to extract, clean, and reshape data from various sources, such as databases, spreadsheets, web pages, and more. However, unlike general-purpose programming languages such as C, C#, Java, and Python, which are designed for a wide variety of applications, M is a domain-specific language, specifically designed for data ingest and manipulation. As such, M provides a rich set of functions, operators, and expressions that allow you to perform complex data transformations, calculations, and aggregations. Let’s understand this better by taking a look at the characteristics of M from both formal and informal perspectives.

Formal classification

Programming languages are classified according to a number of properties, such as pure/impure, lower-order/higher order, statically/dynamically typed, strongly/weakly typed, eager/lazy evaluation, and imperative/functional.

Microsoft has described M as:

Mostly pure: A programming language is said to be pure if it provides referential integrity. In other words, any expression can be replaced with that expression’s value without changing the program’s behavior or meaning.
An impure programming language allows side effects, which are actions that cause changes outside the scope of a function’s return value. In the case of M, it is commonly used for data transformation and retrieval tasks, which often involve interacting with external data sources, performing operations on data, and producing output. These actions constitute side effects because they affect the state of the data source or produce output beyond the function’s return value.

While M provides functional programming constructs and supports immutability, allowing for the creation of pure functions, the language is not purely functional due to its impure nature. It embraces a combination of functional and imperative programming paradigms to facilitate efficient and practical data manipulation and retrieval.

Higher-order: For lower-order languages, such as machine code or assembly language, each programming statement corresponds to a single instruction for the computer, while in higher-order languages, each statement corresponds to multiple instructions for the computer.
Higher-order languages typically allow such things as functions, objects, and modules to be used as values within a program. Higher-order languages often treat functions as first-class citizens. Specifically, this means that functions can be assigned to variables, passed as arguments to other functions, and returned as values from functions.

Power Query M supports higher-order programming by allowing you to define and manipulate functions as values. You can assign functions to variables, pass functions as arguments to other functions, and return functions as results. This enables you to create more modular and flexible code by abstracting and reusing function logic.

With higher-order programming capabilities, M allows you to apply transformations and computations dynamically based on input parameters, control flow, and data characteristics. You can write functions that operate on other functions, enabling powerful data manipulation and transformation scenarios.

For example, you can use higher-order functions in M to dynamically apply a series of transformations to a dataset, based on user-defined criteria, or to create reusable function pipelines for data processing.

By providing higher-order programming features, M empowers developers to write expressive and modular code, making it easier to work with complex data transformations and customize the behavior of functions to suit specific requirements.

Dynamically typed: Dynamically typed languages perform type checking at runtime instead of at compile time, as is the case with statically typed languages. Type checking is simply the process of ensuring that things such as parameters passed to a function are of the correct type, such as text, a number, or a date.
Weakly typed: While closely related to the property of statically/dynamically typed, strongly typed and weakly typed refer to something quite different. Strongly typed languages are extremely sensitive to type compatibility and require explicit type definitions for variables before being used. Conversely, weakly typed languages like M do not require explicit type definitions, and some even perform automatic type conversion.
M is not as weakly typed as programming languages such as Python, since variables are immutable once calculated. Thus, the weak type definition for M generally refers to the ability to use variables whose data types have not been explicitly specified.

Consider the following Python code:
```
a = 42
a = "Hello World"
print(a)
```
This code would not generate an error in Python, even though two different data types are assigned to the variable a. However, similar code is not possible in M, since variables, once calculated, are immutable (i.e., cannot be changed).

The flexibility in data typing allows Power Query M to handle a wide range of data sources and perform various data transformations effectively. It simplifies the process of working with heterogeneous datasets that may contain different data types and structures.

It is important to note that even though Power Query M is weakly typed, it still performs type checking during execution (runtime) to ensure the consistency of operations. If a particular operation is not compatible with the inferred type of a value, an error may occur at runtime.

Overall, the weakly typed nature of Power Query M strikes a balance between flexibility and data integrity, providing users with a versatile language for data transformation tasks.

Partially lazy: In general, M follows an eager evaluation strategy, meaning that when you define transformations or computations, those transformations and computations are performed immediately as you apply them to the data. This eager evaluation approach ensures that data transformations occur promptly and that the results are readily available for further processing or analysis. Power Query M is designed to efficiently handle data manipulation and retrieval tasks, focusing on immediate evaluation to provide real-time feedback on transformations.
The term partially lazy refers to a specific feature within M called lazy evaluation, which is distinct from the overall evaluation strategy of the language. In M, lazy evaluation is applied to expressions within some specific constructs, specifically List, Record, and Table expressions as well as the let expression. These constructs allow you to define expressions that are evaluated only when needed, providing a form of on-demand or lazy evaluation within those contexts.

M also allows you to define optional arguments for functions. These optional arguments are evaluated lazily, meaning they are not computed unless explicitly used within the function body. Lazy evaluation of optional arguments helps optimize performance by avoiding unnecessary computations for optional values that are not actually used within the function. It ensures that the computations for optional arguments are deferred until their values are required within the function’s execution.

M also supports conditional branching constructs like if-then-else statements. Only the branch that matches a condition is evaluated, while the other branch is not computed, resulting in lazy evaluation. This is in contrast to an eager evaluation strategy where both branches are evaluated regardless of the condition’s outcome. This form of lazy evaluation within the if-then-else construct allows for efficient computation by avoiding unnecessary evaluations of expressions in the non-matching branch.

It is important to note that while M has these partially lazy features, the overall evaluation strategy of the language remains predominantly eager. Most expressions in Power Query M are eagerly evaluated, ensuring that data transformations occur promptly, and results are immediately available for further processing. Thus, M is primarily an eager programming language, but it incorporates partial lazy evaluation in specific constructs, such as for the List, Record, Table, and let expressions, as well as optional function arguments and conditional branching. These partially lazy evaluations offer flexibility and optimize performance within those contexts.

Functional: M incorporates many functional programming concepts and features, making it a functional programming language. These functional concepts and features include immutability, higher-order functions, function composition, pure functions, and recursion.
M encourages immutability, meaning that data values are not modified in place but transformed into new values. This promotes the functional programming principle of avoiding side effects.

M supports higher-order functions, allowing functions to be treated as first-class values. You can pass functions as arguments to other functions, return functions from functions, and store functions in variables.

M facilitates function composition, enabling you to combine multiple functions to create more complex transformations. This composability is a characteristic of functional programming.

M promotes the use of pure functions, which have no side effects and produce the same output for the same input. Pure functions make code more predictable and easier to troubleshoot.

M supports recursion. While recursion is not as extensively supported as in some other functional languages, M does offer limited support for recursive functions, allowing developers to solve problems through recursive techniques.

In terms of comparison with other languages, M is perhaps most similar to F#, a programming language developed and implemented by Don Syme of Microsoft Research, Cambridge, UK.

Now that we have covered the formal classification of the M language, let’s next take a look at some more informal characteristics of M.

Informal characteristics of M

More informally, here are some key characteristics and features of M:

Functional language: M is a functional language, meaning it is based on the concept of functions as the primary building blocks for data transformations. Functions in M can be combined, nested, and composed to perform intricate data manipulations. M provides over 700 built-in functions for common operations, as well as the ability to create custom functions tailored to your specific needs. Many of these functions, as well as custom functions, are covered in later chapters.
The extensive collection of built-in functions in M provides users with powerful tools to handle diverse data transformation scenarios. These functions are designed to simplify common data manipulation tasks and enable users to efficiently transform and shape their data.

Expressive and readable syntax: The syntax of M is designed to be intuitive and easy to read, making it accessible to both beginners and experienced programmers. M expressions are written in a clear and concise manner, facilitating the creation of complex data transformations without sacrificing readability. The syntax follows a step-by-step approach, allowing you to define a series of sequential transformations to be applied to your data. Later chapters demonstrate the expressive and readable syntax of M with specific examples.
Data types and values: M supports various data types, including primitive data types such as text, numbers, dates, and duration, as well as structured data types such as lists, tables, and records. It provides powerful functions to work with these data types, helping you to manipulate and transform data at a granular level. M also allows you to define and work with variables, constants, and parameters to store and reuse intermediate results during the data transformation process. Data types and values are covered in depth in Chapter 4, Understanding Values and Expressions, and Chapter 5, Understanding Data Types.
Integration with Power Query Editor: M seamlessly integrates with the Power Query Editor, providing a user-friendly interface for interacting with and developing M code. The Power Query Editor allows you to visually build data transformation steps, preview the results, and generate M code automatically. It provides a robust development environment where you can write, debug, and refine your M expressions. See Chapter 2, Working with Power Query/M, for more on this subject.
Extensibility and customization: One of the standout features of M is its extensibility. While M offers a wide range of built-in transformations, M also allows you to go beyond these capabilities and create custom transformations to suit your very own specific needs. You can define your own functions, write reusable code snippets, and create advanced data manipulation logic using M. This level of customization empowers you to handle complex data scenarios that are not covered by standard transformations accessible through the user interface. Chapter 16, Enabling Extensions, demonstrates M’s flexibility and extensibility.
Performance optimization: M is optimized for performance, enabling efficient data processing, even with large datasets. The Power Query engine intelligently evaluates and optimizes M expressions to minimize data loads and transformations, resulting in faster and more efficient data processing.

One specific performance optimization technique is called streaming semantically and is a property of List and Table expressions. Streaming semantically involves the repeated enumeration of table rows or list items. Instead of iterating through the table or list for each data transformation, streaming semantically each row of the table or item in the list is evaluated for all of the data transformations, and the results are collected as part of the output for the expression. Streaming semantically enable the transformation of datasets that do not fit within available memory.

Another performance optimization technique is called query folding. However, query folding is not a property of the M language itself. Instead, query folding is used within Power Query to push or fold data transformations back to source data systems. In essence, the transformation expressions within M are translated to equivalent transformation statements available within the source systems, such as SQL Server. This pushes the processing of transformations back to the source systems instead of the client system executing the M query. This can improve performance and efficiency by minimizing the data transfer and reducing the amount of data processed by Power Query.

By understanding the underlying principles of M and its performance considerations, you can write optimized code and improve the overall performance of your data workflows. Chapter 15, Optimizing Performance, provides more information on performance optimization, with specific examples.

In summary, M is a versatile and expressive language specifically designed for data transformation and manipulation within Power Query. Its functional nature, extensive set of functions, and integration with the Power Query editor make it a powerful tool for extracting, cleaning, and reshaping data from diverse sources.

Summary

The M language is the backbone of Microsoft Power Query’s data transformation capabilities. M is widely used within the larger Microsoft ecosystem, including a variety of software applications and web-based services. Many different types of data professionals can benefit from learning M to efficiently and effectively ingest and transform data across a wide range of scenarios. By mastering M, you gain the ability to efficiently handle complex data transformations, customize your data workflows, and optimize performance, thereby tapping into the full potential that Power Query has to offer you for your data manipulation needs.

In this chapter, we have covered the essential questions of who, what, where, why, and how M is used, as well as provided a brief introduction to the fundamental aspects of the M language. In the next chapter, we introduce you to the primary interface in which you will be writing and working with M.

Learn more on Discord

Join our community’s Discord space for discussions with the author and other readers:

https://discord.gg/vCSG5GBbyS

Download Code

Key benefits

Get comprehensive coverage of fundamental and advanced Power Query concepts
Gain hands-on experience with practical examples based on real-world problems
Delve into the intricacies of Power Query M language for enhanced data manipulation

Description

Data transformation is a critical step in building data models and business intelligence reports. Power Query is an invaluable tool for anyone who wants to master data transformation, and this book will equip you with the knowledge and skills to make the most of it. The Definitive Guide to Power Query (M) will help you build a solid foundation in the Power Query M language. As you progress through the chapters, you'll learn how to use that knowledge to implement advanced concepts and data transformations. This will set the stage for an uncompromisingly thorough exploration of the Power Query M Language. You'll also get to grips with optimizing performance, handling errors, and implementing efficient data processing techniques. As this is a hands-on guide, the practical examples in the chapters will help you gain the skills to apply Power Query to real-world problems and improve your data analysis capabilities. By the end of this book, you will be able to leverage all of Power Query's remarkable capabilities for data transformation.

What you will learn

Gain a strong understanding of Power Query fundamentals Master various functions within Power Query to perform complex data operations Acquire knowledge about values, types, and control structures in Power Query Develop proficiency in error handling techniques Learn performance optimization strategies for Power Query Apply what you’ve learned to real-world scenarios, including common troublesome patterns