M is a powerful and versatile formula language specifically designed for data manipulation and transformation. The term M is an informal designation. M’s official name is the Power Query Formula Language. For an explanation of this designation, see the History of M section later in this chapter. M is the language at the heart of Power Query, which is used in numerous applications like Microsoft Excel, Power BI, Power Platform, and Microsoft Fabric for data transformation and preparation.
The popularity of the M language has continuously grown over the last decade, and the language has been integrated into an impressive array of Microsoft tools and platforms. Today, M and Power Query are indispensable tools for modern data professionals such as business analysts, data scientists, and data enthusiasts.
This chapter is the beginning of your exciting journey that culminates in the mastery of the M language. We start with a brief history of M and then cover the basics of who, where, why, and how. Next, we introduce the absolute basics of the M language and finish with the formal and informal characteristics of M (in effect, what is M?). Overall, this chapter provides a firm foundation for the more in-depth exploration of the M language found throughout the rest of this book. Specifically, this chapter covers the following topics:
The process of extracting, transforming, and loading data is a challenge as old as information technology itself. Both business users and IT professionals have historically struggled with the challenge, and numerous software tools have been developed over the years to help deal with the challenge such as SQL Server Integration Services (SSIS) and Alteryx.
However, many of these tools were complex and not easily portable. The M language and Power Query were created to help solve these issues.
While there might be some more speculative history regarding the origins of M, we can at least definitively trace M back to a project originally code-named Data Explorer. Data Explorer was an Azure SQL Labs project circa 2011 that aimed to simplify the process of accessing, cleaning, and preparing data from various sources. The query language was thought of as a mashup language (hence the M for mashup).
In 2013, Microsoft released Power Query as an add-in for Excel. Power Query introduced a user-friendly interface, allowing business users to perform data transformations via a visual editor. Behind the scenes, Power Query utilized the M language as the underlying formula language to drive the data transformations, and as such, these data transformations became repeatable. Instead of, for example, business users continually performing the same manual data transformations on source data received as comma-delimited files, that process could now be effectively automated.
Following the success of Power Query in Excel, Microsoft included Power Query as part of its new product, Power BI Designer, which eventually became Power BI Desktop. As Power Query gained popularity, there was a need to standardize the underlying formula language. In 2016, Microsoft submitted the Power Query Formula Language specification to the European Computer Manufacturers Association (ECMA), an international standards organization. This effort established a formal specification for the language, ensuring compatibility and interoperability between different implementations.
While the language was formally referred to as the Power Query Formula Language, it became commonly known as M among the user community. The informal name M gained widespread acceptance and is now widely used to refer to the language.
Microsoft continues to enhance and refine the M language as part of its ongoing investment in data integration and transformation technologies. New functions, features, and improvements are periodically introduced to provide users with more powerful and efficient ways to manipulate and prepare their data. In addition, Microsoft continues to introduce M within additional software tools and platforms, such as data integration within Microsoft Power Platform and dataflows within Power BI and Fabric.
Today, the M language is a key component of Microsoft’s data transformation and integration toolset. The proliferation of M, as well as its versatility and extensibility, make it an invaluable language for today’s modern data professionals.
Let’s now turn our attention to who should learn M.
M is a powerful tool for data professionals and individuals who work with data on a regular basis. The versatility and capabilities of M make it a valuable language to learn for various roles, including the following:
M is a valuable language for a wide range of data professionals and other individuals who work with data. Whether you are a data analyst, BI professional, data engineer, data scientist, power user, or someone in a data-driven role, learning M empowers you to efficiently ingest, transform, and prepare data for analysis.
Now that we understand what types of individuals would want to learn M, let’s next explore where those individuals can leverage their mastery of M.
M is a versatile language included in various tools and platforms where data transformation and manipulation are essential. Its integration within the Power Query ecosystem enables users to leverage M’s capabilities in different environments. In this section, we explore some key areas where M is widely used.
Before discussing specific products where M is used, it is important to understand the different experiences available for authoring M. There are two experiences available for authoring M, one intended for on-premises use and the other for cloud-based applications. These experiences are the following:
It is important to note that while two different experiences for authoring M exist, both provide nearly the exact same user experience. Even better, both provide the ability to edit the underlying M code, which is the primary focus of this book. Thus, the skills learned here apply equally to either experience used within any product or service.
M is ubiquitous within the Microsoft ecosystem, including the following software and services:
M enables users to extract, clean, and shape data from diverse sources, such as databases, Excel files, web services, and more. With M, users can define data transformation steps and create reusable queries that refresh and update data automatically when the underlying source changes.
Within Power BI Desktop, M is used within the Power Query editor, a sub-program launched from within Power BI Desktop. The Power Query editor provides a powerful graphical user interface (GUI) for working with the M formula language, as shown in the following screenshot:
Figure 1.1: Power Query editor in Power BI Desktop
The Power Query editor is covered in greater detail in Chapter 2, Working With Power Query/M.
Power BI Desktop also supports the use of dataflows.
To create a dataflow in the Power BI service, navigate to any workspace other than My Workspace and choose New and then Dataflow, as shown in Figure 1.2:
Figure 1.2: Create a dataflow in the Power BI service
In Excel, the Power Query editor interface can be accessed by using the Data tab of the ribbon and choosing Get Data:
Figure 1.3: Get Data in Microsoft Excel
Once the data source is chosen, the Power Query Editor interface can be accessed by choosing the Transform Data button:
Figure 1.4: Transform Data option in Microsoft Excel
Both the Windows and Macintosh versions of Excel also support accessing and using dataflows.
There is also the Visual Studio Power Query Software Development Kit (SDK). This SDK consists of a set of tools designed to help create custom Power Query data source connectors. The Visual Studio Power Query SDK is covered in greater detail in Chapter 16, Enabling Extensions.
As you can see, M is widely used in different tools and platforms within the Microsoft ecosystem, such as Power BI Desktop, Excel, the Power BI and Fabric service, Power Platform, SQL Server, and Dynamics. M enables users to connect to various data sources, perform advanced data transformations, and automate data integration workflows. The skills learned in this book deal with the M language itself and, thus, transcend both the experience as well as the specific product or service. Thus, by mastering M, users gain the ability to create reusable data transformation processes and enhance their data manipulation capabilities across a wide range of data-related scenarios, as well as across any experience, product, or service that uses M as its underlying data transformation layer.
Let’s now turn our attention to why data professionals and other individuals might want to add M to their repertoire of language.
In today’s data-driven world, efficiently and effectively transforming and analyzing data is a valuable skill. Power Query, a powerful data transformation and preparation tool, gained immense popularity due to its seamless integration with many popular software systems as well as its ease of use. At the heart of Power Query lies M, the Power Query Formula Language. But you may be asking yourself, why should you invest the time in order to learn M?
Here are seven reasons why we believe data professionals and other individuals should learn M:
In summary, learning M enables you to efficiently ingest, transform, and analyze data from diverse sources. It provides flexibility, customization, and performance optimization capabilities that extend the functionality of Power Query itself. By mastering M, you gain a competitive edge in the data industry and open doors to new career possibilities.
We hope that you are now excited about learning M! Let’s turn our attention to the basics of the M language.
As previously noted, M is a powerful language designed for data ingest and transformation within a variety of Microsoft software and services. Understanding the basics of the M language is essential for effectively leveraging its capabilities.
Here are some important fundamentals regarding the M language:
Text.Start("Hello, World!", 5)
returns the substring Hello
from the input text. More about expressions and functions are covered in Chapter 4, Understanding Values and Expressions, as well as Chapter 9, Parameters and Custom Functions.Text.From
function converts a value to text, while the Date.Year
function extracts the year component from a date or datetime value. Data types are covered in Chapter 5, Understanding Data Types.+
, -
, *
, /
, and so on) are used for numeric calculations, while comparison operators (>
, <
, =
, and so on) evaluate logical conditions. The combination operator ‘&
' is used for concatenating text values, appending lists and tables, or merging records. Operators are covered in Chapter 4, Understanding Values and Expressions.try
, otherwise
, and error
, you can control the flow of execution and handle potential errors gracefully. Additionally, M supports debugging capabilities, such as the ability to step through the code to identify and resolve issues in complex transformations. Error handling and debugging are covered in Chapter 12, Handling Errors and Debugging.//
) while block comments use the slash-asterisk/asterisk-slash pattern ( /*
and */
).Now that we have a good understanding of the core components of the M language, let’s next explore the most fundamental component of the M language, the let
expression.
At the core of the M language is the let
expression, which must be paired with an in
expression. In simple terms, the let
expression contains the input and transformations, while the in
expression contains the output. A simple Hello World
for M looks like the following:
let
Hello = "Hello World"
in
Hello
This code would return the ubiquitous Hello World
text.
It is important to note that every expression within a let statement must be followed by a comma ( ,
) except the last expression prior to the in
expression. Thus, if the let
expression consists of multiple sub-expressions, then the code might look like the following:
let
Hello = "Hello",
World = "World",
Return = Hello & " " & World
in
Return
This code also returns Hello World
as output.
Understanding the basics of M, including expressions, functions, data types, variables, operators, and the step-by-step transformation process, is vital for effectively manipulating and preparing data. By mastering these foundational concepts, you gain the ability to perform complex transformations, optimize data workflows, and unlock the full potential of the M language. The rest of this book is devoted to helping you master all of these foundational concepts and how to apply them to complex data transformations.
M is a programming language that serves as the backbone of Power Query, enabling users to extract, clean, and reshape data from various sources, such as databases, spreadsheets, web pages, and more. However, unlike general-purpose programming languages such as C, C#, Java, and Python, which are designed for a wide variety of applications, M is a domain-specific language, specifically designed for data ingest and manipulation. As such, M provides a rich set of functions, operators, and expressions that allow you to perform complex data transformations, calculations, and aggregations. Let’s understand this better by taking a look at the characteristics of M from both formal and informal perspectives.
Programming languages are classified according to a number of properties, such as pure/impure, lower-order/higher order, statically/dynamically typed, strongly/weakly typed, eager/lazy evaluation, and imperative/functional.
Microsoft has described M as:
An impure programming language allows side effects, which are actions that cause changes outside the scope of a function’s return value. In the case of M, it is commonly used for data transformation and retrieval tasks, which often involve interacting with external data sources, performing operations on data, and producing output. These actions constitute side effects because they affect the state of the data source or produce output beyond the function’s return value.
While M provides functional programming constructs and supports immutability, allowing for the creation of pure functions, the language is not purely functional due to its impure nature. It embraces a combination of functional and imperative programming paradigms to facilitate efficient and practical data manipulation and retrieval.
Higher-order languages typically allow such things as functions, objects, and modules to be used as values within a program. Higher-order languages often treat functions as first-class citizens. Specifically, this means that functions can be assigned to variables, passed as arguments to other functions, and returned as values from functions.
Power Query M supports higher-order programming by allowing you to define and manipulate functions as values. You can assign functions to variables, pass functions as arguments to other functions, and return functions as results. This enables you to create more modular and flexible code by abstracting and reusing function logic.
With higher-order programming capabilities, M allows you to apply transformations and computations dynamically based on input parameters, control flow, and data characteristics. You can write functions that operate on other functions, enabling powerful data manipulation and transformation scenarios.
For example, you can use higher-order functions in M to dynamically apply a series of transformations to a dataset, based on user-defined criteria, or to create reusable function pipelines for data processing.
By providing higher-order programming features, M empowers developers to write expressive and modular code, making it easier to work with complex data transformations and customize the behavior of functions to suit specific requirements.
M is not as weakly typed as programming languages such as Python, since variables are immutable once calculated. Thus, the weak type definition for M generally refers to the ability to use variables whose data types have not been explicitly specified.
Consider the following Python code:
a = 42
a = "Hello World"
print(a)
This code would not generate an error in Python, even though two different data types are assigned to the variable a
. However, similar code is not possible in M, since variables, once calculated, are immutable (i.e., cannot be changed).
The flexibility in data typing allows Power Query M to handle a wide range of data sources and perform various data transformations effectively. It simplifies the process of working with heterogeneous datasets that may contain different data types and structures.
It is important to note that even though Power Query M is weakly typed, it still performs type checking during execution (runtime) to ensure the consistency of operations. If a particular operation is not compatible with the inferred type of a value, an error may occur at runtime.
Overall, the weakly typed nature of Power Query M strikes a balance between flexibility and data integrity, providing users with a versatile language for data transformation tasks.
The term partially lazy refers to a specific feature within M called lazy evaluation, which is distinct from the overall evaluation strategy of the language. In M, lazy evaluation is applied to expressions within some specific constructs, specifically List
, Record
, and Table
expressions as well as the let
expression. These constructs allow you to define expressions that are evaluated only when needed, providing a form of on-demand or lazy evaluation within those contexts.
M also allows you to define optional arguments for functions. These optional arguments are evaluated lazily, meaning they are not computed unless explicitly used within the function body. Lazy evaluation of optional arguments helps optimize performance by avoiding unnecessary computations for optional values that are not actually used within the function. It ensures that the computations for optional arguments are deferred until their values are required within the function’s execution.
M also supports conditional branching constructs like if-then-else statements. Only the branch that matches a condition is evaluated, while the other branch is not computed, resulting in lazy evaluation. This is in contrast to an eager evaluation strategy where both branches are evaluated regardless of the condition’s outcome. This form of lazy evaluation within the if-then-else construct allows for efficient computation by avoiding unnecessary evaluations of expressions in the non-matching branch.
It is important to note that while M has these partially lazy features, the overall evaluation strategy of the language remains predominantly eager. Most expressions in Power Query M are eagerly evaluated, ensuring that data transformations occur promptly, and results are immediately available for further processing. Thus, M is primarily an eager programming language, but it incorporates partial lazy evaluation in specific constructs, such as for the List
, Record
, Table
, and let
expressions, as well as optional function arguments and conditional branching. These partially lazy evaluations offer flexibility and optimize performance within those contexts.
M encourages immutability, meaning that data values are not modified in place but transformed into new values. This promotes the functional programming principle of avoiding side effects.
M supports higher-order functions, allowing functions to be treated as first-class values. You can pass functions as arguments to other functions, return functions from functions, and store functions in variables.
M facilitates function composition, enabling you to combine multiple functions to create more complex transformations. This composability is a characteristic of functional programming.
M promotes the use of pure functions, which have no side effects and produce the same output for the same input. Pure functions make code more predictable and easier to troubleshoot.
M supports recursion. While recursion is not as extensively supported as in some other functional languages, M does offer limited support for recursive functions, allowing developers to solve problems through recursive techniques.
In terms of comparison with other languages, M is perhaps most similar to F#, a programming language developed and implemented by Don Syme of Microsoft Research, Cambridge, UK.
Now that we have covered the formal classification of the M language, let’s next take a look at some more informal characteristics of M.
More informally, here are some key characteristics and features of M:
The extensive collection of built-in functions in M provides users with powerful tools to handle diverse data transformation scenarios. These functions are designed to simplify common data manipulation tasks and enable users to efficiently transform and shape their data.
One specific performance optimization technique is called streaming semantically and is a property of List
and Table
expressions. Streaming semantically involves the repeated enumeration of table rows or list items. Instead of iterating through the table or list for each data transformation, streaming semantically each row of the table or item in the list is evaluated for all of the data transformations, and the results are collected as part of the output for the expression. Streaming semantically enable the transformation of datasets that do not fit within available memory.
Another performance optimization technique is called query folding. However, query folding is not a property of the M language itself. Instead, query folding is used within Power Query to push or fold data transformations back to source data systems. In essence, the transformation expressions within M are translated to equivalent transformation statements available within the source systems, such as SQL Server. This pushes the processing of transformations back to the source systems instead of the client system executing the M query. This can improve performance and efficiency by minimizing the data transfer and reducing the amount of data processed by Power Query.
By understanding the underlying principles of M and its performance considerations, you can write optimized code and improve the overall performance of your data workflows. Chapter 15, Optimizing Performance, provides more information on performance optimization, with specific examples.
In summary, M is a versatile and expressive language specifically designed for data transformation and manipulation within Power Query. Its functional nature, extensive set of functions, and integration with the Power Query editor make it a powerful tool for extracting, cleaning, and reshaping data from diverse sources.
The M language is the backbone of Microsoft Power Query’s data transformation capabilities. M is widely used within the larger Microsoft ecosystem, including a variety of software applications and web-based services. Many different types of data professionals can benefit from learning M to efficiently and effectively ingest and transform data across a wide range of scenarios. By mastering M, you gain the ability to efficiently handle complex data transformations, customize your data workflows, and optimize performance, thereby tapping into the full potential that Power Query has to offer you for your data manipulation needs.
In this chapter, we have covered the essential questions of who, what, where, why, and how M is used, as well as provided a brief introduction to the fundamental aspects of the M language. In the next chapter, we introduce you to the primary interface in which you will be writing and working with M.
Join our community’s Discord space for discussions with the author and other readers:
Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.
If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.
Please Note: Packt eBooks are non-returnable and non-refundable.
Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:
If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:
Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.
You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.
Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.
When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.
For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.