Learn Power BI - Second Edition

5 (1 reviews total)
By Greg Deckler
    What do you get with a Packt Subscription?

  • Instant access to this title and 7,500+ eBooks & Videos
  • Constantly updated with 100+ new titles each month
  • Breadth and depth in over 1,000+ technologies
  1. Free Chapter
    Chapter 1: Understanding Business Intelligence and Power BI
About this book

To succeed in today's transforming business world, organizations need business intelligence capabilities to make smarter decisions faster than ever before. This updated second edition of Learn Power BI takes you on a journey of data exploration and discovery, using Microsoft Power BI to ingest, cleanse, and organize data in order to unlock key business insights that can then be shared with others.

This newly revised and expanded edition of Learn Power BI covers all of the latest features and interface changes and takes you through the fundamentals of business intelligence projects, how to deploy, adopt, and govern Power BI within your organization, and how to leverage your knowledge in the marketplace and broader ecosystem that is Power BI. As you progress, you will learn how to ingest, cleanse, and transform your data into stunning visualizations, reports, and dashboards that speak to business decision-makers.

By the end of this Power BI book, you will be fully prepared to be the data analysis hero of your organization – or even start a new career as a business intelligence professional.

Publication date:
February 2022
Publisher
Packt
Pages
458
ISBN
9781801811958

 

Chapter 1: Understanding Business Intelligence and Power BI

Power BI is a powerful ecosystem of business intelligence tools and technologies from Microsoft. But what exactly is business intelligence, anyway? Simply stated, business intelligence is all about leveraging data to make better decisions. This can take many forms and is not necessarily restricted to just business. We use data in our personal lives to make better decisions as well. For example, if we are remodeling a bathroom, we get multiple quotes from different firms. The prices and details in these quotes are pieces of data that allow us to make an informed decision in terms of which company to choose. We may also research these firms online. This is more data that ultimately supports our decision.

In this chapter, we will explore the fundamental concepts of business intelligence, as well as why business intelligence is important to organizations. In addition, we will take a high-level tour of the Power BI ecosystem, licensing, and core tools, such as Power BI Desktop and the Power BI service.

The following topics will be covered in this chapter:

  • Exploring key concepts of business intelligence
  • Discovering the Power BI ecosystem
  • Choosing the right Power BI license
  • Introducing Power BI Desktop and the Power BI service
 

Exploring key concepts of business intelligence

In the context of organizations, business intelligence is about making better decisions for your business. Unlike the example in the introduction, organizations are not generally concerned with bathrooms but rather with what can make their business more effective, efficient, and profitable. The businesses that provided those quotes on bathroom remodeling need to answer questions such as the following:

  • How can the business attract new customers?
  • How can the business retain more customers?
  • Who are the competitors and how do they compare?
  • What is driving profitability?
  • Where can expenses be diminished?

There are endless questions that businesses need to answer every day, and these businesses need data coupled with business intelligence tools and techniques to answer such questions and make effective operational and strategic decisions.

While business intelligence is a vast subject in and of itself, the key concepts of business intelligence can be broken down into five areas:

  • Domain
  • Data
  • Model
  • Analysis
  • Visualization

Domain

A domain is simply the context where business intelligence is applied. Most businesses are composed of relatively standard business functions or departments, such as the following:

  • Sales
  • Marketing
  • Manufacturing/production
  • Supply chain/operations
  • Research and development
  • Human resources
  • Accounting/finance

Each of these business functions or departments represents a domain within which business intelligence can be used to answer questions that can assist us in making better decisions.

The domain helps in narrowing down the focus regarding which questions can be answered and what decisions need to be made. For example, within the context of sales, a business might want to know which sales personnel are performing better or worse, or which customers are the most profitable. Business intelligence can provide such insights as well as help to determine which activities enable certain sales professionals to outperform others, or why certain customers are more profitable than others. This information can then be used to train and mentor sales personnel who are performing less effectively or to focus sales efforts.

Within the context of marketing, a business can use business intelligence to determine which types of marketing campaigns, such as email, radio, print, TV, and the web, are most effective in attracting new customers. This then informs the business where they should spend their marketing budget.

Within the context of manufacturing, a business can use business intelligence to determine the Mean Time Between Failure (MTBF) for machines that are used in the production of goods. This information can be used by the business to determine whether preventative maintenance would be beneficial and how often such preventative maintenance should occur.

Clearly, there are endless examples of where business intelligence can make an organization more efficient, effective, and profitable. Deciding on a domain in which to employ business intelligence techniques is a key step in enabling business intelligence undertakings within organizations, since the domain dictates which key questions can be answered, the possible benefits, as well as what data is required in order to answer those questions.

Data

Once a domain has been decided upon, the next step is identifying and acquiring the data that's pertinent to that domain. This means identifying the sources of relevant data. These sources may be internal or external to an organization and may be structured, unstructured, or semi-structured in nature.

Internal and external data

Internal data is data that is generated within an organization by its business processes and operations. These business processes can generate large volumes of data that is specific to that organization's operations. This data can take the form of net revenues, sales to customers, new customer acquisitions, employee turnover, units produced, cost of raw materials, and time series or transactional information. This historical and current data is valuable to organizations if they wish to identify patterns and trends, as well as for forecasting and future planning. Importantly, all the relevant data to a domain and question is almost never housed within a single data source; organizations inevitably have multiple sources of relevant data.

In addition to internal data, business intelligence is most effective when internal data is combined with external data. Crucially, external data is data that is generated outside the boundaries of an organization's operations. Such external data includes things such as overall global economic trends, census information, customer demographics, household salaries, and the cost of raw materials. All this data exists irrespective of any single organization.

Each domain and question will have internal and external data that is relevant and irrelevant to answering the question at hand. However, do not be fooled into believing that simply because you have chosen manufacturing/production as the domain, other domains, such as sales and marketing, do not have relevant sources of data. If you are trying to forecast the required production levels, sales data in terms of pipelines can be very relevant. Similarly, external data that points toward overall economic growth may also be extremely relevant, while data such as the cost of raw materials may very well be irrelevant.

Structured, unstructured, and semi-structured data

Structured data is data that conforms to a rather formal specification of tables with rows and columns. Think of a spreadsheet where you might have columns for the transaction ID, customer, units purchased, and price per unit. Each row represents a sales transaction. Structured data sources are the easiest sources for business intelligence tools to consume and analyze. These sources are most often relational databases, which include technologies such as Microsoft SQL Server, Microsoft Access, Azure Table storage, Azure SQL Database, Oracle, MySQL, IBM Db2, Teradata, PostgreSQL, Informix, and Sybase. In addition, this category of data sources includes relational database standards such as Open Database Connectivity (ODBC) and Object Linking and Embedding Database (OLE DB).

Unstructured data is effectively the opposite of structured data. Unstructured data cannot be organized into simple tables with rows and columns. Such data includes things such as video, audio, images, and text. Text documents, social media posts, and online reviews are also examples of largely unstructured data. Unstructured data sources are the most difficult types of sources for business intelligence tools to consume and analyze. This type of data is either stored as Binary Large Objects (BLOBSs), online files or posts, or as files in a filesystem, such as the New Technology File System (NTFS) or the Hadoop Distributed File System (HDFS).

Semi-structured data has a structure but does not conform to the formal definition of structured data, that is, tables with rows and columns. Examples of semi-structured data include tab and delimited text files, XML, other markup languages such as HTML and XSL, JavaScript Object Notation (JSON), and Electronic Data Interchange (EDI). Semi-structured data sources have a self-defining structure that makes them easier to consume and analyze than unstructured data sources but require more work than true, structured data sources.

Semi-structured data also includes so-called NoSQL databases, which include data stores such as document databases, graph databases, and key-value stores. These databases are specifically designed to store structured and unstructured data. Document databases include Microsoft Azure Cosmos DB, MongoDB, Cloudant (IBM), Couchbase, and MarkLogic. Graph databases include Neo4j and HyperGraphDB. Key-value stores include Basho Technologies' Riak, Redis, Aerospike, Amazon Web Services' DynamoDB, Couchbase, DataStax's Cassandra, and MapR Technologies. Wide-column stores include Cassandra and HBase.

Finally, semi-structured data also includes data access protocols, such as Open Data Protocol (OData) and other Representational State Transfer (REST) Application Programming Interfaces (APIs). These protocols provide interfaces to data sources such as Microsoft SharePoint, Microsoft Exchange, Microsoft Active Directory, and Microsoft Dynamics; social media systems such as Twitter and Facebook; as well as other online systems such as Mailchimp, Salesforce, Smartsheet, Twilio, Google Analytics, and GitHub, to name a few. These data protocols abstract how the data is stored, whether that is a relational database, NoSQL database, or simply a bunch of files.

Most business intelligence tools, such as Power BI, are optimized for handling structured and semi-structured data. Structured data sources integrate natively with how business intelligence tools are designed. In addition, business intelligence tools are designed to ingest semi-structured data sources and transform them into structured data. Unstructured data is more difficult but not impossible to analyze with business intelligence tools. In fact, Power BI has some features that are designed to ease the ingestion and analysis of unstructured data sources. However, analyzing such unstructured data has its limitations.

Model

A model, or data model, refers to the way in which one or more data sources are organized to support analysis and visualization. Models are built by transforming and cleansing data, helping to define the types of data within those sources, as well as the definition of data categories for specific data types. Building a model generally involves three elements:

  • Organizing
  • Transforming and cleansing
  • Defining and categorizing

Organizing

Models can be extremely simple, such as a single table with columns and rows. However, business intelligence almost always involves multiple tables of data, and often involves multiple tables of data coming from multiple sources. Thus, the model becomes more complex as the various sources and tables of data must be combined into a cohesive whole. This is done by defining how each of the disparate sources of data relates to one another. As an example, let's say you have one data source that represents a customer's name, contact information, and perhaps the size of the business by revenue and/or the number of employees. This information might come from an organization's Customer Relationship Management (CRM) system. The second source of data might be order information, which includes the customer's name, units purchased, and the price that was paid. This second source of data comes from the organization's Enterprise Resource Planning (ERP) system. These two sources of data can be related to one another based on the unique name or ID of the customer.

Some sources of data have prebuilt models. This includes traditional data warehouse technologies for structured data as well as analogous systems for performing analytics over unstructured data. The traditional data warehouse technology is generally built upon the Online Analytical Processing (OLAP) technology and includes systems such as Microsoft's Analysis Services, Snowflake, Oracle's Essbase, AtScale cubes, SAP HANA and Business Warehouse servers, and Azure Synapse. With respect to unstructured data analysis, technologies such as Apache Spark, Databricks, and Azure Data Lake Storage are used.

Transforming and cleansing

When building a data model, it is often (read: always) necessary to clean and transform the source data. Data is never clean – it must always be massaged for bad data to be removed or resolved. For example, when dealing with customer data from a CRM system, it is not uncommon to have the same customer entered with multiple spellings. The format of data in spreadsheets may make data entry easy for humans but can be unsuitable for business intelligence purposes. In addition, data may have errors, missing data, inconsistent formatting, or even have something as seemingly simple as trailing spaces. These types of situations can cause problems when performing business intelligence analysis. Luckily, business intelligence tools such as Power BI provide mechanisms for cleansing and reshaping the data to support analysis. This might involve replacing or removing errors in the data, pivoting, unpivoting, or transposing rows and columns, removing trailing spaces, or other types of transformation operations.

Transforming and cleansing technologies are often referred to as Extract, Transform, Load (ETL) tools and include products such as Microsoft's SQL Server Integration Services (SSIS), Azure Data Factory, Alteryx, Informatica, Dell Boomi, Salesforce's MuleSoft, Skyvia, IBM's InfoSphere Information Server, Oracle Data Integrator, Talend, Pentaho Data Integration, SAS's Data Integration Studio, Sybase ETL, and QlikView Expressor.

Defining and categorizing

Data models also formally define the types of data within each table. Data types generally include formats such as text, decimal number, whole number, percentage, date, time, date and time, duration, true/false, and binary. The definition of these data types is important as it defines what kind of analysis can be performed on the data. For example, it does not make sense to create a sum or average of text data types; instead, you would use aggregations such as count, first, or last.

Finally, data models also define the data category of data types. While a data type such as a postal code might be numeric or text, it is important for the model to define that the numeric data type represents a postal code. This further defines the type of analysis that can be performed upon this data, such as plotting the data on a map. Similarly, it might be important for the data model to define that a text data type represents a web or image Uniform Resource Locator (URL). Typical data categories include such things as address, city, state, province, continent, country, region, place, county, longitude, latitude, postal code, web URL, image URL, and barcode.

Analysis

Once a domain has been selected and data sources have been combined into a model, the next step is to perform an analysis of the data. This is a key process within business intelligence as this is when you attempt to answer questions that are relevant to the business using internal and external data. Simply having data about sales is not immediately useful to a business. For example, to predict future sales revenue, it is important that such data is aggregated and analyzed. This analysis can determine the average sales for a product, the frequency of purchases, and which customers purchase more frequently than others. Such information allows better decision-making by an organization.

Data analysis can take many forms, such as grouping data, creating simple aggregations such as sums, counts, and averages, as well as creating more complex calculations, identifying trends, correlations, and forecasting. Many times, organizations have, or wish to have, Key Performance Indicators (KPIs), which are tracked by the business to help determine the organization's health or performance. KPIs might include such things as employee retention rate, net promoter score, new customer acquisitions per month, gross margin, and Earnings Before Interest, Tax, Depreciation, and Amortization (EBITDA). Such KPIs generally require that the data is aggregated, has calculations performed on it, or both. These aggregations and calculations are called metrics or measures and are used to identify trends or patterns that can inform business decision-making. In some cases, advanced analysis tools such as programming languages, machine learning and artificial intelligence, data mining, streaming analytics, and unstructured analytics are necessary to gain the proper insights.

There are numerous programming languages that have either been specifically designed from the ground up for data analytics or have developed robust data analytics packages or extensions. Two of the most popular languages in this space include R and Python. Other popular languages include SQL, Multidimensional Expressions (MDX), Julia, SAS, MATLAB, Scala, and F#.

There is also a wide variety of machine learning and data mining tools and platforms for performing predictive analytics around data classification, regression, anomaly detection, clustering, and decision-making. Such systems include TensorFlow, Microsoft's Azure Machine Learning, DataRobot, Alteryx Analytics Hub, H2O.ai, KNIME, Splunk, RapidMiner, and Prevedere.

Streaming analytics becomes important when dealing with Internet of Things (IoT) data. In these situations, tools such as Striim, StreamAnalytix, TIBCO Event Processing, Apache Storm, Azure Stream Analytics, and Oracle Stream Analytics are used.

When dealing with unstructured data, tools such as Pig and Hive are popular, as well as tools such as Apache Spark and Azure Cognitive Services for vision, speech, and sentiment analysis.

Of course, any discussion around data analytics tools would be incomplete without including Microsoft Excel. Spreadsheets have long been the go-to analytics tool for business users, and the most popular spreadsheet today is Microsoft Excel. However, other spreadsheet programs, such as Google Sheets, Smartsheet, Apple Numbers, Zoho Sheet, and LibreOffice Calc, also exist.

Visualization

The final key concept in business intelligence is visualization or the actual presentation of the analysis being performed. Humans are visually oriented and thus it is advantageous to view the results of the analysis in the form of charts, reports, and dashboards. This may take the form of tables, matrices, pie charts, bar graphs, and other visual displays that help provide context and meaning to the analysis. In the same way that a picture is worth a thousand words, visualizations allow thousands, millions, or even trillions of individual data points to be presented in a concise manner that is easily consumed and understandable. Visualization allows the analyst or report author to let the data tell a story. This story answers the questions that are originally posed by the business and thus delivers the insights that allow organizations to make better decisions.

Individual charts or visualizations typically display aggregations, KPIs, and/or other calculations of underlying data that's been summarized by some form of grouping. These charts are designed to present a specific facet or metric of the data within a specific context. For example, one chart may display the number of web sessions by the day of the week, while another chart may display the number of page views by browser.

Business intelligence tools allow multiple individual tables and charts to be combined on a single page or report. Modern business intelligence tools such as Power BI support interactivity between individual visualizations to further aid the discovery and analysis process. This interactivity allows the report consumer to click on portions of individual visualizations, such as bar charts, maps, and tables, in order to drill down, highlight, or filter the information presented or determine the influence of a particular portion of a chart on the rest of the visualizations in a report. This goes beyond typical legacy visualization tools such as SQL Server Reporting Services (SSRS) or Crystal Reports, which only provide minimal user interactivity when it comes to choosing from predefined filters. For example, given the two charts we referenced previously, the report consumer can click on a particular day of the week in the first report to display the page visit breakdown per browser for the chosen day of the week in the second chart:

Figure 1.1 – Two bar charts: (L) Sessions by DayOfWeek; (R) Pageviews by Browser

Figure 1.1 – Two bar charts: (L) Sessions by DayOfWeek; (R) Pageviews by Browser

Finally, dashboards provide easy-to-understand visualizations of KPIs that are important to an organization. For example, the CEO of a corporation may wish to see only certain information from sales, marketing, operations, and human resources. Each of these departments may have its own detailed reports, but the CEO only wishes to track one or two of the individual visualizations within each of those reports. Dashboards enable this functionality.

Visualization software includes venerable tools such as SSRS and Crystal Reports, as well as software such as Birst, Domo, MicroStrategy, Qlik Sense, Tableau CRM, SAS Visual Analytics, Sisense, Tableau, ThoughtSpot, and TIBCO Spotfire.

Now that we have examined the key concepts and overarching themes of business intelligence, it is time to delve a layer deeper and discover the business intelligence-enabling technologies that comprise the Power BI ecosystem.

 

Discovering the Power BI ecosystem

While Power BI is often classified as a visualization tool, the reality is that Power BI is not a single tool but rather a collection of interrelated tools and services that form a complete business intelligence ecosystem. This ecosystem spans the entire business intelligence spectrum, from data to modeling, analysis, and visualization. In addition, this ecosystem includes components that are specific not only to Power BI itself but also to other Microsoft technologies that interoperate with Power BI, as well as third-party integrations. This interoperation with other Microsoft tools and technologies as well as third parties makes Power BI a formidable business intelligence platform, whose value far exceeds that of more siloed business intelligence tools in the market.

While the Power BI ecosystem is vast and complex, this ecosystem can be broken down into the following categories:

  • Core and Power BI-specific
  • Core and non-Power BI-specific
  • Non-core and Power BI-specific
  • Natively integrated Microsoft technologies
  • Extended ecosystem

Core and Power BI-specific

Core and Power BI-specific technologies include the following:

  • Power BI Desktop is a free Windows-based application that is installed on a local desktop or laptop computer. Power BI Desktop is the primary tool used to ingest, cleanse, and transform various sources of data, combine the data into models, and then analyze and visualize the data through the creation of calculations, visualizations, and reports. Once reports have been created in Power BI Desktop, these reports are often published to the Power BI service for sharing and collaboration.
  • The Power BI service is a cloud-based Software as a Service (SaaS) online platform. The Power BI service can be used for light report creation and editing, dashboard creation, as well as sharing, collaborating on, and viewing reports. Some data sources can be connected directly from the Power BI service, but the ability to model and analyze that data is limited.

Core and non-Power BI-specific

Core and non-Power BI-specific technologies include the following:

  • Power Query is the Microsoft technology that provides data connectivity and transformation. This technology allows business users to access hundreds of different sources and transform the data as required. Data sources supported by Power Query include many different file types, databases, Microsoft Azure services, and third-party services. Power Query also provides a Software Development Kit (SDK) that allows for the creation of custom connectors so that third parties can create their own data connectors that seamlessly interoperate with Power Query. Power Query is used within Power BI Desktop, Microsoft Excel, Microsoft SQL Server Data Tools for Visual Studio, and Microsoft Dataverse (formerly Common Data Service).
  • Data Analysis Expressions (DAX) is a language that consists of a collection of functions, operators, and constants that can be used to write formulas, or expressions, that return calculated values. Similar to Excel functions or MDX, DAX helps you create new information from data that's already in your model.
  • The data gateway is software that is installed to facilitate access from the Power BI service to on-premises data sources. The data gateway allows the Power BI service to refresh data from one or more data sources housed within on-premises systems. The data gateway comes in two modes – personal and enterprise. Personal mode can only be used with Power BI, while enterprise mode can be used with Power BI as well as Power Automate, Microsoft Power Apps, Azure Analysis Services, and other Azure logic apps.
  • Analysis Services tabular modeling is an evolution of Microsoft's multidimensional cubes. This technology is available outside of Power BI within Analysis Services but is also fundamental to Power BI. Models that are built within Power BI are actually built using SSAS Tabular, and Power BI Desktop runs a full instance of SSAS Tabular under the hood, so to speak. Thus, when building models in Power BI Desktop, you are actually building an Analysis Services tabular model.
  • Microsoft AppSource, or simply AppSource, is a marketplace for finding apps, add-ons, and extensions to Microsoft software, including Office 365, Azure, Dynamics 365, Cortana, and Power BI. Within Power BI, AppSource hosts custom visuals that can be downloaded and added to Power BI reports that have been authored within the desktop. These visuals are supported within the Power BI service as well.

Non-core and Power BI-specific

Non-core and Power BI-specific technologies include the following:

  • Power BI Report Server is an on-premises technology that is a superset of SSRS. Similar in function to the Power BI service, Power BI Report Server allows Power BI reports authored in Power BI Desktop and Excel to be published and viewed while remaining fully on-premises. Because Power BI Report Server is a superset of SSRS, it can also host paginated reports (.rdl).
  • Power BI Embedded is a system of REST APIs that can be used to display visualizations, reports, and dashboards within custom applications that serve customers that are external to an organization. Power BI Embedded is often used by Independent Software Vendors (ISVs) and developers.
  • Power BI mobile applications are native Android, iOS, and Windows applications that are downloadable from the respective platform stores, Google Play, the Apple App Store, and the Microsoft Store. Power BI mobile apps are touch-optimized for viewing and interacting with Power BI reports that are published to the Power BI service.
  • Power BI for mixed reality is based on the Power BI Windows mobile app but has additional capabilities that have been designed to work with Microsoft HoloLens. The HoloLens capabilities allow data to be projected holographically within the mixed-reality environment, and also have the ability to view and interact with reports and dashboards published within the Power BI service.

Natively integrated Microsoft technologies

Natively integrated Microsoft technologies include the following:

  • Office 365 is Microsoft's ubiquitous line of subscription services, which includes traditional Office applications, plus other productivity services that are enabled via the cloud (the internet). Central to Office 365 is the concept of a tenant, an organization's very own slice of Office 365. Power BI integrates natively with Office 365 so that when a user subscribes to Power BI, the email address is checked for existing Office 365 tenants, and if one exists, the Power BI user will be added to that tenant. If an Office 365 tenant does not exist, Power BI will provision a new Office 365 tenant, sometimes called a shadow tenant, and the Power BI user will be added to that tenant.
  • Excel incorporates many underlying Power BI technologies as native add-ons. These include Analyze in Excel and Power Pivot, which provides access to Power Query, and the same underlying data model used by Power BI (Analysis Services tabular models). Excel is also a first-class citizen within the Power BI service and is called Workbooks within the Power BI service.
  • Power BI is part of the Power Platform ecosystem, which includes Power Automate, Power Apps, and Power Virtual Agents. Power Automate is a workflow technology that has a native connector for Power BI that supports both triggers and actions. Triggers are based on the Power BI service's data alerts, and actions support both streaming and non-streaming datasets in the service. In addition, there is a Power Automate visualization available in AppSource.
  • Power Apps is a form-based technology from Microsoft that provides a native connector to Power BI, as well as a default visualization. In addition, Power BI tiles can be embedded into Power Apps applications. Finally, Power Apps uses Power Query technology as part of its data integration feature.
  • Power Virtual Agents enables the creation and use of intelligent agents or bots that can perform tasks or have intelligent, contextual conversations with humans. Power Virtual Agents chatbots can be integrated into Power BI workspaces.
  • The Microsoft desktop application Visio has a custom visual for Power BI that was built by Microsoft. This visual allows you to link data within Power BI to a Visio diagram as values or colors that are displayed within the Visio drawing.
  • SharePoint provides the ability to embed Power BI reports within SharePoint via a native Power BI report web part.
  • Dynamics 365 provides the ability to embed Power BI visuals natively within Dynamics 365 reports and dashboards. In addition, Power BI has native connectors for Dynamics. Finally, there are several apps available for Dynamics in the Power BI service.
  • Dataverse (formerly Common Data Service) is actually central to Power Automate, Power Apps, Power Virtual Agents, Dynamics 365, and Power BI. Dataverse allows an organization to store and manage data from numerous business applications within a set of standard and custom entities. Entities allow an organization to create a business-focused definition of their data and use this data within apps. Power BI has a native connector for Dataverse.
  • Azure Machine Learning technologies are becoming pervasive within Power BI. This includes the ability to create columns from an example within Power Query, as well as custom visualizations, such as the key influencers visual. In addition, dataflows in Premium capacities can leverage Automated Machine Learning and Cognitive Services.
  • One of the native outputs from Azure Stream Analytics is Power BI. This allows you to stream data that is running through Azure Stream Analytics and display it on Power BI's dashboard tiles.
  • Report Builder is a venerable Microsoft tool for the creation of paginated reports (.rdl). These reports can be published to a Power BI Premium instance and displayed within the Power BI service.

The extended Power BI ecosystem

Microsoft has created numerous APIs and SDKs that enable the creation of custom visuals, data source connectors, and automation via PowerShell and other coding languages. As a result, there is a large extended ecosystem of third-party custom visuals, connectors, apps, and add-on products for Power BI. In addition, Power BI integrates with other non-Microsoft programming languages, such as Python, R, and Scalable Vector Graphics (SVG).

The Power BI community is a large ecosystem of users focused on the education and use of Power BI within local communities. To find a Power BI user group in your area, go to https://www.pbiusergroup.com.

In addition to local user groups, there is a general community website that provides forums and galleries where Power BI users can get questions answered about using Power BI. To access this community website, go to https://community.powerbi.com.

As you can see, Power BI is part of a large ecosystem of tools and technologies that enables business intelligence as well as supporting the business processes that benefit from informed decisions. To get the most out of these tools and technologies, it is important to understand how Power BI is licensed.

 

Choosing the right Power BI license

Power BI provides numerous licensing options that provide flexibility and affordability for individuals and organizations of any size. These various licensing options come in two primary categories:

  • Shared capacity
  • Dedicated capacity

Shared capacity

Think of shared capacity like an apartment building. While each tenant within the apartment building has their own personal living quarters, certain infrastructures, such as a common entryway, electrical wiring, and plumbing, are shared. Shared capacity licensing options for Power BI work similarly. While each tenant within the shared capacity of the Power BI service has their own personal area for publishing datasets, reports, and dashboards, the memory and processing capacity of the entire Power BI service is shared among all of the tenants using the Power BI service within an Azure data center. And, just like a tenant in an apartment building playing loud music can affect that tenant's neighbors, so too can tenants impact other tenants within the same shared capacity by using resources within the Power BI service.

There are two options for using shared capacity within Power BI:

  • Free
  • Pro

Free

Power BI Desktop is free to download, install, and use. In addition to simply sharing Power BI files (.pbix), which are the files that are created by the Power BI Desktop program, Microsoft provides a free method of using the Power BI service so that you can publish and share reports via a feature called Publish to web. Publish to web creates a web browser link or URL to a report that's published in the Power BI service. While this URL is long and cryptic and these reports are not indexed by search engines, anyone with the link can browse the report anonymously. This means that there is no real security other than simple obfuscation.

It is important to note here that Publish to web is the only sharing mechanism available with a free license of Power BI. All other sharing, collaboration, and export features are not available with the Power BI service. This includes app workspaces for collaboration, sharing of reports and dashboards, as well as exporting to Microsoft PowerPoint/Comma Separated Value (CSV) and embedding within Microsoft SharePoint.

The free licensing of Power BI also comes with certain restrictions on the automatic refreshing of data. Only online sources of data can be refreshed automatically. While both online and internet-based data sources can be refreshed, these data sources can only be refreshed eight times per day, and the minimum time between refreshes is 30 minutes.

Finally, the free Power BI license restricts the total size of any single data model that's published to the Power BI service to 1 GB and the total size of all data models that are published to the service for each user to 10 GB.

Pro

The classic way to license Power BI that provides more secure sharing and collaboration is called Power BI Pro. Pro is a subscription license for authoring and consuming shared reports within the Power BI service and is suitable for individuals as well as small and medium businesses with fewer than approximately 500 active users. Each user who authors or consumes a shared report must purchase Power BI Pro for a commercial list price of $9.99 per user per month.

Pro-licensing Power BI enables all sharing, collaboration, and additional features that are not available with a free license. However, Pro has the same restrictions in terms of dataset size and total capacity as Power BI free licenses. This means that the total size of any single data model that's published to the Power BI service is restricted to 1 GB and the total size of all data models that are published to the service for each user cannot exceed 10 GB. In addition, the data refresh frequency limitations are the same as free licenses.

Dedicated capacity

While shared capacity is analogous to an apartment building, think of dedicated capacity as individual houses. With dedicated capacity licensing options, each tenant has dedicated memory and processing power that only that tenant can utilize, thereby protecting each tenant from noisy neighbors.

There are four options for using dedicated capacity within Power BI:

  • Embedded
  • Premium
  • Premium Per User (PPU)
  • Report Server

Embedded

Power BI Embedded is a capacity-based subscription license. This means that increments of capacity in the form of virtual CPU cores and memory are purchased on an hourly basis. These increments of capacity are called node types and range in size from an A1 with a single virtual core and 3 GB of RAM for approximately $750 per month to an A6 with 32 virtual cores and 100 GB of RAM for approximately $24,000 per month. Power BI Embedded A SKUs are purchased via the Azure portal. In addition to this capacity licensing, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.

Power BI Embedded is intended for use by developers and ISVs that use APIs to embed Power BI visuals, reports, and dashboards within their custom web applications. These applications can then be accessed by external customers. While Power BI's embedding technology can be used to create applications that are accessed by internal users, different SKUs are required, Power BI Premium EM SKUs.

Finally, it is worth noting that A1 and A2 nodes run on non-dedicated capacity, while A3 to A6 nodes run on dedicated capacity. This is important, as dedicated capacity can prevent noisy neighbor scenarios that may impact overall performance.

Premium

Power BI Premium is a capacity-based subscription license, similar in concept to Power BI Embedded, intended for medium-to-large enterprise customers. Like Power BI Embedded, this means that increments of capacity in the form of virtual CPU cores and memory are purchased and can only be used by the purchaser. Unlike Power BI Embedded, these capacities are purchased on a monthly – not an hourly – basis. These increments of capacity are also called node types and range in size from a P1 with 8 virtual cores and 25 GB of RAM for approximately $5,000 per month to a P5 with 128 virtual cores and 400 GB of RAM for approximately $80,000 per month. In addition to this capacity licensing, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.

As opposed to Power BI Embedded, all of Power BI Premium's SKUs provide dedicated capacity and are intended to serve both internal and external users. Power BI Premium also increases a single dataset size to 10 GB and supports up to 48 refreshes per day, with a minimum time between refreshes of 1 minute. Finally, Power BI Premium unlocks enterprise features, such as the ability to publish paginated reports, goals, deployment pipelines, paginated reports (.rdl), advanced artificial intelligence capabilities, XMLA endpoints, autoscale, and Multi-Geo support.

Power BI Premium's SKUs also provide a license for Power BI Report Server. The same number of virtual cores that are provisioned for Power BI Premium in the cloud can also be used for Power BI Report Server, although you should check the latest documentation from Microsoft since the exact licensing model may change.

Premium Per User

Microsoft released Power BI Premium Per User licensing to fill the gap for small and medium customers that required the enterprise features included with Premium but could not afford the entry fee of $60,000 per year. Premium Per User works exactly like Pro licensing in that each individual user accessing a Premium Per User workspace must have a Premium Per User license. However, workspaces designated as Premium Per User include almost all of the advanced enterprise features of Premium except for a small number of features, such as Multi-Geo support and Report Server licensing. Premium Per User licenses cost $20 per user per month.

Report Server

Power BI Report Server is intended for customers who wish to keep their data completely on-premises. Power BI Report Server is a superset of SSRS and, therefore, can be licensed by purchasing a license of SQL Server Enterprise Edition with Software Assurance (SA) or by purchasing any Power BI Premium SKU. The details of SQL licensing are beyond the scope of this book, but typical costs will range in price from $4,000 to $7,000 per core. In addition, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.

Importantly, Power BI Report Server does not provide the same level of functionality as the Power BI service. For example, Power BI Report Server does not allow reports to be created or edited like in the Power BI service. In addition, features such as workspace apps, dashboards, real-time streaming, Q&A, quick insights, and R visuals are not supported within Power BI Report Server. However, Power BI Report Server can access and refresh online and on-premises data sources without requiring a gateway.

Next, let's get acquainted with the predominant components of Power BI.

 

Introducing Power BI Desktop and the Power BI service

After learning about the entire breadth and depth of the Power BI ecosystem and licensing, you may be wondering how on earth a single book can possibly cover everything there is to know about Power BI. The short answer is that it is not possible. Entire books exist dedicated to just a single topic, including Power Query, DAX, and Power BI Embedded, alone. Even books that claim to be complete references to Power BI inevitably leave out some components of the entire Power BI ecosystem. Thus, this book is intended for those who wish to learn the basic, core components of the Power BI ecosystem, namely the following:

  • Power BI Desktop, including Power Query and DAX
  • The Power BI service, including the data gateway

Learning about these core technologies and, while doing so, learning how to build and share reports with meaningful business intelligence insights means that by the end of this book, you will have learned about all of the core components and become a participant within the Power BI ecosystem.

Power BI Desktop

As mentioned previously, Power BI Desktop, or simply Desktop, is a free Windows-based application that can be installed on a local desktop or laptop computer. Desktop is the primary tool used by analysts to ingest, shape, analyze, and visualize data. Power BI Desktop is useful for the following tasks:

  • Getting data
  • Creating a data model
  • Analyzing data
  • Creating and publishing reports

Getting data

The first step in working with Power BI Desktop is to connect to data sources. There are currently over 100 connectors that can be used to connect to different data sources, including many general-purpose connectors, such as the web connector, OData Feed, and the JSON connector, which enable connections to hundreds, if not thousands, of different sources of data.

Creating a data model

Connecting to a data source creates a query within a tool called the Query Editor. The Query Editor utilizes Power Query technology and provides a graphical interface that allows the user to create a series of steps that are recorded and then replayed every time data is loaded or refreshed from the source. This means your data always ends up in your desired form.

Queries load data into structured data tables within Power BI Desktop. Once these tables of data have been loaded, a data model can be constructed by relating these tables to one another.

Analyzing data

The data that's used within the model does not have to come solely from data sources. Power BI uses a technology called DAX, which allows users to create calculations in the form of calculated columns, measures, and even entire tables. This allows analysts to create simple measures, such as gross margins and percentage of totals, as well as more complex measures, such as year-over-year revenue.

Creating and publishing reports

Once a data model has been built and analyzed, visuals can be created on report pages by dragging and dropping fields onto the report canvas. Visuals are graphical representations of the data within the model. There are 37 default visuals within Power BI Desktop, but hundreds more can be imported from Microsoft AppSource and used within Power BI Desktop. Multiple visuals can be combined on one or more pages to create a report. These visuals and pages can interact with one another as users click within the report. Once the reports have been finalized, the reports can be published to the Power BI service. These reports can then be shared with other users.

The Power BI service

The Power BI service is a hosted web application that runs on Microsoft's cloud platform, Azure. The service can be accessed at https://app.powerbi.com, but do not go there just yet! The service is also free to use, with restrictions around the sharing of reports as covered previously. The service is useful for the following tasks:

  • Viewing and editing reports
  • Creating dashboards
  • Sharing and collaborating with others
  • Accessing and creating apps
  • Refreshing data

Viewing and editing reports

Reports that are published to the service can be viewed within a web browser. This provides the same interactive experience as Desktop. In addition, reports can be marked as favorites, subscribed to for email delivery, and downloaded as PDFs or PowerPoint files. Existing reports can be edited, and new reports can be created from the published datasets.

Creating dashboards

The Power BI service allows visuals from one or more reports to be combined into dashboards. Dashboards highlight the most important information and metrics. Dashboards can be marked as favorites, subscribed to for email delivery, and have alerts created that notify the user when thresholds have been exceeded.

Sharing and collaborating with others

Reports and dashboards can be shared with others. This allows a report author's published work to be easily distributed to a wider audience. Other users can set their own personal report bookmarks, as well as engaging in discussions about dashboards. Multiple individuals can even collaborate together within workspaces to create reports and dashboards. The security settings allow authors and collaborators to control exactly who sees precisely which dashboards, reports, and data.

Accessing and creating apps

Microsoft and third parties have created built-in apps within the service that provide bundles of data, reports, and dashboards. These apps can be subscribed to from within the service and added to your personal Power BI workspace. Authors can even bundle their own data, reports, and dashboards into an app and distribute it to other users in their organization.

Refreshing data

The service allows users to schedule automatic refreshes of online and on-premises data. This means that once a refresh has been configured for a published report, the data is always current, with no further work required by the author. The service provides its own refresh gateway for online data. On-premises data requires an on-premises data gateway to be installed on a local network.

Let's have a quick summary of what we've learned in this chapter.

 

Summary

In this chapter, we were introduced to business intelligence and its key concepts. Then, we took a broad look at the Power BI ecosystem. Finally, we explored some more specific capabilities of Power BI Desktop and the Power BI service. Because Power BI is all about business intelligence, it is difficult, if not impossible, to understand Power BI without first understanding the broader context of making informed business decisions. Understanding the full scope of business intelligence and the Power BI ecosystem will help you understand the context in which the Power BI technologies covered in this book are applied.

In the next chapter, we will dig deeper into the methodology and process for Power BI projects within corporations and set up the example business intelligence project used throughout the rest of the book.

 

Questions

As an activity, try to answer the following questions on your own:

  • What is business intelligence?
  • What does business intelligence do for an organization? What are the five key concepts of business intelligence? What are the five different types of data?
  • Which word best describes everything that comprises and integrates with Power BI?
  • Which tools are covered in this book? What can you do with Power BI?
 

Further reading

To learn more about the topics that were covered in this chapter, please take a look at the following references:

About the Author
  • Greg Deckler

    Greg Deckler is Vice President of the Microsoft Practice at Fusion Alliance and has been a professional technology systems consultant for over 25 years. Internationally recognized as an expert in Power BI, Greg Deckler is a Microsoft MVP for Data Platform and a superuser within the Power BI community with over 100,000 messages read, more than 11,000 replies, over 2,300 answers, and more than 75 entries in the Quick Measures Gallery. Greg founded the Columbus Azure ML and Power BI User Group (CAMLPUG) and presents at numerous conferences and events, including SQL Saturday, DogFood, and the Dynamic Communities User Group/Power Platform Summit.

    Browse publications by this author
Latest Reviews (1 reviews total)
Great book for learning 👍
Learn Power BI - Second Edition
Unlock this book and the full library FREE for 7 days
Start now