Introduction to Business Intelligence and Power BI
Power BI is a powerful ecosystem of business intelligence tools and technologies from Microsoft. But what exactly is business intelligence, anyway? Simply stated, business intelligence is all about leveraging data in order to make better decisions. This can take many forms and is not necessarily restricted to just business. We use data in our personal lives to make better decisions as well. For example, if we are remodeling a bathroom, we get multiple quotes from different firms. The prices and details in these quotes are pieces of data that allow us to make an informed decision in terms of which company to choose. We may also research these firms online. This is more data that ultimately supports our decision.
In this chapter, we will explore the key fundamental concepts of business intelligence, as well as why business intelligence is important to organizations. In addition, we take a high-level tour of the Power BI ecosystem, licensing, and core tools such as the Power BI Desktop and the Power BI Service.
The following topics will be covered in this chapter:
- Key concepts of business intelligence
- The Power BI ecosystem
- Power BI licensing
- Power BI Desktop and Power BI Service
Key concepts of business intelligence
Business intelligence, in the context of organizations, revolves around making better decisions about your business. Unlike the example in the introduction, organizations are not generally concerned with bathrooms, but rather with what can make their business more effective, efficient, and profitable. The businesses that provided those quotes on bathroom remodeling need to answer questions such as the following:
- How can the business attract new customers?
- How can the business retain more customers?
- Who are the competitors and how do they compare?
- What is driving profitability?
- Where can expenses be diminished?
There are endless questions that businesses need to answer every day, and these businesses need data coupled with business intelligence tools and techniques in order to answer these questions and make effective operational and strategic decisions.
While business intelligence is a vast subject in and of itself, the key concepts of business intelligence can be broken down into five areas:
A domain is simply the context within which business intelligence is applied. Most businesses are comprised of relatively standard business functions or departments, such as the following:
- Research and development
- Human resources
Each of these business functions or departments represents a domain within which business intelligence can be used to answer questions that can assist us in making better decisions.
The domain helps in narrowing down the focus regarding which questions can be answered and what decisions need to be made. For example, within the context of sales, a business might want to know which sales personnel are performing better and which sales personnel are performing worse. Business intelligence can provide this insight as well as help determine which activities enable certain sales professionals to outperform others. This information can then be used to train and mentor sales personnel who are performing more poorly.
Within the context of marketing, a business can use business intelligence to determine which types of marketing campaigns, such as email, radio, print, TV, and the web, are most effective in attracting new customers. This then informs the business where they should spend their marketing budget.
Within the context of manufacturing, a business can use business intelligence to determine the mean time between failure (MTBF) for machines that are used in the production of goods. This information can be used by the business to determine whether preventative maintenance would be beneficial and how often such preventative maintenance should occur.
Clearly, there are endless examples of where business intelligence can make an organization more efficient, effective, and profitable. Deciding on a domain in which to employ business intelligence techniques is a key step in enabling business intelligence undertakings within organizations since the domain dictates which key questions can be answered, the possible benefits, as well as which data is required in order to answer those questions.
Once a domain has been decided upon, the next step is identifying and acquiring the data that's pertinent to that domain. This means identifying the sources of relevant data. These sources may be internal or external to an organization and may be structured, unstructured, or semi-structured in nature.
Internal and external data
Internal data is data that's generated within an organization by its business processes and operations. These business processes can generate large volumes of data that is specific to that organization's operations. This data can take the form of net revenues, sales to customers, new customer acquisitions, employee turnover, units produced, cost of raw materials, and much more time series or transactional information. This historical and current data is valuable to organizations if they wish to identify patterns and trends, as well as for forecasting and future planning. Importantly, all the relevant data to a domain and question are almost never housed within a single data source; organizations inevitably have multiple sources of relevant data.
In addition to internal data, business intelligence is most effective when internal data is combined with external data. Crucially, external data is data that is generated outside of the boundaries of an organization's operations. Such external data includes things such as the business's overall global economic performance, census information, and competitor prices. All of this data exists irrespective of any particular organization.
Each domain and question will have internal and external data that is relevant and irrelevant to answering the question at hand. However, do not be fooled into believing that simply because you have chosen manufacturing/production as the domain that other domains such as sales and marketing do not have relevant sources of data. If you are trying to forecast the required production levels, sales data in terms of pipelines can be very relevant. Similarly, external data that points toward overall economic growth may also be extremely relevant while data such as the cost of raw materials may very well be irrelevant.
Structured, unstructured, and semi-structured data
Structured data is data that conforms to a rather formal specification of tables with rows and columns. Think of a spreadsheet where you might have columns for the transaction ID, customer, units purchased, and price per unit. Each row represents a sales transaction. Structured data sources are the easiest sources for business intelligence tools to consume and analyze. These sources are most often relational databases, which include technologies such as Microsoft SQL Server, Microsoft Access, Azure Table storage, Azure SQL database, Oracle, MySQL, IBM DB2, Teradata, PostgreSQL, Informix, and Sybase. In addition, this category of data sources includes relational database standards and APIs such as Open Database Connectivity (ODBC) and Object Linking and Embedding Database (OLE DB).
Unstructured data is effectively the opposite of structured data. Unstructured data cannot be organized into simple tables with rows and columns. Such data includes things such as videos, audio, images, and text. Word processing documents, emails, social media posts, and web pages are also examples of largely unstructured data. Unstructured data sources are the most difficult types of sources for business intelligence tools to consume and analyze. This type of data is either stored as binary large objects (BLOBS) or as a file in a filesystem such as the New Technology File System (NTFS) or the Hadoop Distributed File System (HDFS).
Unstructured data also includes so-called NoSQL databases, which include data stores such as document databases, graph databases, and key-value stores. These databases are specifically designed to store unstructured data. Document databases include Microsoft Azure Cosmos DB, MongoDB, 10Gen, Cloudant (IBM), Couchbase, and MarkLogic. Graph databases include Neo4j and HyperGraphDB. Key-value stores include Microsoft's Cosmos DB, Basho Technologies' Riak, Redis, Aerospike, Amazon Web Services' DynamoDB, Basho Technologies, Couchbase, Datastax's Cassandra, MapR Technologies, and Oracle. Finally, wide-column stores include Cassandra and HBase.
Semi-structured data also includes data access protocols, such as the Open Data Protocol (OData) and other Representational State Transfer (REST) APIs. These protocols provide interfaces to data sources such as Microsoft SharePoint, Microsoft Exchange, Microsoft Active Directory, and Microsoft Dynamics; social media systems such as Twitter and Facebook; as well as other online systems such as MailChimp, Salesforce, Smartsheet, Twilio, Google Analytics, and GitHub, to name a few. These data protocols abstract how the data is stored, whether that is a relational database, NoSQL database, or simply a bunch of files.
The vast majority of business intelligence tools, such as Power BI, are optimized for handling structured and semi-structured data. Structured data sources integrate natively with how business intelligence tools are designed. In addition, business intelligence tools are designed to ingest semi-structured data sources and transform them into structured data. Unstructured data is more difficult but not impossible to analyze with business intelligence tools. In fact, Power BI has a number of features that are designed to ease the ingestion and analysis of unstructured data sources. However, analyzing such unstructured data has its limitations.
A model, or data model, refers to the way in which one or more data sources are organized in order to support analysis and visualization. Models are built by transforming and cleansing data, helping to define the types of data within those sources, as well as the definition of data categories for specific data types.
Models can be extremely simple, such as a single table with columns and rows. However, business intelligence almost always involves multiple tables of data, and most often involves multiple tables of data coming from multiple sources. Thus, the model becomes more complex as the various sources and tables of data must be combined into a cohesive whole. This is done by defining how each of the disparate sources of data relates to one another. As an example, let's say you have one data source that represents a customer's name, contact information, and perhaps size in revenue and/or the number of employees. This information might come from an organization's customer relationship management (CRM) system. The second source of data might be order information, which includes the customer's name, units purchased, and the price that was paid. This second source of data comes from the organization's enterprise resource planning (ERP) system. These two sources of data can be related to one another based on the name of the customer.
Some sources of data have prebuilt models. This includes traditional data warehouse technologies for structured data as well as analogous systems for performing analytics over unstructured data. The traditional data warehouse technology is generally built upon the online analytical processing (OLAP) technology and includes systems such as Microsoft's SQL Server Analysis Services (SSAS), Azure Analysis Services, Snowflake, Oracle's Essbase, AtScale cubes, SAP HANA and Business Warehouse servers, and Azure SQL Data Warehouse. With respect to unstructured data analysis, technologies such as Apache Spark, Databricks, and Azure Data Lake Storage are used.
Transforming and cleansing
When building a data model, it is often (read always) necessary to clean and transform the source data. Data is never clean—it must always be massaged in order for bad data to be removed or resolved. For example, when dealing with customer data from a CRM system, it is not uncommon to have the same customer entered into the system with multiple spellings. The format of data in spreadsheets may make data entry easy for humans but can be unsuitable for business intelligence purposes. In addition, data may have errors, missing data, inconsistent formatting, or even have something as seemingly simple as trailing spaces. All of these types of situations can cause problems when performing business intelligence analysis. Luckily, business intelligence tools such as Power BI provide mechanisms for cleansing and reshaping the data to support analysis. This might involve replacing or removing errors in the data, pivoting, unpivoting, or transposing rows and columns, removing trailing spaces, or other types of transformation operations.
Transforming and cleansing technologies are often referred to as extract, transform, load (ETL) tools and include products such as Microsoft's SQL Server Integration Services (SSIS), Azure Data Factory, Alteryx, Informatica, Dell Boomi, Salesforce's Mulesoft, Skyvia, IBM's Infosphere Information Server, Oracle Data Integrator, Talend, Pentaho Data Integration, SAS's Data Integration Studio, Sybase ETL, and QlikView Expressor.
Defining and categorizing
Data models also formally define the types of data within each table. Data types generally include formats such as text, decimal number, whole number, percentage, date, time, date and time, duration, true/false, and binary. The definition of these data types is important as this defines what kind of analysis can be performed on the data. For example, it does not make sense to create a sum or average of text data types; instead, you would use aggregations such as count, first, or last.
Finally, data models also define the data category of data types. While a data type such as a postal code might be numeric, it is important for the model to define that the numeric data represents a United States postal code. This further defines the type of analysis that can be performed upon this data, such as plotting the data on a map. Similarly, it might be important for the data model to define that a text data type represents a web or image uniform resource locator (URL). Typical data categories include such things as address, city, state, province, continent, country, region, place, county, longitude, latitude, postal code, web URL, image URL, and bar code.
Once a domain has been selected and data sources have been combined into a model, the next step is to perform an analysis of the data. This is a key process within business intelligence as this is when you attempt to answer questions that are relevant to the business using internal and external data. Simply having data about sales is not immediately useful to a business. In order to predict future sales revenue, it is important that such data is aggregated and analyzed in some form. For example, analysis can determine the average sale for a product, the frequency of purchases, and which customers purchase more frequently than others. This is the information that allows for better decision-making by an organization.
Data analysis can take many forms, such as grouping data, creating simple aggregations such as sums, counts, and averages, as well as creating more complex calculations, identifying trends, correlations, and forecasting. Many times, organizations have, or wish to have, key performance indicators (KPIs), that are tracked by the business in order to help determine the organization's health or performance. KPIs might include such things as employee retention rate, net promoter score, new customer acquisitions per month, gross margin, and Earnings Before Interest, Tax, Depreciation, and Amortization (EBITDA). Such KPIs generally require that the data be aggregated, calculations performed on it, or both. These aggregations and calculations are called metrics or measures and are used to identify trends or patterns that can inform business decision-making. In some cases, advanced analysis tools such as programming languages, machine learning and artificial intelligence, data mining, streaming analytics, and unstructured analytics are necessary in order to gain the proper insights.
There are numerous programming languages that have either been specifically designed from the ground up for data analytics or have developed robust data analytics packages or extensions. Two of the most popular languages in this space include R and Python. Other popular languages include SQL, multi-dimensional expressions (MDX), Julia, SAS, MATLAB, Scala, and F#.
There is also a wide variety of machine learning and data mining tools and platforms for performing predictive analytics around data classification, regression, anomaly detection, clustering, and decision-making. Such systems include TensorFlow, Microsoft's Azure ML, DataRobot, Alteryx Analytics, H2O.ai, KNIME, Splunk, RapidMiner, and Prevedere.
Streaming analytics becomes important when dealing with Internet of Things (IoT) data. Tools such as Striim, StreamAnalytix, TIBCO Event Processing, Apache Storm, Azure Streaming Analytics, and Oracle Stream Analytics are used.
When dealing with unstructured data, tools such as PIG and HIVE are popular, as well as tools such as Apache Spark and Azure Cognitive Services for vision, speech, and sentiment analysis.
Of course, any discussion around data analytics tools would be incomplete without including Microsoft Excel. Spreadsheets have long been the go-to analytics tool for business users, and the most popular spreadsheet today is Microsoft Excel. However, other spreadsheet programs such as Google Sheets, Smartsheet, Apple Numbers, Zoho Sheet, and LibreOffice Calc also exist.
The final key concept in business intelligence is visualization, or the actual presentation of the analysis being performed. Humans are visually oriented and thus must be able to see the results of the analysis being performed in the form of charts, reports, and dashboards. This may take the form of tables, matrices, pie charts, bar graphs, and other visual displays that help provide context and meaning to the analysis. In the same way that a picture is worth a thousand words, visualizations allow thousands, millions, or even trillions of individual data points to be presented in a concise manner that is easily consumed and understandable. Visualization allows the analyst or report author to let the data tell a story. This story answers the questions that are originally posed by the business and thus delivers the insights that allow organizations to make better decisions.
Individual charts or visualizations typically display aggregations, KPIs, and/or other calculations of underlying data that's been summarized by some form of grouping. These charts are designed to present a specific facet or metric of the data within a specific context. For example, one chart may display the number of visitors to a website by country while another chart may display the number of website page visits per browser.
Business intelligence tools allow multiple individual tables and charts to be combined on a single page or report. Modern business intelligence tools such as Power BI support interactivity between individual visualizations in order to further aid in the discovery and analysis process. This interactivity allows the report consumer to click on portions of individual visualizations such as bar charts, maps, and tables in order to drill down into the information that's been presented or determine the influence between that portion of a chart and the rest of the visualizations on a report. This goes beyond typical legacy visualization tools such as SQL Server Reporting Services (SSRS) or Crystal Reports, which only provide minimal user interactivity when it comes to choosing from predefined filters. For example, given the two charts we referenced previously, the report consumer can click on a particular country in the first report to display the page visit breakdown per browser for that particular country in the second chart.
Finally, dashboards provide easy to understand visualizations of KPIs that are important to an organization. For example, the CEO of a corporation may wish to see particular information from sales, marketing, operations, and human resources. Each of these departments might have their own detailed reports, but the CEO only wishes to track one or two of the individual visualizations within each of those reports. Dashboards enable this functionality.
Visualization software includes venerable tools such as SSRS and Crystal Reports, as well as more software such as Birst, Domo, MicroStrategy, Qlik Sense, Salesforce Einstein Analytics, SAS Visual Analytics, SiSense, Tableau, ThoughtSpot, and TIBCO Spotfire.
The Power BI ecosystem
While Power BI is often classified as a visualization tool, the reality is that Power BI is not a single tool but rather a collection of interrelated tools and services that form a complete business intelligence ecosystem of its own. This ecosystem spans the entire business intelligence spectrum, from data sources to modeling, analysis, and visualization. In addition, this ecosystem includes components that are specific to Power BI itself but also to other Microsoft technologies that interoperate with Power BI, as well as third-party integrations. This interoperation with other Microsoft tools and technologies as well as third-parties makes Power BI a formidable business intelligence platform whose value far exceeds that of more siloed business intelligence tools in the market.
While the Power BI ecosystem is vast and complex, this ecosystem can be broken down into the following categories:
- Core, Power BI-specific
- Core, non-Power BI-specific
- Non-core, Power BI-specific
- Natively integrated Microsoft technologies
- Extended ecosystem
Core, Power BI-specific
The Power BI Desktop is a free, Windows-based application that's installed on a local desktop or laptop computer. The Power BI Desktop is the primary tool that's used to ingest, cleanse, and transform various sources of data, combine the data into models, and then analyze and visualize the data through the creation of calculations, visualizations, and reports. Once reports have been created in Power BI Desktop, they are often published to the Power BI Service for sharing and collaboration.
The Power BI Service is a cloud-based, software as a service (SaaS) online platform. Originally known as Power BI for Office 365, today it is often referred to as powerbi.com or simply Power BI. The Power BI Service can be used for light report editing as well as sharing, collaborating, and viewing reports. Some data sources can be connected directly from the Power BI Service, but the ability to model and analyze that data is limited.
Core, non-Power BI-specific
Power Query is the Microsoft technology that provides data connectivity and transformation. This technology allows business users to access over 100 different sources and transform the data without using code. Data sources supported by Power Query include a large number of file types, databases, Microsoft Azure services, and third-party services. Power Query also provides a software development kit (SDK) that allows for the creation of custom connectors so that third parties can create their own data connectors that seamlessly interoperate with Power Query. Power Query is used within the Power BI Desktop, Microsoft Excel, Microsoft SQL Server Data Tools for Visual Studio, and the Microsoft Common Data Service (CDS) for Apps.
Data Analysis Expressions (DAX) is a programming language that consists of a collection of functions, operators, and constants that can be used to write formulas, or expressions, that return calculated values. Similar to how the Excel Functions or SSAS MDX help you create new information from data already in your model, DAX is the Power BI equivalent.
The on-premises data gateway is software that is installed on premises in order to facilitate access from the Power BI Service to on-premises data sources. This allows the Power BI Service to refresh data from one or more data sources housed within on-premises systems. The on-premises data gateway comes in two modes—personal and enterprise. Personal mode can only be used with Power BI, while enterprise mode can be used with Power BI as well as by Microsoft Flow, Microsoft PowerApps, Azure Analysis Services, and other Azure Logic Apps.
SSAS (Tabular) is an evolution of Microsoft's multi-dimensional cubes. This technology is available outside of Power BI within SSAS and Azure Analysis Services but is also fundamental to Power BI. Models that are built within Power BI are actually built using SSAS Tabular, and the Power BI Desktop runs a full instance of SSAS Tabular under the hood, so to speak. Thus, when building models in Power BI Desktop, you are actually building an Analysis Services Tabular model.
Microsoft AppSource, or simply AppSource, is a marketplace for finding apps, add-ins, and extensions to Microsoft software, including Office 365, Azure, Dynamics 365, Cortana, and Power BI. Within Power BI, AppSource hosts custom visuals that can be downloaded and added to Power BI reports that have been authored within the Desktop. These visuals are supported within the Power BI Service as well but cannot be used directly from within the Service.
Non-core, Power BI-specific
Power BI Report Server is an on-premises technology that is a superset of SSRS. Similar in function to the Power BI Service, Power BI Report Server allows Power BI reports authored in Power BI Desktop and Excel to be published and viewed while remaining fully on premises. Because Power BI Report Server is a superset of SSRS, Power BI Report Server can also host paginated reports (.rdl).
Power BI Embedded is a system of REST APIs that can be used to display visualizations, reports, and dashboards within custom applications that serve customers that are external to an organization. Power BI Embedded is often used by independent software vendors (ISVs) and developers.
Power BI Mobile Applications are native Android, iOS, and Windows applications that are downloadable from the respective platform stores, Google Play, the Apple App Store, and the Microsoft Store. Power BI Mobile apps are touch-optimized for viewing and interacting with Power BI reports that are published to the Power BI Service.
Power BI for mixed reality is based on the Power BI Windows mobile app but has additional capabilities that have been designed to work with Microsoft HoloLens. The HoloLens capabilities allow data to be projected holographically within the mixed reality environment, and also have the ability to view and interact with reports and dashboards published within the Power BI Service.
Natively integrated Microsoft technologies
Office 365 is Microsoft's ubiquitous line of subscription services, which includes traditional Office applications plus other productivity services that are enabled via the cloud (the internet). Central to Office 365 is the concept of a tenant, an organization's very own slice of Office 365. Power BI integrates natively with Office 365 so that when a user subscribes to Power BI, the email address is checked for existing Office 365 tenants, and if one exists, the Power BI user will be added to that tenant. If an Office 365 tenant does not exist, Power BI will provision a new Office 365 tenant, sometimes called a shadow tenant, and the Power BI user will be added to that tenant.
Excel incorporates many underlying Power BI technologies as native add-ins. These include Power BI Publisher for publishing Excel files to the Power BI Service, Power Map for map visuals, Power Pivot, which provides access to Power Query, and the same underlying data model used by Power BI (SSAS Tabular) and Power View for additional visualizations. Excel is also a first-class citizen within the Power BI Service and is called Workbooks.
Microsoft Flow is a workflow technology that has a native connector for Power BI that supports both triggers and actions. Triggers are based on the Power BI Service's data alerts, and actions support both streaming and non-streaming datasets in the Service. Flow is actually part of the same group of applications within Microsoft, that is, the Business Applications Group.
PowerApps is a form of technology from Microsoft that provides a native connector to Power BI, as well as a custom visual. In addition, Power BI tiles can be embedded into PowerApps applications. Finally, PowerApps uses Power Query technology as part of its Data Integration feature. PowerApps is also part of the Business Application Group within Microsoft.
The Microsoft desktop application, Visio, has a custom visual for Power BI that was built by Microsoft. This visual allows you to link data within Power BI to a Visio diagram as values or colors that are displayed within the Visio drawing.
SharePoint provides the ability to embed Power BI reports within SharePoint via a native Power BI web part.
Dynamics 365 provides the ability to embed Power BI visuals natively within Dynamics 365 reports and dashboards. In addition, Power BI has native connectors for Dynamics. Finally, there are a number of apps available for Dynamics in the Power BI Service.
The CDS is actually central to PowerApps, Flow, Dynamics 365, and Power BI. The CDS allows an organization to store and manage data from all of these business applications within a set of standard and custom entities. Entities allow an organization to create a business-focused definition of their data and use this data within apps. Power BI has a native connector for the CDS.
As we mentioned earlier, the underlying technology behind Azure Analysis Services and SSAS Tabular is also part of Power BI. In addition, Power BI has native connectors for Azure Analysis Services. Finally, an instance of Azure Analysis Services is included with a Power BI Premium subscription.
Azure ML technologies are becoming pervasive within Power BI. This includes the ability to create Columns from Example within Power Query, as well as custom visualizations such as the Key Influencers visual that's available in AppSource.
Power BI Dataflows have native integration with Azure Cognitive Services. This allows Azure Cognitive Services to use Sentiment Analysis within Power BI.
One of the native outputs from Azure Streaming Analytics is Power BI. This allows you to stream data that's running through Azure Streaming Analytics and display it on Power BI's dashboard tiles.
As we mentioned earlier, Power BI Report Server is a superset of SSRS.
Report Builder is a venerable Microsoft tool for the creation of paginated reports (RDL). These reports can be published to a Power BI Premium instance and displayed within the Power BI Service.
The extended Power BI ecosystem
Microsoft has created a number of APIs and SDKs that enable the creation of custom visuals, data source connectors, and automation. As a result, there is a large extended ecosystem of third-party custom visuals, connectors, apps, and add-on products for Power BI. In addition, Power BI integrates with a number of other non-Microsoft programming languages such as Python, R, and Scalable Vector Graphics (SVG).
The Power BI community is a large ecosystem of users focused on the education and use of Power BI within local communities. To find a Power BI user group in your area, go to https://www.pbiusergroup.com.
In addition to local user groups, there is a general community website that provides forums and galleries where Power BI users can get questions answered about using Power BI. To access this community website, go to https://community.powerbi.com.
Power BI licensing
There are a number of ways to license Power BI technology, including the following:
- Power BI Free
- Power BI Pro
- Power BI Premium
- Power BI Embedded
- Power BI Report Server
Power BI Free
Power BI Desktop is free to download, install, and use. In addition to simply sharing Power BI files (.pbix), which are the files that are created by the Power BI Desktop program, Microsoft provides a free method of using Power BI Service so that you can publish and share reports via a featured called Publish to web. Publish to web creates a web browser link or URL to a report that's published in the Power BI Service. While this URL is long and cryptic and these reports are not indexed by search engines, anyone with the link can browse the report anonymously. This means that there is no real security other than simple obfuscation.
It is important to note here that Publish to web is the only sharing mechanism available with a free license of Power BI. All other sharing, collaboration, and export features are not available in the Power BI Service. This includes app workspaces for collaboration, sharing of reports and dashboards, as well as exporting to PowerPoint/CSV and embedding within Microsoft SharePoint.
The free licensing of Power BI also comes with certain restrictions on the automatic refreshing of data. Only online sources of data can be refreshed automatically. This means any data that is accessible online via the internet. On-premises data sources that require an on-premises data gateway or any live (DirectQuery) sources are not supported. Furthermore, even the online data sources can only be refreshed eight times per day, and the minimum time between refreshes is 30 minutes.
Finally, the free Power BI license restricts the total size of any single data model that's published to the Power BI Service to 1 GB and the total size of all data models that are published to the Service for each user to 10 GB.
Power BI Pro
The classic way to license Power BI that provides more secure sharing and additional features is called Power BI Pro. Pro is a subscription license for authoring and consuming shared reports within the Power BI Service. Each user who authors or consumes a shared report must purchase Power BI Pro for a commercial list price of $9.99 per user per month.
Pro licensing of Power BI enables all of the sharing, collaboration, and export features that are not enabled with a free license and also allows online and on-premises data sources to be refreshed.
Finally, Pro has the same restrictions in terms of dataset size and total capacity as Power BI free licenses. This means that the total size of any single data model that's published to the Power BI Service is restricted to 1 GB and the total size of all data models that are published to the Service for each user cannot exceed 10 GB.
Power BI Premium
Power BI Premium is a capacity-based subscription license similar in concept to Power BI Embedded. Like Power BI Embedded, this means that increments of capacity in the form of virtual CPU cores and memory are purchased. Unlike Power BI Embedded, these capacities are purchased on a monthly—not an hourly—basis. These increments of capacity are called node types and range in size from a P1 with eight virtual cores and 25 GB of RAM for approximately $5,000/month to a P5 with 128 virtual cores and 400 GB of RAM for approximately $80,000/month. In addition to this capacity licensing, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.
As opposed to Power BI Embedded, all of Power BI Premium's SKUs provide dedicated capacity, and are intended to serve both internal and external users. Power BI Premium also increases a single dataset size to 10 GB and supports up to 48 refreshes per day with a minimum time between refreshes of 1 minute. Finally, Power BI Premium unlocks enterprise features such as the ability to publish paginated reports, incremental refresh, and multi-geo support.
Power BI Premium's SKUs also provide a license for Power BI Report Server. The same number of virtual cores that are provisioned for Power BI Premium in the cloud can also be used for Power BI Report Server, although you should check the latest documentation from Microsoft since the exact licensing model changes over time.
Power BI Embedded
Power BI Embedded is a capacity-based subscription license. This means that increments of capacity in the form of virtual CPU cores and memory are purchased on an hourly basis. These increments of capacity are called node types and range in size from an A1 with a single virtual core and 3 GB of RAM for approximately $750/month to an A6 with 32 virtual cores and 100 GB of RAM for approximately $24,000/month. Power BI A SKUs are billed hourly; the approximate monthly prices are provided here. Power BI Embedded A SKUs are purchased via the Azure Portal. In addition to this capacity licensing, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.
Power BI Embedded is intended for use by developers and independent software vendors that use APIs in order to embed Power BI visuals, reports, and dashboards within their custom web applications. These applications can then be accessed by external customers. While Power BI's embedding technology can be used to create applications that are accessed by internal users, different SKUs are required, the Power BI Premium SKUs that include EM at the end.
Finally, it is worth noting that A1 and A2 nodes run on non-dedicated capacity while A3 to A6 nodes run on dedicated capacity. This is important as dedicated capacity can avoid noisy neighbor scenarios that may impact overall performance.
Power BI Report Server
Power BI Report Server is intended for customers who wish to keep their data completely on premises. Power BI Report Server is a superset of SSRS and, therefore, can be licensed by purchasing a license of SQL Server Enterprise Edition with Software Assurance (SA) or by purchasing any Power BI Premium SKU. The details of SQL licensing are beyond the scope of this book, but typical costs will range in price from $4,000 - $7,000 per core. In addition, report authors also need to license Power BI Pro for the standard license cost of $9.99 per user per month.
Importantly, the Power BI Report Server does not provide the same level of functionality as the Power BI Service. For example, Power BI Report Server does not allow reports to be created or edited like in the Power BI Service. In addition, features such as workspace apps, dashboards, real-time streaming, Q&A, quick insights, and R visuals are not supported within the Power BI Report Server. However, Power BI Report Server can access and refresh online and on-premises data sources without requiring a gateway.
Power BI Desktop and Service
After learning about the entire breadth and depth of the Power BI ecosystem and licensing, you may be wondering how on earth a single book can possibly cover everything there is to know about Power BI. The short answer is that it isn't possible. Entire books exist dedicated to just a single topic, including Power Query, DAX, and Power BI Embedded, alone. Even books that claim to be complete references to Power BI inevitably leave out some components of the entire Power BI ecosystem. Thus, this book is intended for those who wish to learn the basic, core components of the Power BI ecosystem, namely the following:
- Power BI Desktop:
- Power Query
- Power BI Service:
- On-premises data gateway
Learning about these core technologies and, while doing so, learning how to build and share reports with meaningful business intelligence insights means that by the end of this book you will have learned about all of the core components and become a participant within the Power BI ecosystem.
Power BI Desktop
As we mentioned previously, the Power BI Desktop, or simply Desktop, is a free, Windows-based application that can be installed on a local desktop or laptop computer. The Desktop is the primary tool used by analysts to ingest, shape, analyze, and visualize data. Power BI Desktop is useful for the following reasons:
- Getting data
- Creating a data model
- Analyzing data
- Creating and publishing reports
The first step in working with Power BI Desktop is to connect to data sources. There are currently over 100 connectors that can be used to connect to different data sources, including many general purpose connectors such as the web connector, OData feed, and JSON connector, which enable connections to hundreds, if not thousands, of different sources of data.
Creating a data model
Connecting to a data source creates a query within a tool called the Query Editor. The Query Editor utilizes Power Query technology and provides a graphical interface that allows the user to create a series of steps that are recorded and then replayed every time data is loaded or refreshed from the source. This means your data always ends up in your desired form.
Queries load data into structured data tables within Power BI Desktop. Once these tables of data have been loaded, a data model can be constructed by relating these tables to one another.
The data that's used within the model does not have to come solely from data sources. Power BI uses a technology called DAX, which allows users to create calculations in the form of calculated columns, measures, and even entire tables. This allows analysts to create simple measures such as gross margins and percentage of totals, as well as more complex measures such as year over year revenue.
Creating and publishing reports
Once a data model has been built and analyzed, visuals can be created on report pages by dragging and dropping fields onto the report canvas. Visuals are graphical representations of the data within the model. There are 32 default visuals within Power BI Desktop, but hundreds more can be imported from Microsoft AppSource and used within Power BI Desktop. Multiple visuals can be combined on one or more pages to create a report. These visuals and pages can interact with one another as users click within the report. Once the reports have been finalized, the reports can be published to the Power BI Service. These reports can then be shared with other users.
Power BI Service
The Power BI Service, or simply Service, is a hosted web application that runs on Microsoft's cloud platform, Azure. The Service can be accessed at https://app.powerbi.com. But don't go there just yet! The Service is also free to use, with some restrictions around the sharing of reports, data model size, and on-premises data source refreshes. The Service is useful for the following reasons:
- Viewing and editing reports
- Creating dashboards
- Sharing and collaborating with others
- Accessing and creating apps
- Refreshing data
Viewing and editing reports
Reports that are published to the Service can be viewed within a web browser. This provides the same interactive experience as the Desktop. In addition, reports can be marked as favorites, subscribed to for email delivery, and downloaded as PDFs or PowerPoint files. Existing reports can be edited, and new reports can be created from the published datasets.
The Service allows visuals from one or more reports to be combined into dashboards. Dashboards highlight the most important information and metrics. Dashboards can be marked as favorites, subscribed to for email delivery, and have alerts created that notify the user when thresholds have been exceeded.
Sharing and collaborating with others
Reports and dashboards can be shared with others. This allows a report author's published work to be easily distributed to a wider audience. Other users can set their own personal report bookmarks, as well as engage in discussions about dashboards. Multiple individuals can even collaborate together within workspaces to create reports and dashboards. The security settings allow authors and collaborators to control exactly who sees precisely which dashboards, reports, and data.
Accessing and creating apps
Microsoft and third parties have created built-in apps within the Service that provide bundles of data, reports, and dashboards. These apps can be subscribed to from within the Service and added to your personal Power BI workspace. Authors can even bundle their own data, reports, and dashboards into an app and distribute it to other users in their organization.
The Service allows users to schedule automatic refreshes of online and on-premises data. This means that once a refresh has been configured for a published report, the data is always current, with no further work required by the author. The Service provides its own refresh gateway for online data. On-premises data requires an on-premises data gateway to be installed on a local network.
In this chapter, we were introduced to business intelligence and its key concepts. Then, we took a broad look at the Power BI ecosystem. Finally, we explored some more specific capabilities of the Power BI Desktop and Power BI Service.
In the next chapter, we will get Power BI Desktop installed, explore its interface, and familiarize ourselves with some of its functionality.
As an activity, try to answer the following questions on your own:
- What is business intelligence?
- What does business intelligence do for an organization?
- What are the five key concepts of business intelligence?
- What are the five different types of data?
- Which word best describes everything that comprises and integrates with Power BI?
- Which tools are covered in this book?
- What can you do with Power BI?
- What is BI? Business intelligence strategies and solutions: https://www.cio.com/article/2439504/business-intelligence-definition-and-solutions.html
- Business intelligence: https://en.wikipedia.org/wiki/Business_intelligence
- Power BI pricing: https://powerbi.microsoft.com/en-us/pricing/
- Power BI Premium White Paper: https://go.microsoft.com/fwlink/?LinkId=874413&clcid=0x409
- What is Power BI Report Server?: https://docs.microsoft.com/en-us/power-bi/report-server/get-started
- Embedded analytics with Power BI: https://docs.microsoft.com/en-us/power-bi/developer/embedding