Your message has been sent.
This article has been saved to your account.
Go to my account
This article has been emailed to your Kindle.
Send this article
In this article by Yuli Vasiliev, author of Oracle Business Intelligence: The Condensed Guide to Analysis and Reporting, you'll look at the basic concepts behind Business Intelligence. Proceeding with the discussion on data and information, it then moves on to describe what business questions you might need to answer, and how to find those answers from the data available at your disposal.
Listed as short bullets, here are the main topics of the article:
- Basic introduction to data, information, and Business Intelligence
- Answering basic business questions
- Answering probing analytical questions
- Asking business questions using data access tool
- Deriving information from existing data
- Accessing transactional and dimensional data
|Read more about this book|
(For more resources on Oracle, see here.)
Most businesses today use Business Intelligence (BI), the process of obtaining business information from available data, to control their affairs. If you're new to Business Intelligence, then this definition may leave you with the following questions:
- What is data?
- What is the information obtained from it?
- What is the difference between data and the information obtained from it?
You may be confused even more if you learn that data represents groups of information related to an object or a set of objects. Depending on your needs, though, such groups of information may or may not be immediately useful, and often require additional processing such as filtering, formatting, and/or calculating to take on a meaning.
For example, information about your customers may be organized in a way that is stored in several database tables related to each other. For security purposes, some pieces of information stored in this way may be encoded, or just represented in binary, and therefore not immediately readable. It's fairly obvious that some processing must be applied before you can make use of such information.
So, data can be thought of as the lowest level of abstraction from which meaningful information is derived. But what is information anyway? Well, a piece of information normally represents an answer to a certain question. For example, you want to know how many new customers have registered on your site this year. An answer to this question can be obtained with a certain query issued against the table containing customer registration dates, giving you the information you asked for.
Data, information, and Business Intelligence
Although the terms data and information refer to similar things, they aren't really interchangeable as there is some difference in their meaning and spirit. Talking about data, as a rule, involves its structure, format, storage, as well as ways in which you can access and manipulate it. In contrast, when talking about information, you mean food for your decision-making process. So, data can be viewed as low-level information structures, where the internal representation matters. Therefore, the ways in which you can extract useful information from data entirely depend on the structure and storage of that data.
The following diagram gives a conceptual view of delivering information from different data sets:
As you can see from the figure, information can be derived from different data sources, and by different means. Once it's derived, though, it doesn't matter where it has come from, letting its consumers concentrate on the business aspects rather than on the specifics of the internal structure. For example, you might derive some pieces of data from the Web, using the Oracle Database's XQuery feature, and then process it as native database data.
To produce meaningful information from your data, you will most likely need to perform several processing steps, load new data, and summarize the data. This is why the Business Intelligence layer usually sits on top of many data sources, consolidating information from various business systems and heterogeneous platforms.
The following figure gives a graphical depiction of a Business Intelligence system. In particular, it shows you that the Business Intelligence layer consumes information derived from various sources and heterogeneous platforms.
It is intuitively clear that the ability to solve problems is greatly enhanced if you can effectively handle all the information you're getting. On the other hand, extracting information from data coming in from different sources may become a nightmare if you try to do it on your own, with only the help of miscellaneous tools. Business Intelligence comes to the rescue here, ensuring that the extraction, transformation, and consolidation of data from disparate sources becomes totally transparent to you.
For example, when using a Business Intelligence application for reporting, you may never figure out exactly what happens behind the scenes when you instruct the system to prepare another report. The information you need for such a report may be collected from many different sources, hiding the complexities associated with handling heterogeneous data. But, without Business Intelligence, that would be a whole different story, of course. Imagine for a moment that you have to issue several queries against different systems, using different tools, and you then have to consolidate the results somehow—all just to answer a single business question such as: what are the top three customers for the preceding quarter?
As you have no doubt realized, the software at the Business Intelligence layer is used to provide a business-centric view of data, eliminating as much of the technology-specific logic as possible. What this means in practice is that information consumers working at the Business Intelligence layer may not even know that, say, customer records are stored in a Lightweight Directory Access Protocol (LDAP) database, but purchase orders are kept in a relational database.
The kind of business questions you may need to answer
As you just learned, Business Intelligence is here to consolidate information from disparate sources so that you need not concern yourself with it. Okay, but why might you need to gather and process heterogeneous data? The answer is clear. You might need it in order to answer analytical questions that allow you to understand and run your business better.
In the following two sections, you'll look at some common questions that Business Intelligence can help you answer. Then, you'll see how you can ask those questions with the help of Business Intelligence tools.
Answering basic business questions
The set of questions you may need your Business Intelligence system to answer will vary depending on your business and, of course, your corresponding functions. However, to give you a taste of what Business Intelligence can do for you, let's firrst look at some questions that are commonly brought up by business users:
- What is the average salary throughout the entire organization?
- Which customers produce the most revenue?
- What is the amount of revenue each salesman brought in over the preceding quarter?
- What is the profitability of each product?
If you run your business online, you may be also interested in hit counting and traffic analysis questions, such as the following:
- How much traffic does a certain account generate over a month?
- What pages in your site are most visited?
- What are the profits made online?
Looking at the business analysis requests presented here, a set of questions related to your own business may flash into your mind.
Answering probing analytical questions
In the preceding section, you looked at some common questions a business analyst is usually interested in asking. But bowing to the reality, you may have to answer more probing questions in your decision-making process, in order to determine changes in the business and find ways to improve it. Here are some probing analytical questions you might need to find answers to:
- How do sales for this quarter compare to sales for the preceding quarter?
- What factors impact our sales?
- Which products are sold better together?
- What are ten top-selling products in this region?
- What are the factors influencing the likelihood of purchase?
As you can see, each of these questions reflects a certain business problem. Looking through the previous list, though, you might notice that some of the questions shown here can be hard to formulate with the tools available in a computer application environment.
There's nothing to be done here; computers like specific questions. Unlike humans, machines can give you exactly what you ask for, not what you actually mean. So, even an advanced Business Intelligence application will require you to be as specific as possible when it comes to putting a question to it.
It's fairly clear that the question about finding the factors impacting sales needs to be rephrased to become understandable for a Business Intelligence application. How you would rephrase it depends on the specifics of your business, of course.
Often, it's good practice to break apart a problem into simpler questions. For example, the first question on the above list—the one about comparing quarter sales—might be logically divided into the following two questions:
- What are the sales figures for this quarter?
- What are the sales figures for the last quarter?
Once you get these questions answered, you can compare the results, thus answering the original, more generically phrased question. It can also provide one definition or variation for drill down.
In the above example, it's fairly obvious what specific questions can be derived from the generic question. There may be probing questions, though, whose derived questions are not so obvious. For example, consider the following question: What motivates a customer to buy? This could perhaps be broken down into the following questions:
- Where did visitors come from?
- Which pages did they visit before reaching the product page?
Of course, the above list does not seem to be complete—some other questions might be added.
Asking business questions using data-access tools
As you might guess, although all these questions sound simple when formulated in plain English, they are more difficult to describe when using data-access tools. If you're somewhat familiar with SQL, you might notice that most of the analytical questions discussed here cannot be easily expressed with the help of SQL statements, even if the underlying data is relational.
For example, the problem of finding the top three salespersons for a year may require you to write a multi-line SQL request including several sub-queries. Here is what such a query might look like:
SELECT emp.ename salesperson, top_emp_orders.sales sales
(SELECT all_orders.sales_empno empno, all_orders.total_sales
(SELECT sales_empno, SUM(ord_total) total_sales, RANK() OVER
(ORDER BY SUM(ord_total) DESC) sal_rank
WHERE EXTRACT(YEAR FROM ord_dt) = 2009
GROUP BY sales_empno
)top_emp_orders, employees emp
WHERE top_emp_orders.empno = emp.empno
ORDER BY sales DESC;
This might produce something like this:
If you're not an SQL guru of course, writing the above query and then debugging it could easily take a couple of hours. Determining profitability by customer, for example, might take you another couple of hours to write a proper SQL query. In other words, business questions are often somewhat tricky (if possible at all) to implement with SQL.
All this does not mean that SQL is not used in the area of Business Intelligence. Quite the contrary, SQL is still indispensable here. In fact, SQL has a lot to offer when it comes to data analysis. As you just saw, though, composing complex queries assumes solid SQL skills. Thankfully, most Business Intelligence tools use SQL behind the scenes totally transparently to users.
Now let's look at a simple example illustrating how you can get an analytical question answered with a Business Intelligence tool—Oracle BI Discoverer Plus in this particular example. Suppose you simply want to calculate the average salary sum over the organization. This example could use the records from the hr.employees demonstration table. Creating a worksheet representing the records of a database table in the Discoverer Plus focuses on issues related to analyzing data, and creating reports with the tools available through the Oracle Business Intelligence suite. For now, look at the following screenshot to see what such a worksheet might look like:
As you can see in the previous screenshot, a Discoverer Plus worksheet is similar to one in MS Excel. As in Excel, there are toolbars and menus offering a lot of options for manipulating and analyzing data presented on the worksheet. In addition, Discoverer Plus offers Item Navigator, which enables you to add data to (or remove it from) the worksheet. The data structure you can see in Item Navigator is retrieved from the database.
When we return to our example, answering the question: "what is the average salary across the organization?"Similarly, in Excel, it is as simple as selecting the Salary SUM column on the worksheet, choosing an appropriate menu, and setting some parameters in the dialog shown next. After you click the OK button in this dialog box, the calculated average will be added to the worksheet in the position specified. So, the Total dialog shown in the following screenshot provides an efficient means for automating the process of creating a total on a specified data column:
As you can see, this approach doesn't require you to write an SQL query on your own. Instead, Discoverer Plus will do it for you implicitly, thus allowing you to concentrate on business issues rather than data access issues. This previous example should have given you a taste of what Business Intelligence can do for you.
eBook Price: $23.99
Book Price: $39.99
|Read more about this book|
(For more resources on Oracle, see here.)
Deriving information from existing data
As you have no doubt realized, having data does not automatically mean having information. To turn data into meaningful information, analysis tools are required. Such tools can be implemented as standalone applications, or be integrated within a database or application server layer.
If you're dealing with data stored in a database, implementing data processing logic at the database tier seems to be most efficient. For example, Oracle Database offers a number of native features focusing on data analysis—the process of converting data into information. Sometimes, though, to obtain the required piece of information from your data, it's quite enough to issue a simple SQL query against it. For example, if you want to know the number of orders placed over a certain period of time, say, a year, this query might be as simple as the following:
SELECT count(*) FROM orders
WHERE EXTRACT(YEAR FROM ord_dt) = 2004
That is it. As you can see, the required piece of information is derived here from a relational table with a two-line SQL query. It's fairly obvious that you don't need any Business Intelligence tools to answer simple questions like the one in the previous example—SQL alone will be enough.
Unfortunately, not all the questions you have to face in practice can be answered so easily. Much more often, Business Intelligence has to operate on a larger dataset than one comprised by a single relational table, at times aggregating information from disparate data sources. This is where you are unlikely to get away with SQL alone—sophisticated Business Intelligence tools like those that the Oracle Business Intelligence suite includes are required.
However, you should have no illusions about the capabilities of Business Intelligence. It is important to realize that Business Intelligence is not a magic black box that can derive information from nowhere, answering probing analytical questions from sparse datasets. There will be a significant difference in the accuracy of the information extracted, depending on how dense the underlying dataset is.
Answering business questions from your data
As st ated above, the quantity and quality of the information you can obtain directly depends on the quantity and quality of the data available at your disposal. Not surprisingly, the more information you can collect, the more probing analytical questions you can answer.
To start with, let's look at a simple example. Suppose you need to analyze the results of a survey you conducted online to get some vital feedback from your customers. It turned out, however, that only one-tenth of the overall number of your customers participated. Of course, the information you will extract from the results of this survey cannot be considered comprehensive, as the opinion of the majority remains unclear and, therefore, only a sparse dataset has been analyzed.
If you consult a dictionary, you should see that 'sparse' is a synonym to 'scanty'. What this means here is that a sparse dataset has gaps. Say, you have sales figures for today and don't have them for yesterday and so on. However, for analysis purposes, you often need a dense dataset—one that contains no gaps. So densification is the process of filling gaps in a dataset.
While in the above example, a factor that complicated performing a quality analysis was that there was not enough incoming information; you will most likely in practice experience another kind of problem—composing the 'right' dataset from the sea of data available at your disposal. In terms of Oracle Discoverer, such datasets are called workbooks and represent business areas—collections of related information. Each workbook contains worksheets displaying data from the perspectives you want to view that data in, to obtain answers to your business questions. So when creating a workbook, you should include all the data structures required for building the worksheets you want.
Diagrammatically, the scenario involving Oracle Discoverer workbooks, each of which can be used to answer a certain set of business questions might look like the following diagram:
As you can see, Oracle Discoverer workbooks can be built on a variety of data sources, representing the derived data so that it can answer your business questions. The diagram in the figure gives you a rough idea of the mechanism used by Oracle Discoverer to provide a business view of the underlying data.
But you're probably wondering "How would I know which data structures must be included in a workbook to give enough information for getting those business questions answered?" Well, in most cases, the answer to this question is intuitively clear: you add to the workbook only those data structures that contain the data you want to see displayed at that point on the workbook's worksheets.
If you're working with relational data, then the rule of thumb is to include not only those tables whose data you want to see on worksheets, but also their related tables. It's not a big issue though, since the Discoverer wizard used to create a new workbook includes related tables by default. As you might guess, related tables may become required when the time comes to drill down through data to detail.
For example, when creating a workbook to be used for sales analysis, you would include at least the orders, customers, and employee tables. This is correct because you often need to have the ability to integrate customer, employee, and sales information for analysis. Now imagine that you want to create a worksheet for the report showing the sales of a particular region. What this means simply is that the regions table must also have been included.
However, this may not be necessary if you're going to include only the region_id field in your report. Being part of the customers table, this field serves as the foreign key that relates to the regions table.
In terms of Oracle Discoverer, database tables included in a workbook are folders, each of which includes items representing columns of the corresponding table. It's interesting to note, however, that folders and items are not necessarily based on relational tables and their columns respectively—other data structure options, such as LDAP directories and entries are also possible. So, the Oracle Discoverer documentation defines a folder as a collection of closely related information, and an item can be thought of as a piece of information of a certain type, within a folder.
Comparing and analyzing data
Now that you have a rough idea of how data can be organized for business analysis, it's time to move on and look at how you might use that data to your advantage, gaining a comprehensive view of your business.
As we have just seen, within a workbook consolidating a collection of related information, you can create a set of worksheets, each of which is to answer a certain business question. Continuing the discussion, this section touches briefly on issues related to comparing and analyzing data.
To view the data displayed on a worksheet from the perspective you need, Oracle Discoverer offers a number of components, including totals, percentages, exceptions and calculations. With these tools in hand, you'll be able to analyze data more quickly and easily.
Schematically, this might look like the following figure:
The above diagram illustrates that you can add totals, percentages, calculations, and filters to a worksheet through the corresponding Discoverer components. All these Discoverer tools are easy to use, and allow you to rearrange the worksheet's data or drill into it, so that you can see it from another perspective, with just a few clicks. In the Asking business questions using data-access tools section earlier in this article, you saw an example of how easy it is to calculate a total for a column on a worksheet.
Although Discoverer tools open up a great way to approach the problem of analyzing data, you may sometimes have to rearrange data even before it is included into a workbook. For example, in the area of comparative analyses, you may need to perform data densification before you can proceed to data analysis.
As usual, this can be best understood by example. Say, when performing a period-to-period comparison, you may face the problem of sparse data. For example, you perform day-to-day comparisons to see the inventory figures for various products on a daily basis. The problem is that the inventory table usually stores a row for a product when its quantity of available units changes. So, the rows stored in the inventory table might look like the following:
PROD_id EVENT_DT QUANT
------- --------- -----
111 05-FEB-10 15
111 08-FEB-10 7
222 05-FEB-10 17
222 07-FEB-10 14
333 05-FEB-10 7
333 08-FEB-10 25
As you can see, there are days when no event happened, and therefore, what you have here is a sparse dataset. In reports that you might want to create based on this table, you will most likely need this data to be represented differently, as a dense dataset. So, before you can include it to a workbook, a densification is needed.
The simplest way to solve this problem is to create a view based on the inventory table, so that the view contains a dense dataset. The densification might be accomplished with the help of the LAST_VALUE Oracle SQL analytic function, included in the select list of the view's SELECT statement.
The above example illustrated that the process of converting data into information you need, may start even before that data is chosen to be processed by a Business Intelligence tool. In other words, you may need to make some preparations to rearrange an underlying dataset so that it's ready for analysis.
The downside to the above approach is the lack of flexibility. However, Oracle Discoverer gives you a greater degree of flexibility.
If you need the inventory table's original rows, you'll need to make a change to the set of underlying data objects. However, Oracle Discoverer provides a better option. Windowing functions, such as LAST_VALUE discussed here, can be used to compute aggregates with the help of calculations, a Discoverer analysis feature mentioned at the beginning of this section. Like other Discoverer analysis features, calculations can be activated or deactivated with a single click, thus giving you a greater degree of flexibility.
Accessing transactional and dimensional data
It's interesting to note that a Business Intelligence system can work directly with transactional data, dealing with data reflecting current business operations. From the Business Intelligence's standpoint, though, not only current, but also the historical view of business operations is important.
Reporting against a transactional database
It's often the case that transactional data is stored in a relational database. Relational tables represent relational entities, such as products, orders, details, and customers, storing information about current transactions. As mentioned, a Business Intelligence solution can be built directly upon such a transactional system, containing data that you can use for analysis and reporting.
You might be asking yourself: isn't dimensional data what Business Intelligence is all about? Well, like figure skating is not only about jumps, Business Intelligence is not only about multidimensional cubes. Rather, it's about answering analytical business questions, deriving information from both relational and dimensional sources.
It's important to understand, though, that a transactional system should remain mobile and highly responsive, enabling new transactions to be processed and stored quickly. So, a common problem with such systems is that they are not designed to store large amounts of data—old data should be removed from time to time.
While data set optimization is good for a transactional system as the performance increases, it's not so good for analysis and reporting purposes. This is why Business Intelligence solutions are often built upon a data warehouse, consolidating both old and new data that can actually be stored in different sources.
Using historical data
Like a transactional system, a data warehouse represents a relational database. Unlike a transactional system however, a warehouse accumulates data, which represents the business history of an organization, and is structured for reporting, analysis, and decision support. In warehouses, data is organized around business entities such as products, regions, and customers.
Data in a warehouse can be organized so that it concentrates on a certain subject matter, say, purchasing; that is, it's optimized to simplify the task of finding answers to questions about purchasing. So, aside from historical data, a warehouse often uses summaries containing pre-processed data to speed up access to frequently queried information.
Using historical data can provide a clearer picture of the status of your business, leading to improved predictive capabilities. As you might guess, historical data is derived from a transactional system that often cannot afford to keep large amounts of data due to performance requirements. In fact, Business Intelligence can utilize both transactional and historical data sources for reporting and analysis purposes.
Diagrammatically, this might look like the following diagram:
It's interesting to note that although transactional data is not stored in a warehouse, many Business Intelligence tools access that data through a warehouse, taking advantage of summaries and other warehouse features.
Aggregating dimensional data
A multidimensional data model is often used to perform complex analysis of historical data. Multidimensional data is named so because it is organized by dimensions, such as products, times, customers, regions, and departments. Although there may be more than three dimensions in such data structures, they are often referred to as cubes.
The following diagram illustrates a cube that contains sales figures for groups of products for each month of Q4 in different regions.
What you can see in the figure is a three-dimensional data structure, enabling you to concentrate on solving a particular business problem. Looking at this structure, you might wonder what operations can be applied to it. Operations often applied to multidimensional data to extract meaningful information include the following:
The main purpose of this article was to give you a taste of things to come. After reading the article, you should now have a clear understanding of what Business Intelligence is about. In particular, you should have learned that Business Intelligence allows you to better understand the meaning of the information you're gathering, answering the questions about the current status of your business, how it got to be where it is, and where it is headed.
You looked at some basic ideas underlying Business Intelligence, including those related to answering business questions from data, and accessing transactional and historical data. Through a simple example, you learned how easy it is to get an analytical question answered with Oracle BI Discoverer Plus, a powerful Business Intelligence tool from the Oracle Business Intelligence suite.
In the last sections of the article, you not only learned about what data Business Intelligence deals with, but also how this data is structured, and what are common techniques often applied to these data structures.
- Oracle JRockit: The Definitive Guide [Book]
- Oracle WebCenter 11g: Portlets [Article]
- An overview of Oracle Hyperion Interactive Reporting [Article]
- Debugging PL/SQL in Oracle SQL Developer 2.1 [Article]
- Managing User Accounts in Oracle Siebel CRM 8 [Article]
eBook Price: $23.99
Book Price: $39.99
About the Author :
Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open-source development, Oracle technologies, and service-oriented architecture (SOA). He has over 10 years of software development experience as well as several years of technical writing experience. He wrote a series of technical articles for Oracle Technology Network (OTN) and Oracle Magazine.