Oracle Warehouse Builder 11g R2: Getting Started 2011 — Save 50%
Extract, Transform, and Load data to build a dynamic, operational data warehouse with Oracle Warehouse Builder 11g R2 with this book and eBook
In this article by Bob Griesemer, author of Oracle Warehouse Builder 11g R2: Getting Started 2011, we'll discuss briefly what a Time dimension is, and then we'll dive right into the Warehouse Builder Design Center and create one. A Time dimension is a key part of most data warehouses. It provides the time series information to describe our data. A key feature of data warehouses is being able to analyze data from several time periods and compare results between them. The Time dimension is what provides us the means to retrieve data by time period.
|Read more about this book|
(For more resources on Oracle, see here.)
Do not be confused by the use of the word Time to refer to this dimension. In this case, it does not refer to the time of day but to time in general which can span days, weeks, months, and so on. We are using it because the Warehouse Builder uses the word Time for this type of dimension to signify a time period. So when referring to a Time dimension here, we will be talking about our time period dimension that we will be using to store the date. We will give the name Date to be clear about what information it contains.
Every dimension, whether time or not, has four characteristics that have to be defined in OWB:
- Dimension Attributes
- Level Attributes
The Levels are for defining the levels where aggregations will occur, or to which data can be summed. We must have at least two levels in our Time dimension. While reporting on data from our data warehouse, users will want to see totals summed up by certain time periods such as per day, per month, or per year. These become the levels. A multidimensional implementation includes metadata to enable aggregations automatically at those levels, if we use the OLAP feature. The relational implementation can make use of those levels in queries to sum the data. The Warehouse Builder has the following Levels available for the Time dimension when using the Time Dimension Wizard, which we'll discuss in a moment:
- Fiscal week
- Calendar week
- Fiscal month
- Calendar month
- Fiscal quarter
- Calendar quarter
- Fiscal year
- Calendar year
The Dimension Attributes are individual pieces of information we're going to store in the dimension that can be found at more than one level. Each level will have an ID that identifies that level, a start and an end date for the time period represented at that level, a time span that indicates the number of days in the period, and a description of the level.
Each level has Level Attributes associated with it that provide descriptive information about the value in that level. The dimension attributes found at that level and additional attributes specific to the level are included. For example, if we're talking about the Month level, we will find attributes that describe the value for the month such as the month of the year it represents, or the month in the calendar quarter. These would be numbers indicating which month of the year or which month of the quarter it is.
We must also define at least one Hierarchy for our Time dimension. A hierarchy is a structure in our dimension that is composed of certain levels in order; there can be one or more hierarchies in a dimension. Calendar month, calendar quarter, and calendar year can be a hierarchy. We could view our data at each of these levels, and the next level up would simply be a summation of all the lower-level data within that period. A calendar quarter sum would be the sum of all the values in the calendar month level in that quarter, and the multidimensional implementation includes the metadata to facilitate these kinds of calculations. This is one of the strengths of a multidimensional implementation.
The good news is that the Warehouse Builder contains a wizard that will do all the work for us—create our Time dimension and define the above four characteristics - just by asking us a few questions.
We could use the regular dimension wizard to create a dimension to use as a time dimension and could define our own levels but in our case, it's much simpler to make use of the special Time Dimension Wizard to create it for us.
Creating a Time dimension with the Time Dimension Wizard
Let's start creating our Time dimension by launching Design Center if it's not already running. In the Project Navigator window, we're going to expand the Databases node under ACME_DW_PROJECT, and then our ACME data warehouse node ACME_DWH. We will right-click on the Dimensions node, and select New to display a dialog that will show the list of options for creating a new dimension. An example of that screen is shown next since it is a new feature of this release:
We'll select the Time Dimension and click OK to launch the Create Time Dimension Wizard.
The New Gallery window that we just saw is actually available from the pop-up menu on any of the nodes by selecting the New... menu entry. Most of the nodes, such as tables and dimensions, also contain a New menu entry (New Table, New Dimension, and so on.) that launches right into the wizard as if the New Gallery were displayed and the first option selected. Up until now we've been able to use that default New option to create our objects. In the previous release of the Warehouse Builder, if there were multiple options for the New menu option they appeared as sub-menus. This release combines those options into a list in this New Gallery window.
The Time Dimension Wizard will walk us through a six-step process to define the characteristics of our Time dimension. The first screen will describe these steps for us, which is shown here so we can see what it will be asking us:
- The first step of the wizard will ask us for a name for our Time dimension. We're going to call it DATE_DIM. If we try to use just DATE, it will give us an error message because that is a reserved word in the Oracle Database; so it won't let us use it.
- The next step will ask us what type of storage to use for our new dimension, shown as follows:
Here we get to designate whether we want a relational physical implementation in the database or a multidimensional implementation. This is what was referred as checking a box to switch between the two. Simply select one or the other, and this is how our design will be implemented in the database with no changes by us required at all.
New in release 11gR2 is the support for cube materialized views in the database for the ROLAP option so there are actually two ROLAP options to choose from but either one results in a relational implementation in the database. Oracle OLAP 11.1 introduced the concept of materialized views for cubes for query performance improvement and this new release of the Warehouse Builder now supports the ability to use them. See the Oracle OLAP Users Guide for more information on using cube materialized views at the following URL: http://download.oracle.com/docs/cd/E11882_01/olap.112/e17123/toc.htm.
We're going to implement our data warehouse using the pure relational option and not implement any OLAP features so we're going to select ROLAP: Relational Storage, as shown in the image above and will not select the materialized view option. Both the pure relational implementation and the relational OLAP option are available by selecting the ROLAP option here. We can set a deployment configuration option that defaults to deploying data objects only. But this can be changed to deploy the OLAP metadata to the OLAP catalog also. In both cases, this will result in the generation of relational database objects in a star schema. However, if that option is selected, it will only store the OLAP metadata in the OLAP catalog in the database. We'll see where to set that option when we look at the data object editor and configuring a dimension.
- Now this brings us to step 3, which asks us to specify the data generation information for our dimension. The Time Dimension Wizard will be automatically creating a mapping for us to populate our Time dimension and will use this information to load data into it. It asks us what year we want to start with, and then how many total years to include starting with that year. The numbers entered here will be determined by what range of dates we expect to load the data for, which will depend on how much historical data we will have available to us. We have checked with the DBAs for ACME Toys and Gizmos Company to get an idea of how many years' worth of data they have and have found out that there is data for 2007, 2008, and 2009 available to us. Based on this information, we're going to set the start year to 2007 with the number of years set to three to bring us up to 2009.
The other option available to us on the data generation step is the type of Time dimension to create. It can be based on a calendar year or fiscal year. This provides us with the flexibility to define our Time dimension based on what our company actually uses for its financial year. ACME Toys and Gizmos Company operates on a calendar-year basis, so we'll leave it set at calendar.
eBook Price: $32.99
Book Price: $54.99
|Read more about this book|
(For more resources on Oracle, see here.)
- This step is where we choose the hierarchy and levels for our Time dimension. We have to select one of the two hierarchies. We can use the Normal Hierarchy of day, month, quarter, and year; or we can choose the Week Hierarchy, which consists of two levels only—the day and the calendar week. Notice that if we choose the Week Hierarchy, we won't be able to view data by month, quarter, or year as these levels are not available to us. This is seen in the following image:
The levels are not available to us because a week does not roll up or aggregate to a month. Some months have four weeks while some have five, and that's not even exact weeks. The only month that has a month evenly divided by weeks is February, and that's only during non-leap years. So, we can see that weeks do not sum up nicely into months, or any higher level of time. How about a year? Surely, that must sum up nicely we might say, as aren't there 52 weeks in a year? Multiply 52 by 7 and we get 364 days. So, even that won't work. Thus, if we choose to model weeks as one of our levels, we get day and week and that's it.
This points out an important aspect of aggregation when deciding what our levels should be. It's very important to keep that idea of aggregation or summing in mind when choosing levels, or we will end up with data that doesn't make sense. The Time Dimension Wizard will not allow us to choose levels that don't sum up correctly because it has predefined a list of levels for us to choose from, with preset hierarchies. However, when defining any other dimension type, we'll definitely have to keep this in mind as we'll be specifying levels and hierarchies ourselves rather than choosing from the predefined ones.
We're going to select the normal hierarchy, and now we can choose which of the levels to include. It is always a good idea to include the lowest level possible in our hierarchy to provide maximum flexibility in aggregating data in this dimension. If we leave out day, then we will never be able to view our data by day, but only by month at the lowest level.
- Let's move on to step 5 where the wizard will provide us the details about what it is going to create. An example is shown in the following image, which is what you should see if you've made all the same selections as we've moved along. In the following image we can see the dimension attributes, levels, and hierarchies that will be created:
We can also see an extra item at the bottom that we haven't discussed yet, a map name, if we scroll the window down to the bottom as shown next:
The DATE_DIM_MAP map entry that we can see in the previous image is a mapping for our DATE_DIM dimension, which can be run to populate the dimension. It will be created automatically for us by the wizard.
The previous version of the Warehouse Builder used to create a sequence also which was used to populate ID fields in the Time Dimension that were used as a Surrogate Identifier. A surrogate identifier is a value that stands in (acts as a surrogate) for the actual unique identifier. The actual identifier is called a Business Identifier and contains one or more attributes that are selected to uniquely identify a dimension record. We'll see these surrogate identifiers in a moment when we create the Product and Store dimensions but for Time Dimensions, it is no longer used in this new release. In its place is an actual date field for the day so there is no need for a surrogate. That date field also acts as the Business Identifier for a Time dimension, which makes it much more intuitive to use.
- Continuing to the last step, it will display a progress bar as it performs each step and will display text in the main window indicating the step being performed. When it completes, we click on the Next button and it takes us to the final screen—the summary screen. This screen is a display of the objects it created and is similar to the previous display in step 5 of 6 that shows the pre-create settings. At this point, these objects have been created and we press the Finish button. Now we have a fully functional Time dimension for our data warehouse.
We could use a data object editor to create our Time dimension, but we would have to manually specify each attribute, level, hierarchy, and sequence to use. Then we would have to create the mapping to populate it. So we definitely saved quite a bit of time by using the wizard.
The Time Dimension Wizard does quite a bit for us. Not only does it create the Time dimension, but it also creates a couple of additional objects needed to support it. Take a look at the following image, which is what our Project Navigator looks like after running this wizard:
Besides the dimension that it created, we now have a mapping that appears under the Mappings node. This is what we will deploy and run to actually build our Time dimension. We can also see that a table was created under the Tables node. This is the physical table that will be created in the database to store the dimension data. We are designing dimensionally and implementing relationally in the database and this is the relational table used for the implementation.
In this article we created the Time dimension using the Warehouse Builder Design Center.
- An Overview of Oracle Advanced Pricing [Article]
- Creating a Cube in Oracle Warehouse Builder [Article]
- Creating Dimensions in Oracle Warehouse Builder [Article]
- Oracle Warehouse Builder: Data Warehouse Design [Article]
- An overview of Oracle Hyperion Interactive Reporting [Article]
- Oracle Warehouse Builder: Designing the Target Structure [Article]
About the Author :
Bob Griesemer has over 27 years of software and database engineering/DBA experience in both government and industry, solving database problems, designing and loading data warehouses, developing code, leading teams of developers, and satisfying customers. He has been working in various roles involving database development and administration with the Oracle Database with every release since Version 6 of the database from 1993 to the present. He has also been performing various tasks, including data warehouse design and implementation, administration, backup and recovery, development of Perl code for web-based database access, writing Java code utilizing JDBC, migrating legacy databases to Oracle, and developing Developer/2000 Oracle Forms applications. He is currently an Oracle Database Administrator Certified Associate, and is employed by the Northrop Grumman Corporation, where he is currently a Senior Database Analyst on a large data warehouse project.