Chapter 12. Developing and Implementing a Simple Datamart
In this chapter you will develop a simple but complete process of loading a datamart while reviewing all concepts you learned throughout the book.
The chapter will cover the following:
Introduction to a sales datamart based on the Jigsaw puzzles database
Loading the dimensions of the sales datamart
Loading the fact table for the sales datamart
Automating what has been done
Exploring the sales datamart
In Chapter 9, you were introduced to star schemas. In short, a star schema consists of a central table known as the fact table, surrounded by dimension tables. While the fact has indicators of your business such as sales in dollars, the dimensions have descriptive information for the attributes of your business such as time, customers, and products.
A star that addresses a specific department's needs or that is built for use by a particular group of users is called a datamart. You can have datamarts focused on customer relationship management, inventory, human resources management, budget, and more. In this chapter, you will load a datamart focused on sales.
Sometimes the term datamart is confused with datawarehouse. However, datamarts and datawarehouses are not the same.
Note
The main difference between datamarts and datawarehouses is that datawarehouses address the needs of the whole organization, whereas a datamarts addresses the needs of a particular department...
As you saw, the sales star model consists of a fact surrounded by the dimension tables. In order to load the star, first you have to load the dimensions. You already learned how to load dimension tables. Here you will load the dimensions for the sales star.
Time for action – loading dimensions for the sales datamart
In this tutorial, you will load each dimension for the sales datamart and enclose them into a single job. Before starting, check the following things:
Check that the database engine is up and that both the js
and the js_dw
databases are accessible from PDI.
If your time dimension table, lk_time
, has data, truncate the table. You may do it by using the Truncate table [lk_time] option in the database explorer.
Tip
You may reuse the js_dw
database in which you have been loading data in previous chapters. There is no problem with that. However, creating a whole new database is preferred so that you can see how the entire process works.
The explanation will be focused on the general process. For details of creating a transformation that loads a particular type of dimension, please refer to Chapter 9. You can also download the full material for this chapter where the transformations and jobs are ready to browse and try.
Create a new transformation...
Extending the sales datamart model
You may, and you usually, have more than one fact table sharing some of the dimensions. Look at the following diagram:
It shows
two stars sharing three dimensions: Regions, Manufacturers, and Time. The star model to the left is the sales star model you already know. The star model to the right doesn't have data for accessories, but does have more detail for puzzles such as the number of pieces they have or the category or theme they belong to. When you have more than one fact table sharing dimensions as here, you have what is called a constellation
.
The following table summarizes the dimensions added to the datamart: