# Normalizing Dimensional Model

Your details (so we can tell your friend who this is from) *
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
9
K
k
G
g
k
by Djoni Darmawikarta | January 2010 | Architecture & Analysis

In the Solving Many-to-many Relationship in Dimensional modeling article published in December 2009, a fact is added to solve the author-to-book many-to-many relationship in the book sales dimensional model, effectively normalizing the author out of the book dimension into its own author dimension.

In this article by Djoni Darmawikarta, we will discuss how to solve other denormalised structures in a dimensional model.

# First Normal Form Violation in Dimension table

Let’s revisit the author problem in the book dimension. The AUTHOR column contains multiple authors, a first-normal-form violation, which prevents us from querying book or sales by author.

BOOK dimension table

 BOOK_SK TITLE AUTHOR PUBLISHER CATEGORY SUB-CATEGORY 1 Programming in Java King, Chan Pac Programming Java 2 Learning Python Simpson Pac Programming Python 3 Introduction to BIRT Chan, Gupta, Simpson (Editor) Apes Reporting BIRT 4 Advanced Java King, Chan Apes Programming Java

Normalizing and spinning off the authors into a dimension and adding an artificial BOOK AUTHOR fact solves the problem; it is an artificial fact as it does not contain any real business measure.  Note that the Editor which is an author’s role is also “normalized” into its column in the AUTHOR table (It is related to author, but not actually an author’s name).

AUTHOR table

 AUTHOR_SK AUTHOR_NAME 1 King 2 Chan 3 Simpson 4 Gupta

BOOK AUTHOR table

 BOOK_SK AUTHOR_SK ROLE COUNT 1 1 Co-author 1 1 2 Co-author 1 2 3 Author 1 3 2 Co-author 1 3 3 Editor 1 3 4 Co-author 1 4 1 Co-author 1 4 2 Co-author 1

Note the artificial COUNT measure which facilitates aggregation always has a value of numeric 1.

`&lt;!--mce:2-->SELECT name, SUM(COUNT) FROM book_author ba, book_dim b, author_dim aWHERE ba.book_sk = b.book_sk AND ba.author_sk = a.author_skGROUP BY name`

You might need to query sales by author, which you can do so by combining the queries of each of the two stars (the two facts) on their common dimension (BOOK dimension), producing daily book sales by author.

`SELECT dt, title, name, role, sales_amt FROM (SELECT book_sk, dt, title, sales_amt FROM sales_fact s, date_dim d, book_dim b WHERE s.book_sk = b.book_sk AND s.date_sk = d.date_sk) sales,  (SELECT b.book_sk, name, role FROM book_author ba, book_dim b, author_dim a WHERE ba.book_sk = b.book_sk AND ba.author_sk = a.author_sk) authorWHERE sales.book_sk = author.book_sk`

# Single Column with Repeating Value in Dimension table

Columns like the PUBLISHER, though not violating any normal form, is also good to get normalized, which we accomplish by adding an artificial fact, PUBLISHED BOOK fact, and its own dimension, PUBLISHER dimension.

This normalization is not exactly the same as that in normalizing first-normal-form violation; the PUBLISHER dimension can correctly be linked to the SALES fact, the publisher surrogate key must be added though in the SALES fact.

PUBLISHER table

 PUBLISHER_SK PUBLISHER 1 Pac 2 Apes

BOOK PUBLISHER table

 BOOK_SK PUBLISHER_SK COUNT 1 1 1 1 2 1 2 3 1 3 2 1 3 3 1 3 4 1 4 1 1 4 2 1

# Related Columns with Repeating Value in Dimension table

CATEGORY and SUB-CATEGORY columns are related, they form a hierarchy. Each of them can be normalized into its own dimension, but they need to be all linked into one artificial fact.

# Non-Measure Column in Fact table

The ROLE column inside the BOOK AUTHOR fact is not a measure; it violates the dimensional modeling norm; to resolve we just need to spin it off into its own dimension, effectively normalizing the fact table.

ROLE dimension table and sample rows

 ROLE_SK ROLE 1 Author 2 Co-Author 3 Editor

BOOK AUTHOR table with normalized ROLE

 BOOK_SK AUTHOR_SK ROLE_SK COUNT 1 1 2 1 1 2 2 1 2 3 1 1 3 2 2 1 3 3 3 1 3 4 2 1 4 1 2 1 4 2 2 1

# Summary

This article shows that both dimensional table and fact table in a dimensional model can be normalized without violating its modeling norm.

## Oracle SOA Suite Developer's Guide

 Design and build Service-Oriented Architecture Solutions with the Oracle SOA Suite 10gR3
Published: March 2009
eBook Price: \$47.99
Book Price: \$79.99
See more

Djoni Darmawikarta built his career in IBM Asia Pacific and Canada as a software engineer, international consultant, instructor and project manager, for a total of 17 years. He's currently a technical specialist in the Data Warehousing and Business Intelligence team of a Toronto-based insurance company. Outside of his office works, Djoni writes IT articles and books.