Bridge table solution
We will use a simplified book sales dimensional model as an example to demonstrate our bridge solution.
Our book sales model initially has the SALES_FACT fact table and two dimension tables: BOOK_DIM and DATE_DIM. The granularity of the model is sales amount by date (daily) and by book.
Assume the BOOK_DIM table has five rows:
-
|
BOOK_SK
|
TITLE
|
AUTHOR
|
|
1
|
Programming in Java
|
King, Chan
|
|
3
|
Learning Python
|
Simpson
|
|
2
|
Introduction to BIRT
|
Chan, Gupta, Simpson (Editor)
|
|
4
|
Advanced Java
|
King, Chan
|
|
5
|
Beginning XML
|
King, Chan (Foreword)
|
The DATE_DIM has two rows:
-
|
DATE_SK
|
DT
|
|
1
|
11-DEC-2009
|
|
2
|
12-DEC-2009
|
|
3
|
13-DEC-2009
|
And, the SALES_FACT table has ten rows:
-
|
DATE_SK
|
BOOK_SK
|
SALES_AMT
|
|
1
|
1
|
1000
|
|
1
|
2
|
2000
|
|
1
|
3
|
3000
|
|
1
|
4
|
4000
|
|
2
|
2
|
2500
|
|
2
|
3
|
3500
|
|
2
|
4
|
4500
|
|
2
|
5
|
5500
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
|
|
3
|
3
|
8000
|
|
3
|
4
|
8500
|
Note that:
The columns with _sk suffixes in the dimension tables are surrogate keys of the dimension tables; these surrogate keys relate the rows of the fact table to the rows in the dimension tables.
King and Chan have collaborated in three books; two as co-authors, while in the “Beginning XML” Chan’s contribution is writing its foreword. Chan also co-authors the “Introduction to BIRT”.
Simpson singly writes the “Learning Python” and is an editor for “Introduction to BIRT”.
To analyze daily book sales, you simply run a query, joining the dimension tables to the fact table:
SELECT dt, title, sales_amt
FROM sales_fact s, date_dim d, book_dim b
WHERE s.date_sk = d.date_sk
AND s.book_sk = b.book_sk
This query produces the result showing the daily sales amount of every book that has a sale:
-
|
DT
|
TITLE
|
SALES_AMT
|
|
11-DEC-09
|
Advanced Java
|
4000
|
|
11-DEC-09
|
Introduction to BIRT
|
2000
|
|
11-DEC-09
|
Learning Python
|
3000
|
|
11-DEC-09
|
Programming in Java
|
1000
|
|
12-DEC-09
|
Advanced Java
|
4500
|
|
12-DEC-09
|
Beginning XML
|
5500
|
|
12-DEC-09
|
Introduction to BIRT
|
2500
|
|
12-DEC-09
|
Learning Python
|
3500
|
|
13-DEC-09
|
Advanced Java
|
8500
|
|
13-DEC-09
|
Learning Python
|
8000
|
You will notice that the model does not allow you to readily analyze the sales by individual writer—the AUTHOR column is multi-value, not normalized, which violates the dimension modeling rule (we can resolve this by creating a view to “bundle” the AUTHOR_DIM with the SALES_FACT tables such that the AUTHORtable connects to the view as a normal dimension. We will create the view a bit later in this section). We can solve this issue by adding an AUTHOR_DIM and its AUTHOR_GROUP bridge table.
The AUTHOR_DIM must contain all individual contributors, which you will have to extract from the books and enter into the table. In our example we have four authors.
-
|
AUTHOR_SK
|
NAME
|
|
1
|
Chan
|
|
2
|
King
|
|
3
|
Gupta
|
|
4
|
Simpson
|
The weighting_factor column in the AUTHOR_GROUP bridge table contains a fractional numeric value that determines the contribution of an author to a book. Typically the authors have equal contribution to the book they write, but you might want to have different weighting_factor for different roles; for example, an editor and a foreword writer have smaller weighting_factors than that of an author. The total weighting_factors for a book must always equal to 1.
The AUTHOR_GROUP bridge table has one surrogate key for every group of authors (a single author is considered a group that has one author only), and as many rows with that surrogate key for every contributor in the group.