Reader small image

You're reading from  Instant Creating Data Models with PowerPivot How-to

Product typeBook
Published inApr 2013
PublisherPackt
ISBN-139781849689564
Edition1st Edition
Right arrow
Author (1)
Taehyung Lee
Taehyung Lee
author image
Taehyung Lee

Leo Taehyung Lee is an outgoing individual with a diverse background and experiences. His studies in engineering along with the pursuit of a career in finance helped him work in various positions across diverse fields from R&D to patents as well as financial risk management. He has over 3 years of extensive experience in Excel, VBA, and is a Microsoft Certified Excel Expert in Excel 2010.
Read more about Taehyung Lee

Right arrow

Filtering the data to be imported (Intermediate)


PowerPivot stores all the data that it imported from other sources. Despite the fact that PowerPivot is able to handle millions of rows of data (based on memory availability), it is always more efficient and effective to only import the relevant data. We will re-import data from the database and from Excel, and filter the data in the process.

Getting ready

In order to filter data, it is necessary for us to understand which data is relevant and which is not. The term "more the merrier" does not apply in this case, as we will only use these filtered data that are relevant throughout this book.

How to do it...

Reproduce the steps for importing data from the database as follows:

  1. In Excel 2010, open up the PowerPivot window. Select From Database | From SQL Server. Select the details as shown in the following screenshot. If the server name is not available, then type in PC-NAME\SQLEXPRESS.

  2. However, this time, click the Preview & Filter button as highlighted in the following screenshot for each of the 13 tables.

  3. It will display a preview of every table that we are importing—and we haven't explored what each column means. However, one easy thing is to remove logs and other items that may be necessary for the database, but not for us. This may include logs, last modified date, last user modified, and so on. For the first table Address, deselect the last two columns rowguid and ModifiedDate as shown in the following screenshot:

  4. After deselecting, click on OK and we will see that new comment, Applied filters, will appear for the table under the column Filter Details, as shown in the screenshot in step 2. Repeat this procedure for every table, deselecting the data that we are sure will not be useful. Save the final datafile as DatabaseData_v2.xlsx. The file should be about 5 MB, around 11 to 12 MB smaller than the original one.

  5. Similarly, repeat the step for the GDP datafile by deselecting the columns from 1970 to 1999. Import and then save the final data as GDPData_v2.xlsx. This filtered data should be about 375 KB, less than half of the size of the original file that would be 848 KB.

How it works...

Before, even though we only selected 13 tables out of the whole database, each table still contained unnecessary items such as logs. Depending on how large the database is, the amount of logs may be very large and space-consuming, and by simply filtering out those logs and/or other irrelevant data, we are able to work more efficiently and effectively with the correct sets of data. PowerPivot allows this by enabling the user to filter the data during the importation process by unchecking the unnecessary columns, so that only the necessary columns that were checked are imported, as we have previously practiced.

By simply comparing the file size, it is evident that filtering data becomes more necessary as the amount of data for analysis becomes larger. The data that we imported from the database is much less than hundreds of thousands of rows, but we already managed to save about 35 percent of the file size (and the corresponding calculation time) by unchecking non-significant columns during the import stage.

There's more...

As we become more familiar and proficient with data, we can even become more selective, selecting only the most relevant tables with the necessary columns. However, it requires a complete understanding of the database structure.

Previous PageNext Page
You have been reading a chapter from
Instant Creating Data Models with PowerPivot How-to
Published in: Apr 2013Publisher: PacktISBN-13: 9781849689564
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Taehyung Lee

Leo Taehyung Lee is an outgoing individual with a diverse background and experiences. His studies in engineering along with the pursuit of a career in finance helped him work in various positions across diverse fields from R&D to patents as well as financial risk management. He has over 3 years of extensive experience in Excel, VBA, and is a Microsoft Certified Excel Expert in Excel 2010.
Read more about Taehyung Lee