PowerPivot data is in the worksheet

In a previous blog entry, I talked about developing a PowerPivot model. The model defines the data and the relationships to be used in a PowerPivot application. As the PowerPivot model is defined, the data is actually imported into the Excel spreadsheet using a compression technology called VertiPaq. The first thing you notice is that .xlsx file gets quite large. But not as large as the data in the data source. This is because the PowerPivot model stores the data in compressed format using a technology called VertiPaq.

You can look at the VertiPaq database internally by making a copy of your spreadsheet then changing the extension to .zip (from.xlsx) and then opening the .zip file. You should see a sub-folder named  xl \customData. Within that folder you should see one or more .data files of significant size depending on your data. This is the VertiPaq database.

In my testing, I compared using the PowerPivot approach versus the native Analysis Services Cube stored remotely. With PowerPivot, once you’ve defined the PowerPivot model and imported the data, all processing is local to the client. This can be good and bad. Obviously, there is no further strain on the servers after import but any complex aggregation has to be handled by the client. Effectively, Office 2010 is building a temporary cube on the fly and performing the aggregations itself. Performance-wise there was no comparison. Running SSAS on a separate server with a purpose-built cube using the Excel 2010 client to submit MDX queries under the hood worked very efficiently. This is the OLAP architecture of choice. Working with the same data using PowerPivot produced the same results but in much slower response time – seconds vs minutes. The fact that I was dealing with millions of rows was obviously a factor so it is truly amazing technology that the client can handle this processing at all.

But don’t take my word for it. Test it out yourself. Here are the samples I used in testing:

Contoso Retail BI samples (Data warehouse and Cube): http://www.microsoft.com/downloads/details.aspx?familyid=868662DC-187A-4A85-B611-B7DF7DC909FC&displaylang=en

Contoso Retail PowerPivot samples: http://www.microsoft.com/downloads/details.aspx?familyid=1AE63BFB-C303-44E3-AE44-7413D499495D&displaylang=en

PowerPivot was developed with the objective of “Self-Service BI”, so in situations where there are various data sources that need to be analyzed on an ad-hoc basis with the power in the hands of the client, this will be a very useful innovation. The big payoff is that Excel 2010 now can handle hundreds of millions of rows of data so whether you go with PowerPivot or straight SSAS or a combination of both there are major benefits with the latest release.

More on PowerPivot in the next Blog entry.

Cheers

Brian

In this article

Join the Conversation