Tuesday, December 31, 2013

Optimizing Report Scripts in ASO Applications

Report scripts are one of the more powerful reporting options in Essbase. They are easy to write and give us the flexibility to select the members and format we need.

Performance always plays a key role here. How do we get the best performance from a report script? As in many other areas of Essbase, it is achievable by tweaking and testing various settings.

Let's look at the report script from a different angle.

Based on my understanding, the Essbase engine runs a report in two phases: first, fetching the data and preparing it in the format we need; second, writing the report to an output file. There are different options for improving each phase.

Let's start with the second phase, where the data is exported to a file (this is where I get the bigger performance gain).

I always get a dramatic performance improvement when I arrange the dimensions in the ROW command in ascending order: the dimension with the fewest members being exported in the report goes leftmost, and the dimension with the most members being exported goes rightmost.

**** Here is the confusing part: I said the number of members in your report script, not the number of members in your outline. There is a difference between the two.

For example, I have an outline with 5 dimensions:

Account - with 10000 members
Entity - with 5000 members 
Product - with 500 members
Market - with 50 members
Time - with 12 months 

I want to export a report with

100 Accounts
5000 Entities
500 Products 
50 Markets
1 Month

Then my ROW command looks like this:

<ROW("Time", "Market", "Accounts", "Products", "Entities")

As I said earlier, I arranged the dimensions in ascending order based on the number of members I am exporting in the report, not on the number of members in the outline. This trick drastically improves the second phase of the report, i.e. writing the report to an output file. Some of my reports improved from 100 KB/sec to 5 MB/sec.
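The ordering rule above can be sketched in a few lines of Python. This is only an illustration: the helper name `row_command` and the dictionary of counts are my own, and the counts are the report-level numbers from the example, not outline sizes.

```python
def row_command(report_member_counts):
    """Build a <ROW line with dimensions sorted by ascending member count.

    report_member_counts maps a dimension name to the number of members
    the report actually exports (not the number in the outline).
    """
    # Fewest exported members first (leftmost), most last (rightmost).
    ordered = sorted(report_member_counts, key=report_member_counts.get)
    quoted = ", ".join(f'"{name}"' for name in ordered)
    return f"<ROW({quoted})"


# Report-level counts from the example above.
counts = {
    "Accounts": 100,
    "Entities": 5000,
    "Products": 500,
    "Market": 50,
    "Time": 1,
}
print(row_command(counts))
```

Note that Accounts, the largest dimension in the outline, lands in the middle here because only 100 of its members are in the report.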

Coming to the first phase:
Cache settings at the application and database level play the main role here.
The "Pending cache setting" at the application level confuses me a lot. According to the Database Administrator's Guide:

"If the input-level data size is greater than 2 GB by some factor, the aggregate storage cache can be
increased by the square root of the factor. For example, if the input-level data size is 3 GB
(2 GB * 1.5), multiply the aggregate storage cache size of 32 MB by the square root of 1.5, and
set the aggregate cache size to the result: 39.04 MB."

But that never worked for me on cubes with more than 5 GB of input-level data. I got good performance only when I set the cache much higher than the formula recommends. For example, I got optimal report performance on ASO cubes when I set the pending cache to 512 MB with 10 GB of input-level data, and to 1024 MB with 40 GB of input-level data. I do not have a specific recommendation for the pending cache; it's better to test with various values and find the sweet spot.
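To see how far those values are from the guide's formula, here is a minimal Python sketch of the square-root rule as quoted above. The function name and constants are my own labels; the 32 MB default and 2 GB baseline come from the quoted passage, and returning the default for data at or below 2 GB is my assumption.

```python
import math

DEFAULT_CACHE_MB = 32.0   # default aggregate storage cache from the quoted guide
BASELINE_INPUT_GB = 2.0   # input-level size the default is meant to cover


def guide_cache_mb(input_level_gb):
    """Cache size in MB suggested by the square-root rule in the guide."""
    if input_level_gb <= BASELINE_INPUT_GB:
        # Assumption: at or below the baseline, keep the default size.
        return DEFAULT_CACHE_MB
    factor = input_level_gb / BASELINE_INPUT_GB
    return DEFAULT_CACHE_MB * math.sqrt(factor)


for size_gb in (3, 10, 40):
    print(f"{size_gb} GB input -> {guide_cache_mb(size_gb):.2f} MB")
```

For 3 GB this gives about 39.19 MB; the guide's 39.04 MB is what you get if the square root is rounded to 1.22 first. For 10 GB and 40 GB the rule suggests roughly 72 MB and 143 MB, far below the 512 MB and 1024 MB that worked best in my tests.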

"Buffer size" and "Sort buffer size" also help to speed up report scripts. Play around with different settings to find out what works best for you.

Building aggregations on a cube will speed up a report script if you are pulling data at the parent level.

You can also try using <LEAVES instead of <DIMBOTTOM, and try to avoid dynamic members in your member selection.