Wednesday, July 31, 2013

CALCPARALLEL in Essbase - Things worth knowing...

I am having a hard time writing an introduction for this post. I thought of opening with "Parallelism is one of the most important....", "Essbase BSO calculation engine can use multiple threads", etc....

But we all know Essbase can use multiple processors to calculate data, and that this is enabled with the SET CALCPARALLEL command. Let's see how to use it wisely.

The Essbase admin guide suggests using parallel calculation to improve performance. It often does, but not in all cases.

If your formulas have no backward dependencies and no dynamic calcs, Essbase divides a calculation into tasks so that it can run them on different threads. The CALCTASKDIMS setting specifies how many of the sparse dimensions in the outline are used to identify tasks that can run in parallel. If CALCTASKDIMS is set to 3, Essbase takes the last 3 sparse dimensions into consideration and determines the number of tasks that can run in parallel. That number is equal (or approximately equal) to the product of the stored members to be calculated in those 3 dimensions (only the members FIX'ed in the calc are counted). Essbase then divides these tasks equally among the number of threads given by the CALCPARALLEL setting.

** Essbase v11.1.2.2 is designed to determine the number of task dimensions (CALCTASKDIMS) itself, so we don't need to worry about this setting at this point.
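
As a minimal sketch, the two settings described above would sit at the top of a calc script roughly like this. The thread count and the task-dimension count are illustrative assumptions, not recommendations, and CALC ALL simply stands in for whatever the script really calculates.

```
/* Sketch only: the values below are illustrative, not recommendations */
SET CALCPARALLEL 4;    /* run the calculation on up to 4 threads */
SET CALCTASKDIMS 3;    /* use the last 3 sparse dimensions to identify tasks */

CALC ALL;              /* placeholder for the actual calculation */
```

On 11.1.2.2 the SET CALCTASKDIMS line can be left out so that Essbase picks the number of task dimensions itself.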

As usual, we can find the best CALCPARALLEL setting by trial and error. Let's see when to use CALCPARALLEL and when not to.

For example, I have a BSO application with 13 dimensions (3 Dense+12 Sparse). I ran a calculation script with CALCPARALLEL 6. This is what I found in the logs.

Maximum Number of Lock Blocks: [100] Blocks
Completion Notice Messages: [Disabled]
Calculations On Updated Blocks Only: [Disabled]
Clear Update Status After Full Calculations: [Enabled]
Calculator Cache: [Disabled].
OK/INFO - 1012678 - Calculating in parallel with [6] threads.
OK/INFO - 1012679 - Calculation task schedule [3016,71,1].
OK/INFO - 1012680 - Parallelizing using [2] task dimensions. .
OK/INFO - 1012681 - Empty tasks [2797,71,1].
OK/INFO - 1012672 - Calculator Information Message:

From the logs above, Essbase automatically decided to use 2 dimensions to identify parallel tasks (it did so because I am using 11.1.2.2; earlier versions use 1 task dimension by default). Because of the sparsity in my cube, Essbase found 2797 empty tasks out of 3016 identified tasks. Over 92% of the tasks in this calculation are empty, which is bad. So, in this case, parallelism adds nothing to performance: even though it reserved 6 processors, it is not using even 10% of them.

One interesting observation I made is that the above calculation ran faster in serial mode than in parallel mode. Besides the processors, Essbase also uses other server resources to run a calc in parallel mode. I used only 6 of the server's 32 processors for this calc, but Essbase had a hard time managing those 6 processors for a calculation where over 92% of the tasks are empty.
So, the bottom line is "DO NOT USE PARALLEL MODE JUST BECAUSE YOU HAVE RESOURCES AVAILABLE. USE PARALLEL CALC BASED ON NON-EMPTY TASKS."

So, when to use CALCPARALLEL?

I recommend parallel calculation if non-empty tasks are at least 40% of the identified tasks. We can play around with the order of dimensions in the outline and the CALCTASKDIMS setting to reduce the number of parallel tasks and empty tasks. The number of processors to use depends on the resources available, whatever else is running on the server, and so on. It is better to start with 2 processors.

We recently upgraded from 11.1.2.0 to 11.1.2.2. A guy from the Oracle development team told me that they enhanced parallelism in the new version. Instead of improving, performance degraded after the upgrade. We have since tuned parallelism in our calcs, which now run better than before. 11.1.2.2 does a better job of analyzing parallel tasks than the previous version. So, tune your CALCPARALLEL settings if calcs run longer in 11.1.2.2 than in previous versions.



Tuesday, July 30, 2013

How @XWRITE and @XREF can be used

The main intention of this post is to describe how @XREF and @XWRITE can work in a similar way.
Sometimes you will need to write or copy data between cubes, whether they are within the same application or in an application on a remote server.
@XWRITE and @XREF are two calculation functions that can be used for such operations.
For example, suppose you have two databases (cubes) called A1 and B1 with different outline structures, as shown below.
Cube :: A1 outline looks like below.
Account
    Sales
    Expenses
Period
    Q1
        Jan
        Feb
        Mar
Market
    East
    West
Scenario
    Actual
    Budget
Year
    2011
    2012
    2013
Cube :: B1 outline looks like below.
Account
    East_Sales
    East_Expenses
Period
    Q1
        Jan
        Feb
        Mar
Department
    Dept_101
    Dept_102
Year
    2011
    2012
    2013
 
@XREF: For example, the East_Sales->Jan->Dept_101->2011 intersection of the B1 cube has to get its data from the Sales->Jan->East->Budget->2011 intersection in the A1 cube. In other words, we are copying data from the A1 cube to populate the B1 cube.
The calculation script below, which is written in the B1 cube, can be used in such cases. Because B1 has no Market or Scenario dimension, East and Budget are named explicitly in the @XREF call.
FIX("Jan","Dept_101","2011")
"East_Sales"=@XREF(_A1alias_,"Sales","East","Budget");
ENDFIX

  • _A1alias_ is the location alias of the A1 cube, which acts as the source for @XREF, that is, where we get the data from. B1 is the target, to which we are copying the data.
  • @XREF always identifies the data cell by combining the member names given in the FIX statement with the members given in the @XREF call. Here Sales and Budget are members of the A1 cube, even though the calculation itself lives in the B1 cube.
  • Whenever we want the value of an intersection that refers to the other cube, we have to run this calculation. Every time it runs, it goes to the A1 cube and searches for the intersection to get the data. As we know, getting data from another cube takes some time.
  • If you want to copy more than one member from the same dimension, you can write multiple @XREF statements.
  • We can also use this function in member formulas.
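
The "multiple @XREF statements" case can be sketched as below, run in the B1 cube. This is only an illustration built from the sample outlines above: it assumes the same location alias as the earlier script, and it assumes East_Expenses should be fed from Expenses in the same way East_Sales is fed from Sales.

```
/* Sketch only: one @XREF statement per member being copied from A1 */
FIX("Jan","Dept_101","2011")
    /* East and Budget are named because B1 has no Market or Scenario dimension */
    "East_Sales"=@XREF(_A1alias_,"Sales","East","Budget");
    "East_Expenses"=@XREF(_A1alias_,"Expenses","East","Budget");
ENDFIX
```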
 
@XWRITE: This works the same way, but the source and target are interchanged. That is, we write this calculation script in the source cube, which is A1 in our case. Whenever the data is ready, we can run the calculation below to write it from the A1 cube to B1.
FIX("Jan","East","Budget","2011")
"Sales"
(
@XWRITE("Sales",_B1alias_,"East_Sales","Dept_101");
)
ENDFIX

  • If you need this kind of operation, I would suggest using @XWRITE over @XREF. @XWRITE writes the data to the target cube as soon as it is available in the source cube, and it is faster. @XREF, on the other hand, has to fetch the data from the other cube every time it is requested, so it takes time, depends on the source cube being available, and gives no guarantee that data exists for the intersection we are seeking. If the data is not there, running the calculation makes no sense.
  • Consider a cube that is accessed heavily, with many calculations running frequently, where some cells have to be populated from another cube. Using @XREF to get those values will increase your calculation time. In such a situation you can go with @XWRITE.
  • Using @XWRITE we can also write data to the same cube by using the @LOOPBACK keyword, as in the following statement: @XWRITE("Sales",@LOOPBACK,"Expenses");
  • We can also use @XWRITE in member formulas.
  • @XREF can be used for other purposes as described in the technical reference; it has its own advantages.
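
For illustration, the @LOOPBACK statement from the list above could be placed in a script in the A1 cube like this. It is a sketch that reuses the FIX members and the formula-block pattern of the @XWRITE script above; writing Sales into Expenses is only the sample statement, not a meaningful business rule.

```
/* Sketch only: @LOOPBACK makes @XWRITE write back into the same cube */
FIX("Jan","East","Budget","2011")
    "Sales"
    (
        /* copies the value of Sales into Expenses at the FIX'ed intersection */
        @XWRITE("Sales",@LOOPBACK,"Expenses");
    )
ENDFIX
```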

Deepa