Friday 24 February 2017

Improving Performance In Qlikview Apps

Introduction

 

In every area of development, performance is an important consideration and a measure of efficiency that should be taken seriously. Some might say you can simply throw hardware at any performance issue and that will fix it, but sadly that is not always the case. Qlikview is a memory-intensive Business Intelligence tool and, granted, more memory would often help, but in many organisations budgets are limited, so it is important to streamline the model as much as possible.
In some cases you only feel the need to optimise your model as the data grows, but in principle performance should be at the top of your agenda throughout development anyway.

Method 

 

To identify the areas that need performance improvement, we need a way to see the metadata of the model: all the information such as tables, dimensions and expressions, and how each maps to the total memory used. Rob Wunderlich has written a document analyser that allows us to do exactly this.


Steps:
1.     Go to: http://qlikviewcookbook.com/tools/#squelch-taas-accordion-shortcode-content-3 and download the QV Document Analyser.
2.     Store it anywhere on your disk, but preferably where the model is stored.
3.     Create a copy: one will hold the metadata before the analysis and the other the metadata after the performance changes, so the two can be compared.
4.     Open the analyser and point it to the model you want to improve.
a.     In our case, we will be analysing the Unit Testing model I built before.

Results

 

The data used for this test comes from the MySQL sample database on localhost. There are roughly 4,000 rows in the largest table. Although the data is not substantial, it is enough to demonstrate the concepts covered in this post; a larger data model would show the gains more clearly.


Before Tuning


1.     The analysis took 5 seconds
2.     It took 1.3 seconds to open the document
3.     28 fields were used. (In Client models, you’d find a fair number of fields being loaded but never used)
4.     It takes 141 milliseconds to open an object
5.     Uses 1.07MB of RAM

 


 

As we can see from the recommendations, there are a few things we can do to improve this model, namely autonumbering two of its keys. In a bigger model there would be far more recommendations, each showing the amount of memory you could save by implementing the change.

 

 

 The memory used before the recommendations were implemented stood at: 1,118,049 bytes.

 

After Tuning

1.     The analysis took 5 seconds
2.     It took 1.3 seconds to open the document
3.     28 fields were used. (In Client models, you’d find a fair number of fields being loaded but never used)
4.     It takes 125 milliseconds to open an object
5.     Uses 1.07MB of RAM

 
 

The memory used after the recommendations were implemented now stands at: 1,117,571 bytes  



  

Analysis

 

The results above show the power of using the document analyser to improve your model, but it is not the only way to improve performance. There are a number of other areas worth looking at, outlined below.


Remove Synthetic Keys

Synthetic keys are formed when two or more tables share more than one common field. To solve this, we can rename the fields concerned, join the tables, or simply not load the extra fields, depending on the needs of the model.
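As a minimal sketch (the table and field names below are made up for illustration), suppose Orders and Shipments both contain OrderID and CustomerID; renaming one of the two shared fields leaves a single common field and no synthetic key:

    Orders:
    LOAD OrderID,
         CustomerID,
         OrderDate
    FROM Orders.qvd (qvd);

    Shipments:
    LOAD OrderID,                            // the intended key between the two tables
         CustomerID AS ShipmentCustomerID,   // renamed so only one common field remains
         ShipDate
    FROM Shipments.qvd (qvd);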


Use of Autonumber

“This script function returns a unique integer value for each distinct evaluated value of expression encountered during the script execution.” (https://help.qlik.com/en-US/qlikview/12.1/Subsystems/Client/Content/Scripting/CounterFunctions/autonumber.htm)
Using this function, we can create keys based on integers instead of strings or other data types that would use more system resources.
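A minimal sketch, again with illustrative table and field names: applying the same AutoNumber() expression on both sides of the association replaces a long string key with a compact integer:

    Orders:
    LOAD AutoNumber(OrderGUID) AS %OrderKey,   // integer surrogate replaces the long string key
         OrderDate,
         Amount
    FROM Orders.qvd (qvd);

    OrderDetails:
    LOAD AutoNumber(OrderGUID) AS %OrderKey,   // the same input value yields the same integer
         ProductID,
         Quantity
    FROM OrderDetails.qvd (qvd);

Note that AutoNumber values are only consistent within a single script execution, so both tables must be keyed with the same expression in the same reload.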


Model Towards a Star Schema

Although in some cases the structure of the data calls for a snowflake schema, a star schema greatly improves the performance of the model. This comes down to the number of hops Qlikview has to make between tables. Please see https://analyticsandeverythingelse.blogspot.co.za/2016/03/star-schema-vs-snowflake.html for more on hops.
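One common way of moving towards a star schema is to join the outer lookup tables into the dimension they hang off. The sketch below assumes a hypothetical Product dimension snowflaked out to a separate Category table:

    Product:
    LOAD ProductID,
         ProductName,
         CategoryID
    FROM Product.qvd (qvd);

    LEFT JOIN (Product)
    LOAD CategoryID,
         CategoryName
    FROM Category.qvd (qvd);

    // CategoryName now sits directly on the Product dimension,
    // so the fact table is one hop away from every product attribute.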


Using Simple Expressions

It is worth noting that complicated expressions should be handled in the back-end script, preferably in a Tier 2 model where all the heavy lifting is done. The front end will perform much better if, for example, a chart uses an expression such as sum({<isCalculated = {'Yes'}>} Sales) against a pre-calculated flag instead of evaluating the whole If() calculation for isCalculated at chart time.
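A minimal sketch of that split, using a hypothetical business rule in place of the real calculation: the flag is computed once per row at reload time, and the chart expression only filters on it:

    Transactions:
    LOAD *,
         // hypothetical rule; the real heavy logic would go here, evaluated once per row at reload
         If(Sales > 0 and Len(Trim(InvoiceID)) > 0, 'Yes', 'No') AS isCalculated
    FROM Transactions.qvd (qvd);

    // Front-end chart expression (cheap, only filters on the pre-computed flag):
    // sum({<isCalculated = {'Yes'}>} Sales)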


Dropping Temp tables and unused fields

By dropping unused fields and temporary tables, we remove them from memory and make the model more efficient. If we leave them loaded, they simply sit in memory without contributing anything to the model, and if left unattended they can clutter the data model and slow it down.
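A short sketch with illustrative names: a staging table is used to shape the data and is then dropped once the final table has been built:

    // Staging table used only to prepare the data
    TempSales:
    LOAD OrderID,
         Amount,
         InternalComment
    FROM Sales.qvd (qvd);

    // Final table keeps only the fields the front end actually uses
    Sales:
    NOCONCATENATE LOAD OrderID,
         Amount
    RESIDENT TempSales;

    DROP TABLE TempSales;   // the staging table (and InternalComment with it) leaves memory

If a single stray field has already made it into a final table, DROP FIELD FieldName; removes just that field.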


Circular Loops

It is undesirable to have multiple common keys across multiple tables in a Qlikview data structure, as this may cause Qlikview to create circular references to connect the tables. Circular references are generally resource heavy, may slow down calculations and, in extreme cases, can overload the application. To avoid this, we can (see the sketch after this list):
  • Comment out the fields in the load script
  • Rename the fields in the load script
  • Rename the fields automatically using the QUALIFY / UNQUALIFY statements
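A minimal sketch of the renaming option, with made-up tables: Orders links to Customers and to Employees, and both of those also carry RegionID, which would close a loop; renaming one occurrence breaks it:

    Orders:
    LOAD OrderID,
         CustomerID,
         EmployeeID
    FROM Orders.qvd (qvd);

    Customers:
    LOAD CustomerID,
         RegionID
    FROM Customers.qvd (qvd);

    Employees:
    LOAD EmployeeID,
         RegionID AS EmployeeRegionID   // renamed so the Customers-Employees link (and the loop) disappears
    FROM Employees.qvd (qvd);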

 

Conclusion

 

Performance is very important, and in the case of Qlikview especially so, since Qlikview relies heavily on memory. It is crucial that we do not build models that bring the machine to its knees. A lean, memory-efficient model can be the difference between the business making crucial decisions on time and being delayed by the performance of an ill-designed model.