This document lists ideas to improve user response time in relation to worklists.
Worklist calculation on large ERP5 production systems is one of the most I/O intensive calculations. Because worklists are calculated at login time, it can lead to a slow login.
A worklist is calculated by counting the number of documents which match certain criteria.
SELECT COUNT(DISTINCT uid)
FROM catalog
WHERE security_uid IN (a list of security UIDs)
  AND validation_state = 'draft'
  AND whatever else defines the predicate
Each calculation is relatively fast (ex. 0.1 second). But with 100+ worklists in ERP5 and hundreds of concurrent users, worklist calculation can make the SQL backend of ERP5 the main bottleneck.
The parameters to this problem are:
Depending on these figures, the approach to increase speed can be different. For this reason, any optimization in ERP5 core should be flexible enough to preserve both a flexible
declarative approach to defining worklists and a flexible optimization approach.
If P or S are small (ex. < 100), then each individual worklist calculation will be extremely fast.
If P or S are big (ex. > 100,000), C is small (ex. 10) and G is big (ex. 1000), the individual worklist calculation can be quite slow.
If N is large (ex. > 1000), there is no hope to be able to precalculate or cache something very easily.
Here are some typical cases:
This goes against the ERP5 principle "never store results of calculation". So, forget it.
The current optimization of worklist calculation is based on the idea of precalculating certain values of worklists for certain groups of users and predicates (ex. draft state
for ZPPB security group). Final worklist calculation is then done in near real time.
Still, this fails in some cases: it apparently does not work for all kinds of predicates, and in situations where P, S and G are large, it fails to provide good performance.
Worklists could be calculated in advance.
This is already what we have with a caching approach (5 minutes). It could easily be improved with a caching policy similar to what is used for the web: display the cached content
(even if stale), then trigger a cache content update.
The main issue is thus: which cache content update should be triggered, for which user, and when.
If the number of users is very small (< 10), it does not really matter. But if it is large, then this becomes more complex. A good algorithm for predictive cache update could use
the following principles:
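The "display stale content, refresh in the background" policy above can be sketched as follows. This is a minimal illustration, not ERP5 code; the names (`WorklistCache`, `compute`) are hypothetical:

```python
import threading
import time

class WorklistCache:
    """Serve cached worklist counts; refresh stale entries in the background."""

    def __init__(self, compute, ttl=300):
        self.compute = compute      # the expensive worklist calculation
        self.ttl = ttl              # freshness window in seconds (5 minutes)
        self._store = {}            # user_id -> (timestamp, counts)
        self._lock = threading.Lock()
        self._refreshing = set()

    def get(self, user_id):
        with self._lock:
            entry = self._store.get(user_id)
        if entry is None:
            # First access ever: compute synchronously once.
            counts = self.compute(user_id)
            with self._lock:
                self._store[user_id] = (time.time(), counts)
            return counts
        ts, counts = entry
        if time.time() - ts > self.ttl:
            # Stale: return the old value immediately, update in the background.
            self._refresh_async(user_id)
        return counts

    def _refresh_async(self, user_id):
        with self._lock:
            if user_id in self._refreshing:
                return              # an update is already running
            self._refreshing.add(user_id)

        def worker():
            counts = self.compute(user_id)
            with self._lock:
                self._store[user_id] = (time.time(), counts)
                self._refreshing.discard(user_id)

        threading.Thread(target=worker, daemon=True).start()
```

The key property is that, after the first login, no user request ever waits for the expensive calculation.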
When rendering the front page of ERP5, do not calculate worklist.
Either display an empty menu and let calculation happen later (when the user clicks on the menu, through AJAX), or display what was previously calculated (non-AJAX).
Whenever user clicks on menu, gather most recent worklist.
Whenever a calculation is finished, push the menu values (using websockets, long polling, hookbox), as happens for example on Facebook (worklists are updated while staying on the same
page). Change the color of the menu to show that something happened.
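The server side of such a push (in the long-polling variant) can be sketched as below. This is a hypothetical illustration; `MenuChannel` is not an existing ERP5 or hookbox API:

```python
import queue

class MenuChannel:
    """Long-polling channel: a client request blocks until a new worklist
    snapshot is pushed for that user, or times out and retries."""

    def __init__(self):
        self._queues = {}  # user_id -> Queue of worklist snapshots

    def push(self, user_id, counts):
        # Called when a background worklist calculation finishes.
        self._queues.setdefault(user_id, queue.Queue()).put(counts)

    def poll(self, user_id, timeout=30):
        # Called by the browser's long-poll request.
        q = self._queues.setdefault(user_id, queue.Queue())
        try:
            return q.get(timeout=timeout)
        except queue.Empty:
            return None  # client simply re-issues the long-poll request
```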
Each time a workflow transition happens, trigger the update of all worklists for the relevant users.
The key here is to make sure that not too many updates are triggered. Imagine for example that N=1,000,000. The following tactics can be considered:
Keep a dedicated database or cluster to calculate worklists. It does not matter if it is not in perfect sync with the main catalog database. Multiple databases can be used in parallel.
Instead of triggering a worklist cache update for a given user, try to guess what the next value of the worklist should be. For this:
Of course, this will not work for worklists which are based on a date value (ex. all events which must be processed within 24 hours). So, it can only be considered as a kind of heuristic.
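The "guess the next value" idea above amounts to incremental counter maintenance. A minimal sketch, assuming worklist predicates can be reduced to a (state, security group) pair for illustration:

```python
def apply_transition(counters, security_group, old_state, new_state):
    """Adjust cached worklist counters after one workflow transition,
    instead of recomputing the worklist with a full SQL COUNT.

    counters: dict mapping (state, security_group) -> document count.
    """
    old_key = (old_state, security_group)
    new_key = (new_state, security_group)
    if counters.get(old_key, 0) > 0:
        counters[old_key] -= 1          # one fewer document in the old state
    counters[new_key] = counters.get(new_key, 0) + 1  # one more in the new state
    return counters
```

Real predicates are richer than a single state, so such counters can drift and need periodic reconciliation against the database; hence the "heuristic" caveat.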
By tracking the result of content cache updates, and tracking the "time to click" of users reading the content of a worklist, automatically decide the relevant update frequency.
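One possible way to turn "time to click" samples into an update frequency, sketched with hypothetical bounds (60 s minimum, 1 hour maximum):

```python
def suggest_refresh_interval(times_to_click, minimum=60, maximum=3600):
    """Return a per-user refresh interval (in seconds) derived from past
    time-to-click samples: users who react quickly get fresher worklists."""
    if not times_to_click:
        return maximum  # no signal yet: refresh rarely
    average = sum(times_to_click) / len(times_to_click)
    # Refresh roughly twice as often as the user reacts, within bounds.
    return max(minimum, min(maximum, average / 2))
```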
This is especially useful for social networks. Generate an acknowledgement document for each document of a worklist and for each user. Then calculate the worklist based on the documents which have not been acknowledged yet.
This is useful to track whether a user checked or not his worklist.
Old acknowledgements can be erased.
NOTE: acknowledgement documents should be part of portal_acknowledgements, not of event_module
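The acknowledgement-based calculation above reduces to a set difference. In ERP5 this would be a SQL join against the portal_acknowledgements documents; plain Python sets are used here purely for illustration:

```python
def unacknowledged(worklist_uids, acknowledged_uids):
    """Return the worklist entries this user has not yet acknowledged.

    worklist_uids: uids of all documents matching the worklist predicate.
    acknowledged_uids: uids for which an acknowledgement document exists
    for this user (old acknowledgements can simply be deleted).
    """
    return sorted(set(worklist_uids) - set(acknowledged_uids))
```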
Appropriate worklist optimization management must be based on a proper API and plugin architecture, a bit like portal_caches.