- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 10.10, 11.1-SNAPSHOT
- Fix Version/s: 10.10-HF10, 11.1, 2021.0
- Component/s: Streams
- Sprint: nxplatform 11.1.13
- Story Points: 2

Since NXP-25312 (Nuxeo 10.3) computations have a retry policy.
The policy for the audit log writer computation is:
maxRetries="3" delay="1s" maxDelay="10s" continueOnFailure="false"
This means 3 retries with exponential backoff, starting at 1s and capped at 10s, so the delays are 1, 2 and 4 seconds, or:
- t: failure
- t+1s: retry 1
- t+3s: retry 2
- t+7s: retry 3
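The timeline above can be reproduced with a small sketch. It assumes the simple model described in this ticket (the delay doubles after each attempt and is capped at maxDelay); the function names are hypothetical and the actual policy implementation may differ, for instance by adding jitter.

```python
from itertools import accumulate


def retry_delays(max_retries, delay_s, max_delay_s):
    """Delay in seconds before each retry: doubles each time, capped at max_delay_s.

    Assumed model taken from the ticket description, not from the Nuxeo source.
    """
    return [min(delay_s * 2 ** i, max_delay_s) for i in range(max_retries)]


def retry_offsets(max_retries, delay_s, max_delay_s):
    """Seconds elapsed since the initial failure when each retry fires."""
    return list(accumulate(retry_delays(max_retries, delay_s, max_delay_s)))


# Current audit log writer policy: maxRetries="3" delay="1s" maxDelay="10s"
print(retry_delays(3, 1, 10))   # [1, 2, 4]
print(retry_offsets(3, 1, 10))  # [1, 3, 7]
```

The last offset is the longest outage the computation can ride out before it gives up.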
With this configuration, the processor tolerates an outage of at most 7 seconds. After that, the processor stops and a manual restart is required to resume activity.
There is no good reason not to tolerate a 15-minute outage by default.
This could be achieved with a policy like this:
<policy name="AuditLogWriter" ... maxRetries="20" delay="1s" maxDelay="60s" continueOnFailure="false" />
Time between retries for the first 10 retries:
1, 2, 4, 8, 16, 32, 60, 60, 60, 60 -> 5min03
then the remaining 10 retries: 10*60 -> 10min
tolerance: 15min03
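As a rough check of the proposed values, the total tolerance can be computed by summing the capped delays. This is a sketch under the same assumption as the ticket (doubling delay capped at maxDelay, no jitter); `tolerance_seconds` is a hypothetical helper, not part of Nuxeo.

```python
def tolerance_seconds(max_retries, delay_s, max_delay_s):
    """Sum of all retry delays: the longest outage survived before giving up.

    Assumes the delay doubles after each retry and is capped at max_delay_s.
    """
    total = 0
    delay = delay_s
    for _ in range(max_retries):
        total += min(delay, max_delay_s)
        delay *= 2
    return total


# Proposed policy: maxRetries="20" delay="1s" maxDelay="60s"
first_ten = tolerance_seconds(10, 1, 60)
overall = tolerance_seconds(20, 1, 60)

minutes, seconds = divmod(overall, 60)
print(f"first 10 retries: {first_ten}s, overall: {minutes}min{seconds:02d}")
```

Under this model the proposed policy covers an outage of about 15 minutes, compared with 7 seconds for the current one.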
The Elasticsearch re-index bulk action can also benefit from this retry policy, so that it tolerates Elasticsearch failures.
Note that Nuxeo 9.10 has no retry policy mechanism: a failing computation stops the processing and requires a manual restart.