- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 10.3
- Sprint: nxcore 10.3.8
- Story Points: 2
Since NXP-25313 and NXP-25312, batching and retry mechanisms are available at the computation level.
These should be used for the AuditLogWriter.
Possible retry policy:
new RetryPolicy().withBackoff(1, TimeUnit.SECONDS, 65)
This should retry 7 times:
- 0s: failure, delay: 1s
- 1s: retry 1, delay: 2s
- 3s: retry 2, delay: 4s
- 7s: retry 3, delay: 8s
- 15s: retry 4, delay: 16s
- 31s: retry 5, delay: 32s
- 63s: retry 6, delay: 64s
- 127s: retry 7, end
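The schedule above can be reproduced with a small plain-Java sketch. It does not use the Nuxeo or retry-library API: `BackoffSchedule`, `delays` and `totalSeconds` are hypothetical names, and it assumes that the delay doubles after each attempt and that retries stop once the next delay would exceed the 65s cap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nuxeo: it only reproduces the arithmetic of the schedule.
class BackoffSchedule {

    /** Delays in seconds: start at initialDelay, double each time, stop once maxDelay would be exceeded. */
    static List<Long> delays(long initialDelay, long maxDelay) {
        if (initialDelay <= 0) {
            throw new IllegalArgumentException("initialDelay must be positive");
        }
        List<Long> result = new ArrayList<>();
        for (long d = initialDelay; d <= maxDelay; d *= 2) {
            result.add(d);
        }
        return result;
    }

    /** Seconds elapsed between the first failure and the last retry. */
    static long totalSeconds(long initialDelay, long maxDelay) {
        long total = 0;
        for (long d : delays(initialDelay, maxDelay)) {
            total += d;
        }
        return total;
    }

    public static void main(String[] args) {
        long elapsed = 0;
        int retry = 0;
        System.out.println("0s: failure");
        for (long d : delays(1, 65)) {
            elapsed += d;
            retry++;
            System.out.println(elapsed + "s: retry " + retry + " (after a " + d + "s delay)");
        }
    }
}
```

With an initial delay of 1s and a cap of 65s, this gives 7 delays (1, 2, 4, 8, 16, 32, 64) that add up to 127s.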
The computation therefore tolerates a network or audit backend outage of about 2 minutes (127s).
After that, the computation stops and is run on another node.
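As an illustration of this "retry, then give up" behavior, here is a minimal plain-Java sketch. It is not the Nuxeo computation code: `RetryWithBackoff` and `call` are hypothetical names, delays are in milliseconds, and the loop assumes the same doubling rule as the schedule above. Once the retries are exhausted, the last exception is rethrown so that the caller can stop.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Nuxeo: a generic retry loop with exponential backoff.
class RetryWithBackoff {

    /**
     * Runs the action, retrying after each failure with a delay that doubles each time.
     * Gives up and rethrows the last exception once the next delay would exceed maxDelayMs.
     */
    static <T> T call(Callable<T> action, long initialDelayMs, long maxDelayMs) throws Exception {
        if (initialDelayMs <= 0) {
            throw new IllegalArgumentException("initialDelayMs must be positive");
        }
        long delay = initialDelayMs;
        while (true) {
            try {
                return action.call();
            } catch (Exception e) {
                if (delay > maxDelayMs) {
                    throw e; // retries exhausted: let the caller stop
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

Called with an initial delay of 1000 and a cap of 65000, an action that always fails is attempted 8 times (the first attempt plus 7 retries) before the exception propagates.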
So the cluster tolerates an outage of n*2min (n being the number of nodes) before failing completely to write to the audit backend.
Note, however, that log entries continue to be appended to the stream without any data loss.
Access to the audit backend then needs to be fixed manually, and a Nuxeo node needs to be restarted to resume processing.