  1. Nuxeo Platform
  2. NXP-31201

Improve retry tuning when Elastic is overloaded

    Details

    • Release Notes Summary:
      The retry delay has been increased to handle the case where Elastic is overloaded.
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #69, nxplatform #70
    • Story Points:
      3

      Description

      Since NXP-30841, back pressure is applied when the Elastic circuit breaker is activated.
      There are two different retry configurations:

      • for bulk indexing commands (Elastic bulk, not Nuxeo bulk): 3 retries in a range of [t+105s, t+315s] (t being the time of the first error).
      • for single indexing commands: 3 retries in a shorter range of [t+7s, t+21s].
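
The two ranges above can be reproduced with a small calculation. This is only a sketch of one possible reading, not the actual Nuxeo retry code: it assumes an exponential backoff with a factor of 2, a base delay of 15s for bulk commands and 1s for single commands, and a jitter that can stretch the total wait up to 3 times. None of these parameters is stated in this ticket; they are chosen because they match the numbers (15+30+60 = 105 and 1+2+4 = 7). The class and method names are hypothetical.

```java
import java.time.Duration;

/** Sketch: cumulative window of an exponential backoff. All parameters are assumptions. */
final class RetryWindow {

    /** Earliest time of the last retry: base + base*factor + ... over the given number of retries. */
    static Duration earliest(Duration base, int factor, int retries) {
        Duration total = Duration.ZERO;
        Duration delay = base;
        for (int i = 0; i < retries; i++) {
            total = total.plus(delay);
            delay = delay.multipliedBy(factor);
        }
        return total;
    }

    /** Latest time of the last retry, assuming jitter stretches the total by at most maxStretch. */
    static Duration latest(Duration base, int factor, int retries, int maxStretch) {
        return earliest(base, factor, retries).multipliedBy(maxStretch);
    }

    static String window(Duration base) {
        // Assumed: factor 2, 3 retries, total wait stretched up to 3x by jitter.
        return "[t+" + earliest(base, 2, 3).getSeconds() + "s, t+"
                + latest(base, 2, 3, 3).getSeconds() + "s]";
    }

    public static void main(String[] args) {
        System.out.println("bulk:   " + window(Duration.ofSeconds(15))); // bulk:   [t+105s, t+315s]
        System.out.println("single: " + window(Duration.ofSeconds(1)));  // single: [t+7s, t+21s]
    }
}
```

Under these assumptions, the single-command policy gives up at most 21s after the first error, which is short compared with the 105s minimum that the bulk policy waits.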

      The first retry configuration looks long enough to let the pressure go down and avoid an error, but the second one is too short: we have seen errors after all 3 retries.

      The shorter configuration was chosen because single indexing commands can be used in sync mode. We need to review whether sync and async commands can be distinguished and, if so, increase the backoff duration.
