  1. Nuxeo Platform
  2. NXP-26748

Don't drop works under load w/StreamWorkManager/Kafka/storestate.enabled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 9.10, 10.10
    • Fix Version/s: None
    • Component/s: Events / Works
    • Tags:
    • Backlog priority:
      600
    • Sprint:
      nxcore 11.1.4, nxcore 11.1.5
    • Story Points:
      3

      Description

      With StreamWorkManager, Kafka, and nuxeo.stream.work.storestate.enabled=true, the server log intermittently contains the WARN message emitted by WorkComputation at the line linked below. The message indicates that the StreamWorkManager consumer is dropping a Work:

      https://github.com/nuxeo/nuxeo/blob/5d81a4f7a93adedbd134e052e5f1c9039933c6fb/nuxeo-core/nuxeo-core-event/src/main/java/org/nuxeo/ecm/core/work/WorkComputation.java#L96

      This typically occurs when many Works are scheduled concurrently. In that case, the state held in the MongoDB KeyValueStore and the records in the Kafka stream used by StreamWorkManager appear to get out of sync. See:

      https://github.com/nuxeo/nuxeo/blob/5d81a4f7a93adedbd134e052e5f1c9039933c6fb/nuxeo-core/nuxeo-core-event/src/main/java/org/nuxeo/ecm/core/work/StreamWorkManager.java#L155

      https://github.com/nuxeo/nuxeo/blob/5d81a4f7a93adedbd134e052e5f1c9039933c6fb/nuxeo-core/nuxeo-core-event/src/main/java/org/nuxeo/ecm/core/work/StreamWorkManager.java#L160

      and

      https://github.com/nuxeo/nuxeo/blob/5d81a4f7a93adedbd134e052e5f1c9039933c6fb/nuxeo-core/nuxeo-core-event/src/main/java/org/nuxeo/ecm/core/work/WorkComputation.java#L96

      Specifically, we suspect that WorkComputation line 96 runs in one thread between the execution of StreamWorkManager lines 155 and 160 in another thread, which causes the Work to be dropped.
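
      The suspected interleaving can be modelled with a small standalone sketch. It uses no Nuxeo code and rests on assumptions about the linked lines: that StreamWorkManager line 155 appends the Work to the stream, that line 160 writes a SCHEDULED state to the key-value store, and that WorkComputation line 96 drops any Work whose stored state is not SCHEDULED. The class name WorkRaceSketch, the method isDropped, and the "SCHEDULED" string are hypothetical. A latch forces the unlucky ordering so that the result is deterministic.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of the suspected race; none of these names come from Nuxeo.
class WorkRaceSketch {

    // Returns true if the consumer would drop the Work in this model.
    static boolean isDropped(boolean writeStateFirst) {
        // Stand-in for the Kafka stream.
        BlockingQueue<String> stream = new LinkedBlockingQueue<>();
        // Stand-in for the MongoDB KeyValueStore holding Work states.
        Map<String, String> stateStore = new ConcurrentHashMap<>();
        CountDownLatch consumerChecked = new CountDownLatch(1);
        AtomicBoolean dropped = new AtomicBoolean(false);

        Thread consumer = new Thread(() -> {
            try {
                String workId = stream.take();
                // Assumed equivalent of the check at WorkComputation line 96:
                // a Work whose state is not SCHEDULED is dropped.
                dropped.set(!"SCHEDULED".equals(stateStore.get(workId)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                consumerChecked.countDown();
            }
        });
        consumer.start();

        String workId = "work-1";
        try {
            if (writeStateFirst) {
                // Contrast case: the state is visible before the record is.
                stateStore.put(workId, "SCHEDULED");
                stream.add(workId);
                consumerChecked.await();
            } else {
                stream.add(workId);                  // assumed line 155: append to the stream
                consumerChecked.await();             // force the consumer to check in between
                stateStore.put(workId, "SCHEDULED"); // assumed line 160: store the state
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return dropped.get();
    }

    public static void main(String[] args) {
        System.out.println("append then state: dropped=" + isDropped(false));
        System.out.println("state then append: dropped=" + isDropped(true));
    }
}
```

      In this model, isDropped(false) returns true and isDropped(true) returns false. The second ordering is included only to show that the outcome depends on when the state becomes visible relative to the record; it is not a proposed fix for Nuxeo.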


                Time Tracking

                Estimated: Not Specified
                Remaining: 0m
                Logged: 4h
