[NXP-29705] Possible work pool termination with StreamWorkManager - Nuxeo Issue Tracker

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 10.10-HF29
Fix Version/s: 10.10-HF34, 11.3, 2021.0
Component/s: Events / Works, Streams

Release Notes Summary: Work pool terminations are better handled with StreamWorkManager.
Tags: SupCom, nxplatform
Backlog priority: 600
Team: PLATFORM
Sprint: nxplatform #18
Story Points: 3

Description

When using the StreamWorkManager, there seems to be a code path where failures can silently terminate the Work Queue thread pool, without any error being logged.

This has been observed with indexing Work (ScrollingIndexingWorker) that runs a MongoDB query which times out (SocketTimeout). The thread that runs the Work as a computation terminates without logging the error, and the record is re-assigned to other threads, where the same behavior is seen until there are no more indexing threads in the cluster.
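
The failure mode described above can be reproduced in isolation with plain java.util.concurrent. The sketch below is not Nuxeo code: the class SilentRunnerDemo, the process method, the in-memory queue standing in for the stream, and the counter standing in for the DLQ are all hypothetical. It assumes the runners are started with ExecutorService.submit(), in which case an exception that escapes the task is stored in the returned Future and is never logged unless something calls get() on it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SilentRunnerDemo {

    // Hypothetical stand-in for a Work whose database query times out.
    static void process(String record) {
        throw new IllegalStateException("simulated timeout on " + record);
    }

    // Starts `runners` long-running loops that share one failing record.
    // Returns {number of runners that ended on an exception, number of dead-lettered records}.
    static int[] run(int runners, boolean guarded) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("record-1");
        AtomicInteger crashed = new AtomicInteger();
        AtomicInteger deadLettered = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(runners);
        for (int i = 0; i < runners; i++) {
            // The returned Future is ignored, so an escaping exception is never reported.
            pool.submit(() -> {
                String record;
                // A runner stops once the queue has been idle for 500 ms (demo only).
                while ((record = queue.poll(500, TimeUnit.MILLISECONDS)) != null) {
                    try {
                        process(record);
                    } catch (RuntimeException e) {
                        if (guarded) {
                            // Guarded variant: trace the error, park the record, keep running.
                            System.err.println("Work failed, sent to DLQ: " + e.getMessage());
                            deadLettered.incrementAndGet();
                        } else {
                            // Simulates the unacknowledged record being re-assigned,
                            // then lets the exception end this runner.
                            queue.add(record);
                            crashed.incrementAndGet();
                            throw e;
                        }
                    }
                }
                return null;
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] {crashed.get(), deadLettered.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] unguarded = run(3, false);
        System.out.println("unguarded: crashed=" + unguarded[0] + " deadLettered=" + unguarded[1]);
        int[] guarded = run(3, true);
        System.out.println("guarded: crashed=" + guarded[0] + " deadLettered=" + guarded[1]);
    }
}
```

In the unguarded run, the single failing record is handed from runner to runner until none is left, and nothing is written to the log. In the guarded run, the failure is traced once, the record is counted as dead-lettered, and the runners keep polling until the queue is idle.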

It is not clear whether failed Works are put in the DLQ stream.