[NXP-29932] Stream Introspection - Nuxeo Issue Tracker

Details

Type: Epic
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Streams

Tags:
- nxplatform
Team(s):

PLATFORM
Completion Level (0 to 5):
5

Description

The goal is to have an overview of stream processing at the cluster level in order to:

- quickly understand where there is a bottleneck or problem without having to SSH into the nodes
- decide how to scale out, scale down, or tune the existing configuration
- build a dedicated scaling metric that can be used by a horizontal auto scaler (HPA).
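
As an illustration of the third goal, here is a minimal sketch of how a scaling metric could be derived from consumer lag. Everything in it is an assumption made for the example: the `GroupStats` fields, the "time to drain the lag" heuristic, and the target value are hypothetical and are not part of Nuxeo. Only the proportional rule in `desired_nodes` follows the formula documented for the Kubernetes HPA (desired = ceil(current * currentMetric / targetMetric)), here with the metric already normalized so that 1.0 means "on target".

```python
import math
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Hypothetical per-consumer-group figures; field names are illustrative."""
    name: str
    lag: int      # records not yet processed
    rate: float   # records processed per second, cluster wide


def drain_seconds(g: GroupStats) -> float:
    """Estimated time needed to consume the current lag at the current rate."""
    if g.lag == 0:
        return 0.0
    if g.rate <= 0:
        return math.inf  # pending records but nothing is being processed
    return g.lag / g.rate


def scaling_metric(groups, target_seconds: float) -> float:
    """Worst drain time across groups, relative to the target (1.0 = on target)."""
    return max((drain_seconds(g) / target_seconds for g in groups), default=0.0)


def desired_nodes(current_nodes: int, metric: float, max_nodes: int) -> int:
    """Proportional HPA-style rule, clamped to the range [1, max_nodes]."""
    if math.isinf(metric):
        return max_nodes
    return max(1, min(max_nodes, math.ceil(current_nodes * metric)))
```

In this sketch `max_nodes` is left to the operator. A natural upper bound is related to the number of partitions of the busiest stream, since consumers beyond the partition count typically stay idle.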

Because Nuxeo Stream is used at a low level, this overview will cover all asynchronous processing: async listeners, the WorkManager, the Bulk Service, and of course Nuxeo Stream itself (when using Kafka).

We want a representation at the cluster level that includes:

- all streams in use, with their number of partitions
- all Nuxeo nodes that participate in the async processing, with the number of threads for each computation
- the lag and latency for each consumer group
- computation failures
- possibly, for each node: CPU usage and JVM memory pressure
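
To make the list above concrete, the following sketch shows one possible shape for such a representation. It is not the format used by Nuxeo: all class names, field names, and units (`latency_ms`, `cpu_usage` as a fraction, and so on) are assumptions chosen for the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class StreamInfo:
    name: str
    partitions: int


@dataclass
class NodeInfo:
    node_id: str
    threads: Dict[str, int]                  # computation name -> thread count
    cpu_usage: Optional[float] = None        # optional, fraction between 0 and 1
    memory_pressure: Optional[float] = None  # optional, JVM heap used / max


@dataclass
class GroupInfo:
    name: str          # consumer group, usually one per computation
    lag: int           # records pending
    latency_ms: int    # age of the oldest unprocessed record (assumed unit)
    failures: int = 0  # computation failures reported so far


@dataclass
class ClusterView:
    streams: List[StreamInfo] = field(default_factory=list)
    nodes: List[NodeInfo] = field(default_factory=list)
    groups: List[GroupInfo] = field(default_factory=list)

    def total_threads(self, computation: str) -> int:
        """Threads allocated to a computation across all nodes."""
        return sum(n.threads.get(computation, 0) for n in self.nodes)

    def to_json(self) -> str:
        """Serialize the whole view, for instance as a REST response body."""
        return json.dumps(asdict(self), indent=2)
```

Comparing `total_threads` for a computation with the partition count of its input stream is one quick way to see whether adding nodes could increase parallelism at all.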

The idea is to report all processor topologies on node start (~~NXP-29934~~) and to create a specific stream metrics reporter (~~NXP-29933~~) that reports on activity. A computation will aggregate both streams and build a representation that will be exposed through REST (~~NXP-29935~~).
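
The aggregation step described above can be sketched as a plain function that folds two record streams into a single view. This is only an illustration of the idea, not the actual computation: the record layouts (`node`, `threads`, `group`, `ts`, `lag`, `latency_ms`) are hypothetical, and in-memory lists stand in for the two streams.

```python
def aggregate(topology_records, metric_records):
    """Fold topology and metric records into one cluster-level dictionary."""
    view = {"nodes": {}, "groups": {}}

    for rec in topology_records:
        # A node reports its topology on start; a restart overwrites the entry.
        view["nodes"][rec["node"]] = {"threads": dict(rec["threads"])}

    for rec in sorted(metric_records, key=lambda r: r["ts"]):
        # Processed in timestamp order, so the latest figures per group win.
        view["groups"][rec["group"]] = {
            "lag": rec["lag"],
            "latency_ms": rec["latency_ms"],
            "ts": rec["ts"],
        }

    return view
```

Keeping only the most recent metric per consumer group keeps the view small. A real implementation would also need to expire nodes that have stopped reporting.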

Issue Links

is related to

NXP-29945 Provide a tool to analyze the content of the default WorkManager queue

Resolved

People

Assignee:

Benoit Delbosc

Reporter:

Benoit Delbosc

Participants:

Benoit Delbosc

Owner:

Benoit Delbosc

Votes:

0

Watchers:

1

Dates

Created:

2020-12-04 07:55

Updated:

2021-01-12 10:34