Nuxeo Platform / NXP-29676

Stream Scalability #2



    • Type: Epic
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 11.x
    • Fix Version/s: None
    • Component/s: Streams
    • Team(s):


      This epic follows up on the unfinished epic NXP-28001, taking a different approach:
      we want to improve the scalability of existing stream processing.

      The use case is:
      A mass processing run based on a Bulk Action is required, for instance recomputing all the thumbnails for the entire repository.
      The existing Bulk Action configuration can support daily usage but will be too slow for this mass processing. We want to allocate new resources to speed up the processing without affecting normal usage of the application. We also want to avoid processing so slow that it takes longer than the Kafka retention period (7 days).
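
To make the retention constraint concrete, here is a rough back-of-the-envelope sketch. Only the 7-day retention comes from the description above; the repository size, the per-thread throughput and the function names are illustrative assumptions, not measured values or Nuxeo APIs.

```python
import math

SECONDS_PER_DAY = 86_400
KAFKA_RETENTION_DAYS = 7  # retention stated in the issue description


def processing_days(total_docs, docs_per_sec_per_thread, threads):
    """Estimated duration in days, assuming throughput scales linearly with threads."""
    seconds = total_docs / (docs_per_sec_per_thread * threads)
    return seconds / SECONDS_PER_DAY


def threads_needed(total_docs, docs_per_sec_per_thread, max_days):
    """Smallest number of threads that finishes within max_days (same linear assumption)."""
    budget_seconds = max_days * SECONDS_PER_DAY
    return math.ceil(total_docs / (docs_per_sec_per_thread * budget_seconds))


# Hypothetical figures: 100 million documents, 5 documents/s per thread.
days = processing_days(100_000_000, 5, threads=4)
print(f"4 threads: {days:.1f} days, exceeds retention: {days > KAFKA_RETENTION_DAYS}")
print("threads to finish in 2 days:", threads_needed(100_000_000, 5, max_days=2))
```

With these assumed figures, the everyday configuration would run far past the retention window, which is the situation the dedicated resources are meant to avoid.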

      The possible steps to handle this case are:

      • Increase the number of partitions of the Kafka topic
      • Start dedicated worker nodes using a special profile; AWS spot instances could be a good fit
      • Run the bulk action
      • Start more worker nodes if needed
      • Shut down the dedicated worker nodes on completion
      • Have a way to mark the topic to be dropped during the next cluster restart, in order to reduce the number of partitions
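
The first step matters because, within a Kafka consumer group, a partition is consumed by at most one consumer at a time, so the partition count caps the useful concurrency. Kafka also only allows the partition count of a topic to grow, which is why the last step drops the topic instead of shrinking it. The sketch below illustrates the cap with a simple round-robin distribution; it is an assumption-laden model with hypothetical function names, not the assignment strategy Kafka or Nuxeo actually uses.

```python
def assign_partitions(partitions, nodes, threads_per_node):
    """Distribute partitions round-robin over (node, thread) consumers.

    Simplified model: each partition goes to exactly one consumer.
    """
    consumers = [(n, t) for n in range(nodes) for t in range(threads_per_node)]
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


def idle_consumers(assignment):
    """Consumers that received no partition and therefore do no work."""
    return [c for c, parts in assignment.items() if not parts]


# 3 dedicated nodes x 2 threads against a 4-partition topic: 2 threads sit idle.
before = assign_partitions(partitions=4, nodes=3, threads_per_node=2)
print("idle with 4 partitions:", len(idle_consumers(before)))

# After increasing the topic to 12 partitions every thread has work.
after = assign_partitions(partitions=12, nodes=3, threads_per_node=2)
print("idle with 12 partitions:", len(idle_consumers(after)))
```

In this model, starting more worker nodes only helps while the total number of consumer threads stays at or below the number of partitions.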

      Another possibility is to start these worker nodes with a dedicated stream configuration for the bulk service.
      The bulk command needs to be routed to these worker nodes, while the bulk status remains available from any node in the cluster.
      Once the processing has terminated, the nodes can be shut down and the topics deleted.
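
The routing idea in this alternative could look roughly like the sketch below. Everything in it is hypothetical: the stream names, the shape of the command dictionary and the function are placeholders for illustration, not the Nuxeo bulk service API.

```python
DEFAULT_STREAM = "bulk-command"          # hypothetical: stream read by the regular nodes
DEDICATED_STREAM = "bulk-command-mass"   # hypothetical: stream read only by the dedicated workers


def route_command(command, dedicated_actions):
    """Return the name of the stream a bulk command should be appended to."""
    if command["action"] in dedicated_actions:
        return DEDICATED_STREAM
    return DEFAULT_STREAM


# Only the mass thumbnail recomputation is sent to the dedicated workers.
dedicated = {"recomputeThumbnails"}
print(route_command({"id": "cmd-1", "action": "recomputeThumbnails"}, dedicated))
print(route_command({"id": "cmd-2", "action": "setProperties"}, dedicated))
```

Because only the dedicated workers would consume the second stream, daily bulk actions keep their usual resources, and deleting that stream's topics after completion leaves the regular topics untouched.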


          Issue Links



              • Assignee:
                bdelbosc Benoit Delbosc
              • Votes:
                1
              • Watchers:
                3


                • Created: