  1. Nuxeo Platform
  2. NXP-30536

Provide options at nuxeo.conf level to tune Bulk Re-indexing


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.10, 2021.6
    • Fix Version/s: 10.10-HF52, 2021.8
    • Component/s: Bulk, Elasticsearch
    • Release Notes Summary:
      Options to tune bulk Elasticsearch reindexing are available in nuxeo.conf.
    • Upgrade notes:

      You now have the following options in nuxeo.conf

      # Bulk Index action, fetching content (bulk/index computation)
      elasticsearch.bulk.index.fetch.concurrency=4
      elasticsearch.bulk.index.fetch.partitions=12
      # Bulk Index action, submitting requests to elastic (bulk/bulkIndex computation)
      elasticsearch.bulk.index.submit.concurrency=2
      elasticsearch.bulk.index.submit.partitions=8
      

      Here, concurrency is the number of threads per node, and the partitions value caps the total concurrency at the cluster level.
      Note that the partitions value is taken into account only when the Kafka topic is created.
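
      The relationship between the two settings can be illustrated with a small calculation. This is only a sketch of the rule stated above (threads per node, capped by the partition count); the function name and the node counts are hypothetical and not part of Nuxeo.

```python
def effective_threads(nodes: int, concurrency: int, partitions: int) -> int:
    """Threads that can work at the same time across the cluster.

    Each node runs `concurrency` threads, but the partition count
    caps the total at the cluster level.
    """
    return min(nodes * concurrency, partitions)


# Values taken from the nuxeo.conf options above.
fetch = {"concurrency": 4, "partitions": 12}
submit = {"concurrency": 2, "partitions": 8}

# Node counts are hypothetical, chosen to show where the cap applies.
for nodes in (1, 2, 3, 5):
    print(
        f"nodes={nodes} "
        f"fetch={effective_threads(nodes, **fetch)} "
        f"submit={effective_threads(nodes, **submit)}"
    )
```

      Under these assumptions, adding nodes beyond 3 (fetch) or 4 (submit) would not add working threads, because the partition count is already reached.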

    • Team:
      PLATFORM
    • Sprint:
      nxplatform #42, nxplatform #43
    • Story Points:
      0

      Description

      With Nuxeo works (elasticsearchIndexing), we can set the number of indexing threads through a configuration parameter: elasticsearch.indexing.maxThreads=4

      On Cloud, we cannot build our own contribution, so we need an equivalent parameter for bulk-index. This would let us raise the thread count to get more indexing throughput on large repositories when using the Bulk Action Framework (BAF).

      We can already control the partition count, but we also need to be able to adjust the number of threads.

      This is critical because we have large re-indexing tasks pending in the next few weeks.
