- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 5.8.0-HF24, 6.0
- Fix Version/s: 5.8.0-HF28, 6.0-HF02, 7.1
- Component/s: Elasticsearch
- Epic Link:
- Tags:
- Sprint: Sprint RepoTeam 7.1-1, Sprint RepoTeam 7.1-2
When reindexing the repository, we recursively index documents starting from the root document.
Some documents have no parent id and are therefore never reached, so they are not reindexed (Tag, Tagging, DefaultRelation, versions ...)
Reindexing from a root doc id is useful for updating part of the repository,
but to reindex the whole repository we should proceed differently.
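To illustrate why a walk from the root misses these documents, here is a minimal Python sketch. The in-memory `docs` mapping and the ids in it are hypothetical stand-ins for the repository; this is not Nuxeo code.

```python
from collections import defaultdict

# Hypothetical repository: document id -> parent id (None means no parent).
docs = {
    "root": None,
    "folder": "root",
    "file": "folder",
    "tag-1": None,      # e.g. a Tag document, stored without a parent
    "version-1": None,  # e.g. a version, stored without a parent
}


def reachable_from(root_id, parents):
    """Return the ids visited by walking the tree down from root_id."""
    # Build a parent -> children index from the id -> parent mapping.
    children = defaultdict(list)
    for doc_id, parent_id in parents.items():
        if parent_id is not None:
            children[parent_id].append(doc_id)

    # Iterative form of the recursive descent.
    visited = []
    stack = [root_id]
    while stack:
        current = stack.pop()
        visited.append(current)
        stack.extend(children[current])
    return visited


indexed = set(reachable_from("root", docs))
missed = sorted(set(docs) - indexed)
```

In this sketch `missed` contains `tag-1` and `version-1`: they have no parent, so no walk that starts at the root can reach them.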
New implementation:
A Scrolling worker gets the list of document ids matching an NXQL query. This worker splits the list into buckets and launches a Bucket worker for each bucket.
The Bucket worker submits its documents to Elasticsearch in bulk mode.
The default bucket size is 500 document ids; this can be tuned using elasticsearch.reindex.bucketReadSize
The default number of documents per bulk command is 50; this can be tuned using elasticsearch.reindex.bucketWriteSize
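The two-level batching described above can be sketched as follows. This is a simplified Python illustration, not the actual implementation: the function names, the `submit_bulk` callback, and the sequential execution are assumptions made for the example (the ticket describes separate workers), and only the two default sizes come from the description.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

BUCKET_READ_SIZE = 500  # default of elasticsearch.reindex.bucketReadSize
BUCKET_WRITE_SIZE = 50  # default of elasticsearch.reindex.bucketWriteSize


def chunks(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def bucket_worker(
    bucket: Sequence[str],
    submit_bulk: Callable[[List[str]], None],
    write_size: int = BUCKET_WRITE_SIZE,
) -> int:
    """Submit one bucket in bulk commands; return how many were sent."""
    sent = 0
    for batch in chunks(bucket, write_size):
        submit_bulk(batch)  # hypothetical stand-in for the bulk request
        sent += 1
    return sent


def scrolling_worker(
    doc_ids: Sequence[str],
    submit_bulk: Callable[[List[str]], None],
    read_size: int = BUCKET_READ_SIZE,
    write_size: int = BUCKET_WRITE_SIZE,
) -> Tuple[int, int]:
    """Split the ids matching the query into buckets and process each one.

    Returns (number of buckets, number of bulk commands). Buckets are
    processed one after another here to keep the sketch short.
    """
    buckets = 0
    bulk_commands = 0
    for bucket in chunks(doc_ids, read_size):
        buckets += 1
        bulk_commands += bucket_worker(bucket, submit_bulk, write_size)
    return buckets, bulk_commands
```

With the default sizes, 1,200 matching ids give 3 buckets (500, 500 and 200 ids) and 24 bulk commands (10 + 10 + 4).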