Currently, a single Document with corrupted data is enough to halt the Elasticsearch.BulkIndex operation, requiring manual recovery steps before indexing can proceed. Bad records should instead be skipped (with basic information and the Document UUID logged for reference) so that the majority of Documents can be re-indexed properly.
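A minimal sketch of the desired skip-and-log behavior, in Python for illustration only. The `bulk_index` and `index_one` names, and the document shapes, are hypothetical stand-ins, not the actual Nuxeo or Elasticsearch API:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bulk-index")

def bulk_index(docs, index_one):
    """Index each document, skipping (and logging) any that fail.

    `docs` yields (uuid, data) pairs and `index_one` is whatever call
    performs the actual indexing -- both hypothetical stand-ins.
    Returns the UUIDs of skipped documents for follow-up troubleshooting.
    """
    skipped = []
    for uuid, data in docs:
        try:
            index_one(uuid, data)
        except Exception as exc:  # bad record: log the UUID and keep going
            log.warning("Skipping document %s: %s", uuid, exc)
            skipped.append(uuid)
    return skipped

# Example: a stand-in indexer that rejects a corrupted dc:modified value.
def fake_index(uuid, data):
    if not isinstance(data.get("dc:modified"), float):
        raise ValueError("dc:modified is not a timestamp")

docs = [
    ("a", {"dc:modified": 1.0}),
    ("b", {"dc:modified": "oops"}),  # corrupted record
    ("c", {"dc:modified": 2.0}),
]
```

With this shape, one bad record ("b") is logged and skipped while "a" and "c" are still indexed, rather than the whole operation halting.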
- Set up a Nuxeo instance with a MongoDB backend, with several Documents created and indexed (e.g. using the nuxeo-showcase-content addon).
- In MongoDB, corrupt a schema property of a Document. For example, change the value of dc:modified to a String-typed value.
- Attempt repository re-indexing using the Elasticsearch.BulkIndex operation.
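The corruption in step 2 can be sketched as a MongoDB update. The `ecm:id` lookup field, the `nuxeo`/`default` database and collection names, and the placeholder value are assumptions for illustration, not verified against the stored schema:

```python
def corruption_update(doc_uuid):
    """Build a MongoDB filter/update pair that replaces dc:modified
    (normally a Date) with a String, simulating a corrupted record.

    The `ecm:id` field name is an assumption about how Documents are
    keyed in the repository collection.
    """
    filt = {"ecm:id": doc_uuid}
    update = {"$set": {"dc:modified": "not-a-date"}}
    return filt, update

# With pymongo against a running instance (not executed here), e.g.:
# client["nuxeo"]["default"].update_one(*corruption_update("<uuid>"))
```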
Expected result: Documents with bad data are skipped during indexing, with basic info and the UUID logged for follow-up troubleshooting, allowing the rest of the repository to be re-indexed.
Actual result: computation failures from bad records prevent the rest of the operation from proceeding, resulting in large numbers of unindexed Documents and requiring manual recovery.