[NXP-24994] Don't crash Elasticsearch indexing when blob is missing - Nuxeo Issue Tracker

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 9.10
Fix Version/s: 9.10-HF11, 10.2
Component/s: Elasticsearch

Release Notes Summary:
Missing blobs are ignored during Elasticsearch indexing.
Tags:
- SupCom
- nxAI
Backlog priority:
800
Sprint:
nxAI Sprint 10.2.6, nxAI Sprint 10.2.7
Story Points:
3

Description

The most common indexing failure is caused by a missing blob, for example:

2018-04-19 16:52:55,756 ERROR [Nuxeo-Work-elasticSearchIndexing-9:1403013440772714.559777516] [org.nuxeo.ecm.core.work.AbstractWork] Exception during work: BucketIndexingWorker(333a3d49-719c-4ce9-8f6e-42c5e594d2a8..., /elasticSearchIndexing:1403012779396454.110412313, Progress(?%, ?/0), null)
org.nuxeo.ecm.core.api.PropertyException: Cannot get blob info for: ee008588d6d4e088ee4ce541d89fea7a6
	at org.nuxeo.ecm.core.storage.BaseDocument.getValueBlob(BaseDocument.java:484)
	at org.nuxeo.ecm.core.storage.BaseDocument.readComplexProperty(BaseDocument.java:666)
	at org.nuxeo.ecm.core.storage.BaseDocument.readComplexProperty(BaseDocument.java:681)
	at org.nuxeo.ecm.core.storage.sql.coremodel.SQLDocumentLive.readDocumentPart(SQLDocumentLive.java:172)
	at org.nuxeo.ecm.core.api.DocumentModelFactory.createDataModel(DocumentModelFactory.java:209)
	at org.nuxeo.ecm.core.api.AbstractSession.getDataModel(AbstractSession.java:2007)
	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.loadDataModel(DocumentModelImpl.java:438)
	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getDataModel(DocumentModelImpl.java:448)
	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getPart(DocumentModelImpl.java:1211)
	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getPropertyObjects(DocumentModelImpl.java:1237)
	at org.nuxeo.ecm.automation.jaxrs.io.documents.JsonESDocumentWriter.writeProperties(JsonESDocumentWriter.java:241)
	at org.nuxeo.ecm.automation.jaxrs.io.documents.JsonESDocumentWriter.writeSchemas(JsonESDocumentWriter.java:213)
	at org.nuxeo.ecm.automation.jaxrs.io.documents.JsonESDocumentWriter.writeDoc(JsonESDocumentWriter.java:109)
	at org.nuxeo.ecm.automation.jaxrs.io.documents.JsonESDocumentWriter.writeESDocument(JsonESDocumentWriter.java:236)
	at org.nuxeo.elasticsearch.core.ElasticSearchIndexingImpl.buildEsIndexingRequest(ElasticSearchIndexingImpl.java:411)
	at org.nuxeo.elasticsearch.core.ElasticSearchIndexingImpl.processBulkIndexCommands(ElasticSearchIndexingImpl.java:176)
	at org.nuxeo.elasticsearch.core.ElasticSearchIndexingImpl.indexNonRecursive(ElasticSearchIndexingImpl.java:145)
	at org.nuxeo.elasticsearch.ElasticSearchComponent.indexNonRecursive(ElasticSearchComponent.java:405)
	at org.nuxeo.elasticsearch.work.BucketIndexingWorker.doWork(BucketIndexingWorker.java:78)
	at org.nuxeo.elasticsearch.work.BaseIndexingWorker.work(BaseIndexingWorker.java:48)
	at org.nuxeo.ecm.core.work.AbstractWork.runWorkWithTransaction(AbstractWork.java:435)
	at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:355)
	at org.nuxeo.ecm.core.work.WorkHolder.run(WorkHolder.java:57)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Unknown binary: ee008588d6d4e088ee4ce541d89fea7a6
	at org.nuxeo.ecm.core.blob.binary.BinaryBlobProvider.readBlob(BinaryBlobProvider.java:100)
	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.readBlob(DocumentBlobManagerComponent.java:132)
	at org.nuxeo.ecm.core.storage.BaseDocument.getValueBlob(BaseDocument.java:482)
	... 25 more

This error stops the indexing.

Although a missing blob is a symptom of inconsistent data, some users would rather have the indexing continue with the remaining documents and finish properly.

Therefore it should be possible to:
1) log the current failure and its cause
2) continue the indexing
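
A minimal Java sketch of the two points above, assuming the failure can be caught per document while the bulk request is being built. This is not the actual Nuxeo change: `TolerantBulkIndexer`, `MissingBlobException` and `buildRequests` are hypothetical names, `MissingBlobException` stands in for the `PropertyException` in the trace, and the `writer` function stands in for the JSON document writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of the Nuxeo code base.
final class TolerantBulkIndexer {

    private static final Logger log = Logger.getLogger(TolerantBulkIndexer.class.getName());

    // Hypothetical stand-in for the PropertyException raised when a blob cannot be read.
    static class MissingBlobException extends RuntimeException {
        MissingBlobException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Builds one indexing request per document id.
     * A document whose blob is missing is logged with its cause and skipped,
     * so the remaining documents are still processed.
     *
     * @param docIds  ids of the documents to index
     * @param writer  serializes a document to its JSON source (assumed to throw on a missing blob)
     * @param skipped receives the ids of the documents that were skipped
     * @return the JSON sources of the documents that were serialized successfully
     */
    static List<String> buildRequests(List<String> docIds,
                                      Function<String, String> writer,
                                      List<String> skipped) {
        List<String> requests = new ArrayList<>();
        for (String id : docIds) {
            try {
                requests.add(writer.apply(id));
            } catch (MissingBlobException e) {
                // 1) log the current failure and its cause
                log.log(Level.WARNING, "Skipping indexing of document " + id + ": " + e.getMessage(), e);
                skipped.add(id);
                // 2) continue the indexing with the next document
            }
        }
        return requests;
    }
}
```

Catching only the blob-related exception keeps other, unexpected failures visible; whether the real fix is this narrow is not stated in the ticket.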

As a side note, some failures are already handled:

- missing document
- incorrect indexing command

People

Assignee: Gethin James
Reporter: Thierry Martins
Participants: Benoit Delbosc, Gethin James, Jenkins, Thierry Martins

Dates

Created: 2018-05-11 12:18
Updated: 2018-06-26 07:29
Resolved: 2018-06-26 07:29
