- Type: Bug
- Status: In Progress
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 10.10
- Fix Version/s: 10.10-HF73, 2021.x, 2023.x
- Component/s: BlobManager
If the GC crashes with an error like the one below:
Caused by: com.mongodb.MongoCursorNotFoundException: Query failed with error code -5 and error message 'Cursor 8319385976274736211 not found on server prod-xxx-ged-mdb-101:27017' on server prod-xxx-ged-mdb-101:27017
    at com.mongodb.operation.QueryHelper.translateCommandException(QueryHelper.java:27) ~[mongo-java-driver-3.12.1.jar:?]
    at com.mongodb.operation.QueryBatchCursor.getMore(QueryBatchCursor.java:267) ~[mongo-java-driver-3.12.1.jar:?]
    at com.mongodb.operation.QueryBatchCursor.hasNext(QueryBatchCursor.java:138) ~[mongo-java-driver-3.12.1.jar:?]
    at com.mongodb.client.internal.MongoBatchCursorAdapter.hasNext(MongoBatchCursorAdapter.java:54) ~[mongo-java-driver-3.12.1.jar:?]
    at com.mongodb.client.internal.Java8ForEachHelper.forEach(Java8ForEachHelper.java:29) ~[mongo-java-driver-3.12.1.jar:?]
    at com.mongodb.client.internal.Java8FindIterableImpl.forEach(Java8FindIterableImpl.java:40) ~[mongo-java-driver-3.12.1.jar:?]
    at org.nuxeo.ecm.core.storage.mongodb.MongoDBRepository.markReferencedBinaries(MongoDBRepository.java:1060) ~[nuxeo-core-storage-mongodb-10.10-HF65.jar:?]
    at org.nuxeo.ecm.core.storage.dbs.DBSCachingRepository.markReferencedBinaries(DBSCachingRepository.java:449) ~[nuxeo-core-storage-dbs-10.10-HF65.jar:?]
    at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.lambda$garbageCollectBinaries$3(DocumentBlobManagerComponent.java:420) ~[nuxeo-core-10.10-HF65.jar:?]
    at org.nuxeo.runtime.transaction.TransactionHelper.runInTransaction(TransactionHelper.java:667) ~[nuxeo-runtime-jtajca-10.10-HF65.jar:?]
    at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.runInTransaction(DocumentBlobManagerComponent.java:454) ~[nuxeo-core-10.10-HF65.jar:?]
    at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.garbageCollectBinaries(DocumentBlobManagerComponent.java:405) ~[nuxeo-core-10.10-HF65.jar:?]
When the GC is run again, the value of startTime is still non-zero, because it is only reset to 0 when the GC ends properly. As a direct consequence, once a previous run has failed, every subsequent run fails the in-progress check with the error message "Already Started", even if no GC is actually running.
The GC code should be wrapped in a try/finally block that resets startTime if the GC is still marked as in progress when the block exits.
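The proposed guard could look roughly like the sketch below. This is a minimal illustration, not the actual Nuxeo patch: the class SketchGarbageCollector and its start, stop, reset and run methods are hypothetical stand-ins, and the only detail taken from this ticket is that a non-zero startTime means "in progress" and that a second start fails with "Already started".

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a binary garbage collector; not the Nuxeo class. */
class SketchGarbageCollector {

    // 0 means "no GC in progress", mirroring the startTime convention described in this ticket.
    private final AtomicLong startTime = new AtomicLong(0);

    boolean isInProgress() {
        return startTime.get() != 0;
    }

    void start() {
        // Refuse to start while a previous run is still marked as in progress.
        if (!startTime.compareAndSet(0, System.currentTimeMillis())) {
            throw new IllegalStateException("Already started");
        }
    }

    /** Normal end of a run: clears the in-progress marker. */
    void stop() {
        startTime.set(0);
    }

    /** Cleanup after a failed run: clears the marker without doing any sweep work. */
    void reset() {
        startTime.set(0);
    }

    /** Runs the mark phase and guarantees the in-progress marker is cleared afterwards. */
    void run(Runnable markPhase) {
        start();
        try {
            markPhase.run(); // may throw, e.g. when a database cursor expires
            stop();
        } finally {
            // Only true if markPhase threw before stop() was reached.
            if (isInProgress()) {
                reset();
            }
        }
    }
}
```

With this shape, a run whose mark phase throws still propagates the original exception to the caller, but the next call to run starts normally instead of failing with "Already started".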