Nuxeo Platform / NXP-31716

Reset start time when the orphan binary GC crashes before the end

    Details

    • Backlog priority:
      300
    • Sprint:
      nxsupport 16
    • Story Points:
      1

      Description

      If the orphan binary GC crashes with an error like the one below:

      Caused by: com.mongodb.MongoCursorNotFoundException: Query failed with error code -5 and error message 'Cursor 8319385976274736211 not found on server prod-xxx-ged-mdb-101:27017' on server prod-xxx-ged-mdb-101:27017
      	at com.mongodb.operation.QueryHelper.translateCommandException(QueryHelper.java:27) ~[mongo-java-driver-3.12.1.jar:?]
      	at com.mongodb.operation.QueryBatchCursor.getMore(QueryBatchCursor.java:267) ~[mongo-java-driver-3.12.1.jar:?]
      	at com.mongodb.operation.QueryBatchCursor.hasNext(QueryBatchCursor.java:138) ~[mongo-java-driver-3.12.1.jar:?]
      	at com.mongodb.client.internal.MongoBatchCursorAdapter.hasNext(MongoBatchCursorAdapter.java:54) ~[mongo-java-driver-3.12.1.jar:?]
      	at com.mongodb.client.internal.Java8ForEachHelper.forEach(Java8ForEachHelper.java:29) ~[mongo-java-driver-3.12.1.jar:?]
      	at com.mongodb.client.internal.Java8FindIterableImpl.forEach(Java8FindIterableImpl.java:40) ~[mongo-java-driver-3.12.1.jar:?]
      	at org.nuxeo.ecm.core.storage.mongodb.MongoDBRepository.markReferencedBinaries(MongoDBRepository.java:1060) ~[nuxeo-core-storage-mongodb-10.10-HF65.jar:?]
      	at org.nuxeo.ecm.core.storage.dbs.DBSCachingRepository.markReferencedBinaries(DBSCachingRepository.java:449) ~[nuxeo-core-storage-dbs-10.10-HF65.jar:?]
      	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.lambda$garbageCollectBinaries$3(DocumentBlobManagerComponent.java:420) ~[nuxeo-core-10.10-HF65.jar:?]
      	at org.nuxeo.runtime.transaction.TransactionHelper.runInTransaction(TransactionHelper.java:667) ~[nuxeo-runtime-jtajca-10.10-HF65.jar:?]
      	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.runInTransaction(DocumentBlobManagerComponent.java:454) ~[nuxeo-core-10.10-HF65.jar:?]
      	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.garbageCollectBinaries(DocumentBlobManagerComponent.java:405) ~[nuxeo-core-10.10-HF65.jar:?]
      

      When the GC is run again, startTime is still set, because it is only reset to 0 when the GC ends properly. As a direct consequence, once a previous run has failed, every new run fails the start check with the error message "Already Started", even if no GC is currently in progress.

      https://github.com/nuxeo/nuxeo/blob/release-10.10-HF60/nuxeo-core/nuxeo-core-api/src/main/java/org/nuxeo/ecm/core/blob/binary/LocalBinaryManager.java#L275

      The code should be wrapped in a try/finally block that resets startTime if the GC is still marked as in progress at the end.
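
A minimal sketch of the proposed try/finally guard, under stated assumptions: the class and method names below (GcResetSketch, SketchGarbageCollector, runGc, reset) are hypothetical stand-ins and are not the actual Nuxeo classes. The sketch only models the startTime bookkeeping described above, where 0 means that no GC is in progress.

```java
/** Hypothetical, simplified model of the GC start/stop bookkeeping; not Nuxeo code. */
final class GcResetSketch {

    /** Stand-in for the collector state: startTime == 0 means "no GC in progress". */
    static final class SketchGarbageCollector {
        private volatile long startTime;

        boolean isInProgress() {
            return startTime != 0;
        }

        void start() {
            if (startTime != 0) {
                throw new IllegalStateException("Already started");
            }
            startTime = System.currentTimeMillis();
        }

        void stop() {
            if (startTime == 0) {
                throw new IllegalStateException("Not started");
            }
            startTime = 0; // normal end of the GC
        }

        /** Assumed helper: clears the in-progress marker after a crash. */
        void reset() {
            startTime = 0;
        }
    }

    /**
     * Runs one GC pass. The marking step is passed in as a Runnable so that a
     * failure (for example a lost MongoDB cursor) can be simulated.
     */
    static void runGc(SketchGarbageCollector gc, Runnable markReferencedBinaries) {
        gc.start();
        try {
            markReferencedBinaries.run();
            gc.stop();
        } finally {
            // If the marking step threw, stop() was never reached: clear the
            // marker so that the next run is not rejected by start().
            if (gc.isInProgress()) {
                gc.reset();
            }
        }
    }
}
```

With this shape, the original exception still propagates to the caller, but a later call to runGc on the same collector can start normally.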

              People

              • Assignee:
                tmartins Thierry Martins
                Reporter:
                tmartins Thierry Martins

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0m
                  Logged: 1h