Nuxeo Platform / NXP-27901

Aspera complete worker timeout with big files

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: ADDONS_10.10
    • Component/s: Aspera Connector

      Description

      A test was run with a 140 GB file uploaded via Aspera to a Nuxeo transient store (an S3 bucket in this case).

      When the document was created afterward, the Aspera completion worker failed with a transaction timeout:

      2019-08-14T20:54:35,955 WARN  [JvmGcMonitorService] [gc][787273] overhead, spent [848ms] collecting in the last [1.4s]
      2019-08-15T05:54:07,654 ERROR [WorkManagerImpl] Uncaught error on thread: Nuxeo-Work-asperaCompletion-1, current work might be lost, WorkManager metrics might be corrupted.
      org.nuxeo.ecm.core.api.NuxeoException: Work failed after 0 retries, class=class com.nuxeo.aspera.connector.service.AsperaCompleteWork id=transfer:211f0955-47cf-4b54-a656-a0e77211d108:file:0: category=asperaCompletion title=Aspera Completion
          at org.nuxeo.ecm.core.work.AbstractWork.workFailed(AbstractWork.java:439) ~[nuxeo-core-event-10.10-HF10.jar:?]
          at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:395) ~[nuxeo-core-event-10.10-HF10.jar:?]
          at org.nuxeo.ecm.core.work.WorkHolder.run(WorkHolder.java:57) ~[nuxeo-core-event-10.10-HF10.jar:?]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
          at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
      Caused by: org.nuxeo.runtime.transaction.TransactionRuntimeException: Transaction has timed out
          at org.nuxeo.runtime.transaction.TransactionHelper.checkTransactionTimeout(TransactionHelper.java:223) ~[nuxeo-runtime-jtajca-10.10.jar:?]
          at org.nuxeo.ecm.core.api.local.LocalSession.getSession(LocalSession.java:108) ~[nuxeo-core-10.10-HF08.jar:?]
          at org.nuxeo.ecm.core.api.AbstractSession.resolveReference(AbstractSession.java:332) ~[nuxeo-core-10.10-HF08.jar:?]
          at org.nuxeo.ecm.core.api.AbstractSession.saveDocument(AbstractSession.java:1501) ~[nuxeo-core-10.10-HF08.jar:?]
          at com.nuxeo.aspera.connector.adapter.Transfer.save(Transfer.java:248) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
          at com.nuxeo.aspera.connector.adapter.Transfer.save(Transfer.java:85) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
          at com.nuxeo.aspera.connector.service.AsperaCompleteWork.work(AsperaCompleteWork.java:89) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
          at org.nuxeo.ecm.core.work.AbstractWork.runWorkWithTransaction(AbstractWork.java:493) ~[nuxeo-core-event-10.10-HF10.jar:?]
          at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:383) ~[nuxeo-core-event-10.10-HF10.jar:?]
      

      The timeout may be caused by the copy of the blob from the transient bucket to the main bucket. Possible ways forward:

      • Increase the transaction timeout (workaround).
      • Optimise the worker code?
      • Update the transient store manager itself so that big blobs do not trigger these timeouts?
      • Something else?
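
To make the options above concrete, here is a minimal, self-contained sketch of the suspected failure mode. It is a toy model, not Nuxeo code: `TimeoutSketch`, `FakeClock`, `ToyTransaction` and both `copy...` methods are hypothetical names, and the durations used with it are illustrative. It assumes, as the stack trace suggests, that the timeout is only detected when the session is next used for the save, after the long blob copy has already finished.

```java
import java.time.Duration;

/** Toy model (hypothetical, not Nuxeo code) of a long copy inside a time-limited transaction. */
final class TimeoutSketch {

    /** Fake clock so the example runs instantly instead of waiting for hours. */
    static final class FakeClock {
        private long nowNanos;

        long now() {
            return nowNanos;
        }

        void advance(Duration d) {
            nowNanos += d.toNanos();
        }
    }

    /** A transaction whose deadline is fixed when it starts. */
    static final class ToyTransaction {
        private final FakeClock clock;
        private final long deadlineNanos;

        ToyTransaction(FakeClock clock, Duration timeout) {
            this.clock = clock;
            this.deadlineNanos = clock.now() + timeout.toNanos();
        }

        /** Fails if the deadline has passed; stands in for the check done before the save. */
        void checkTimeout() {
            if (clock.now() > deadlineNanos) {
                throw new IllegalStateException("Transaction has timed out");
            }
        }
    }

    /** Copy runs inside the transaction: the deadline can pass before the save. */
    static boolean copyInsideTransaction(FakeClock clock, Duration timeout, Duration copyTime) {
        ToyTransaction tx = new ToyTransaction(clock, timeout);
        clock.advance(copyTime); // simulated blob copy between buckets
        return trySave(tx);
    }

    /** Copy runs first; only the short save happens inside the transaction. */
    static boolean copyBeforeTransaction(FakeClock clock, Duration timeout, Duration copyTime) {
        clock.advance(copyTime); // simulated blob copy, no transaction open yet
        ToyTransaction tx = new ToyTransaction(clock, timeout);
        clock.advance(Duration.ofSeconds(1)); // simulated short save
        return trySave(tx);
    }

    /** Returns true if the simulated save passes the timeout check. */
    private static boolean trySave(ToyTransaction tx) {
        try {
            tx.checkTimeout();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

With an illustrative 300-second timeout and a two-hour copy, `copyInsideTransaction` reports a failed save, while `copyBeforeTransaction` succeeds. Raising the timeout above the copy time also makes the first variant succeed, which corresponds to the workaround in the first bullet. Whether the real worker can be restructured this way depends on how the transient store and the document save interact, which this sketch does not model.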


    Time Tracking

    • Original Estimate: Not Specified
    • Remaining Estimate: 0m
    • Time Spent: 1h