- Type: Task
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: ADDONS_10.10
- Component/s: Aspera Connector
- Epic Link:
- Tags:
- Sprint: NOS 11.1.16 - 2019-08 2, NOS 11.1.17 - 2019-09 1
A test was run with a 140 GB file uploaded via Aspera to a Nuxeo transient store (here an S3 bucket).
When the document was created afterward, the asperaCompletion worker failed with a transaction timeout:
2019-08-14T20:54:35,955 WARN [JvmGcMonitorService] [gc][787273] overhead, spent [848ms] collecting in the last [1.4s]
2019-08-15T05:54:07,654 ERROR [WorkManagerImpl] Uncaught error on thread: Nuxeo-Work-asperaCompletion-1, current work might be lost, WorkManager metrics might be corrupted.
org.nuxeo.ecm.core.api.NuxeoException: Work failed after 0 retries, class=class com.nuxeo.aspera.connector.service.AsperaCompleteWork id=transfer:211f0955-47cf-4b54-a656-a0e77211d108:file:0: category=asperaCompletion title=Aspera Completion
    at org.nuxeo.ecm.core.work.AbstractWork.workFailed(AbstractWork.java:439) ~[nuxeo-core-event-10.10-HF10.jar:?]
    at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:395) ~[nuxeo-core-event-10.10-HF10.jar:?]
    at org.nuxeo.ecm.core.work.WorkHolder.run(WorkHolder.java:57) ~[nuxeo-core-event-10.10-HF10.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: org.nuxeo.runtime.transaction.TransactionRuntimeException: Transaction has timed out
    at org.nuxeo.runtime.transaction.TransactionHelper.checkTransactionTimeout(TransactionHelper.java:223) ~[nuxeo-runtime-jtajca-10.10.jar:?]
    at org.nuxeo.ecm.core.api.local.LocalSession.getSession(LocalSession.java:108) ~[nuxeo-core-10.10-HF08.jar:?]
    at org.nuxeo.ecm.core.api.AbstractSession.resolveReference(AbstractSession.java:332) ~[nuxeo-core-10.10-HF08.jar:?]
    at org.nuxeo.ecm.core.api.AbstractSession.saveDocument(AbstractSession.java:1501) ~[nuxeo-core-10.10-HF08.jar:?]
    at com.nuxeo.aspera.connector.adapter.Transfer.save(Transfer.java:248) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
    at com.nuxeo.aspera.connector.adapter.Transfer.save(Transfer.java:85) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
    at com.nuxeo.aspera.connector.service.AsperaCompleteWork.work(AsperaCompleteWork.java:89) ~[nuxeo-aspera-core-1.0.2-SNAPSHOT.jar:?]
    at org.nuxeo.ecm.core.work.AbstractWork.runWorkWithTransaction(AbstractWork.java:493) ~[nuxeo-core-event-10.10-HF10.jar:?]
    at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:383) ~[nuxeo-core-event-10.10-HF10.jar:?]
This may be due to the copy of the blob from the transient bucket to the main bucket, which would run inside the worker's transaction. Options:
- We should increase the transaction timeout (workaround)
- We should optimise the worker code?
- We should update the transient store manager itself to prevent these timeouts for big blobs?
- Something else?
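
To make the first two options concrete, here is a minimal, self-contained sketch of why the order of "copy" and "open transaction" matters. It is not Nuxeo code: `FakeClock`, `Tx`, `copyBlob` and the durations (a 3600 s copy, a 300 s timeout) are hypothetical stand-ins chosen for illustration, and the assumption that the blob copy happens inside the worker's transaction comes from the hypothesis above, not from a confirmed diagnosis.

```java
public class TransactionScopeSketch {

    /** Hypothetical fake clock (in seconds) so the example is deterministic. */
    static final class FakeClock {
        private long seconds;
        long now() { return seconds; }
        void advance(long s) { seconds += s; }
    }

    /** Hypothetical stand-in for a transaction with a timeout; not a Nuxeo class. */
    static final class Tx {
        private final FakeClock clock;
        private final long deadline;

        Tx(FakeClock clock, long timeoutSeconds) {
            this.clock = clock;
            this.deadline = clock.now() + timeoutSeconds;
        }

        /** Throws once the deadline has passed, similar in spirit to the check seen in the trace. */
        void checkTimeout() {
            if (clock.now() > deadline) {
                throw new IllegalStateException("Transaction has timed out");
            }
        }
    }

    /** Simulates the slow transient-to-main blob copy by advancing the clock. */
    static void copyBlob(FakeClock clock, long copySeconds) {
        clock.advance(copySeconds);
    }

    /** Simulates the document save: succeeds only if the transaction is still alive. */
    static boolean trySave(Tx tx) {
        try {
            tx.checkTimeout();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Copy inside the transaction: the deadline is fixed before the slow copy starts. */
    static boolean copyInsideTx(long copySeconds, long timeoutSeconds) {
        FakeClock clock = new FakeClock();
        Tx tx = new Tx(clock, timeoutSeconds);
        copyBlob(clock, copySeconds);
        return trySave(tx);
    }

    /** Copy first, then open a short transaction only for the save. */
    static boolean copyBeforeTx(long copySeconds, long timeoutSeconds) {
        FakeClock clock = new FakeClock();
        copyBlob(clock, copySeconds);
        Tx tx = new Tx(clock, timeoutSeconds);
        return trySave(tx);
    }

    public static void main(String[] args) {
        System.out.println(copyInsideTx(3600, 300));  // false: copy outlives the timeout
        System.out.println(copyInsideTx(3600, 7200)); // true: larger timeout (workaround)
        System.out.println(copyBeforeTx(3600, 300));  // true: transaction only covers the save
    }
}
```

In this model, raising the timeout only helps as long as the copy stays shorter than the new value, whereas scoping the transaction to the save alone makes the outcome independent of blob size. Whether the Aspera worker can actually be restructured this way would need to be checked against the connector code.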