- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 2021.17
- Component/s: S3, TransientStore
- Tags:
- Team: PLATFORM
- Sprint: nxplatform #56, nxplatform #57
- Story Points: 3
Seen in production: a transaction timeout in the Transient Store GC work, followed by a Kafka stream failure because the poll interval was exceeded:
common: Exception in processLoop: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms,....
Skip Work in failure: id: 799003380948.1989383214, title: Transient Store GC, commonPool-02,in:18,inCheckpoint:18,out:0,lastRead:1637301575188,lastTimer:0,wm:214604370868699137,loop:2698,rebalance assigned
org.nuxeo.runtime.transaction.TransactionRuntimeException: Unable to commit: Transaction timeout
    at org.nuxeo.runtime.transaction.TransactionHelper.commitOrRollbackTransaction(TransactionHelper.java:439) ~[nuxeo-runtime-jtajca-10.10-HF41.jar:?]
    at org.nuxeo.ecm.core.work.AbstractWork.runWorkWithTransaction(AbstractWork.java:522) ~[nuxeo-core-event-10.10-HF44.jar:?]
    at org.nuxeo.ecm.core.work.AbstractWork.run(AbstractWork.java:383) ~[nuxeo-core-event-10.10-HF44.jar:?]
Caused by: javax.transaction.RollbackException: Unable to commit: Transaction timeout
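TransientStorageGCWork should be time-boxed: there is no point in trying to process everything in a single run. Whatever is not collected within the time budget can be left for the next GC run, so that the work commits before the transaction timeout and the Kafka consumer gets back to poll() within max.poll.interval.ms.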
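
As an illustration of the time-boxing idea, here is a minimal sketch in plain Java. It is not the actual fix and does not use the Nuxeo API: the class `TimeBoxedGc`, the method `runTimeBoxed`, and the 30-second budget are hypothetical names and values chosen for the example. The assumption is that the budget would be set well below both the transaction timeout and max.poll.interval.ms.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical illustration only: not the Nuxeo implementation.
public class TimeBoxedGc {

    /**
     * Processes pending entries until the queue is empty or the time budget
     * is spent, whichever comes first. Entries still in the queue when the
     * budget runs out are left for a later run.
     *
     * @param nanoClock monotonic clock in nanoseconds (System::nanoTime in
     *                  production, a fake clock in tests)
     * @return the number of entries processed in this run
     */
    public static <T> int runTimeBoxed(Deque<T> pending, Consumer<T> action,
            Duration budget, LongSupplier nanoClock) {
        long deadline = nanoClock.getAsLong() + budget.toNanos();
        int processed = 0;
        // The deadline is checked before each entry, so a run overshoots the
        // budget by at most the duration of one action.
        while (!pending.isEmpty() && nanoClock.getAsLong() - deadline < 0) {
            action.accept(pending.poll());
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Deque<String> pending = new ArrayDeque<>();
        for (int i = 0; i < 1000; i++) {
            pending.add("entry-" + i);
        }
        // Assumed budget: 30 seconds, a placeholder value for the example.
        int done = runTimeBoxed(pending, key -> { /* delete the entry here */ },
                Duration.ofSeconds(30), System::nanoTime);
        System.out.println(done + " processed, " + pending.size() + " remaining");
    }
}
```

The clock is passed in as a parameter so the cutoff can be tested deterministically. A single slow action can still exceed the budget, so in practice the budget would need some margin below the real timeouts.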