Nuxeo Platform / NXP-22795

Fix random nuxeo-mqueues TestPatternQueuingChronicle hangs


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 9.3-SNAPSHOT
    • Fix Version/s: 9.3
    • Component/s: Events / Works

      Description

      The test org.nuxeo.ecm.platform.importer.mqueues.tests.pattern.TestPatternQueuingChronicle randomly hangs for more than 2 hours, which causes the CI job to time out. In the log below, the last test output is at 02:54:06 and the build is aborted at 06:26:20:

      02:54:04 02:54:04,685 [Nuxeo-ConsumerPool-00] WARN  [ConsumerPool] Consumers status: threads: 10, failure 0, messages committed: 15000, elapsed: 5.17s, throughput: 2901.92 msg/s
      02:54:04 02:54:04,696 [main-SendThread(localhost:2181)] WARN  [ClientCnxn$SendThread] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
      02:54:04 java.net.ConnectException: Connection refused
      02:54:04 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      02:54:04 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
      02:54:04 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
      02:54:04 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
      02:54:04 02:54:04,798 [main-SendThread(localhost:2181)] WARN  [ClientCnxn$SendThread] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
      02:54:04 java.net.ConnectException: Connection refused
      02:54:04 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      02:54:04 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
      02:54:04 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
      02:54:04 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
      02:54:06 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.871 sec - in org.nuxeo.ecm.platform.importer.mqueues.tests.pattern.TestPatternBoundedQueuingChronicle
      02:54:06 Running org.nuxeo.ecm.platform.importer.mqueues.tests.pattern.TestPatternQueuingKafka
      02:54:06 Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec - in org.nuxeo.ecm.platform.importer.mqueues.tests.pattern.TestPatternQueuingKafka
      02:54:06 Running org.nuxeo.ecm.platform.importer.mqueues.tests.pattern.TestPatternQueuingChronicle
      02:54:06 02:54:06,004 [Nuxeo-ConsumerPool-00] WARN  [AbstractCallablePool] Start Nuxeo-Consumer Pool on 1 thread(s).
      02:54:06 02:54:06,010 [Nuxeo-ConsumerPool-00] WARN  [ConsumerPool] Consumers status: threads: 1, failure 0, messages committed: 1, elapsed: 0.00s, throughput: 1000.00 msg/s
      02:54:06 02:54:06,019 [Nuxeo-ConsumerPool-00] WARN  [AbstractCallablePool] Start Nuxeo-Consumer Pool on 2 thread(s).
      02:54:06 02:54:06,023 [main] WARN  [TestPatternQueuing] Close the MQManager (errors expected)
      02:54:06 02:54:06,123 [Nuxeo-Consumer-00-1-batch now] ERROR [AbstractCallablePool] Exception catch in runner: The tailer has been closed.
      02:54:06 java.lang.IllegalStateException: The tailer has been closed.
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.mqueues.chronicle.ChronicleMQTailer.read(ChronicleMQTailer.java:98)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.mqueues.chronicle.ChronicleMQTailer.read(ChronicleMQTailer.java:89)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.acceptBatch(ConsumerRunner.java:265)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.processBatch(ConsumerRunner.java:222)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.processBatchWithRetry(ConsumerRunner.java:185)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.consumerLoop(ConsumerRunner.java:169)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.call(ConsumerRunner.java:120)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.ConsumerRunner.call(ConsumerRunner.java:57)
      02:54:06 	at org.nuxeo.ecm.platform.importer.mqueues.pattern.consumer.internals.AbstractCallablePool.lambda$runPool$1(AbstractCallablePool.java:91)
      02:54:06 	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
      02:54:06 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      02:54:06 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      02:54:06 	at java.lang.Thread.run(Thread.java:748)
      02:54:06 02:54:06,124 [Nuxeo-ConsumerPool-00] ERROR [AbstractCallablePool] End of consumer in error: java.lang.IllegalStateException: The tailer has been closed.java.util.concurrent.CompletableFuture@1e1b3677[Completed exceptionally]
      06:26:20 Build timed out (after 300 minutes). Marking the build as aborted.
      

      Seen only in:
      https://qa.nuxeo.org/jenkins/job/Deploy/job/IT-nuxeo-master-build/533/console
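
      The log above shows one consumer ending in error while the test never returns. As a general illustration only (this is not the nuxeo-mqueues API and not necessarily the fix applied for this issue), a test can bound its wait on a group of consumer futures so that a stuck consumer fails quickly instead of running into the CI timeout. The names BoundedWait, Outcome and await are hypothetical; the sketch uses only java.util.concurrent and assumes the consumers are exposed as CompletableFuture instances.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of nuxeo-mqueues.
public class BoundedWait {

    /** Possible results of waiting on a group of consumer futures. */
    public enum Outcome { COMPLETED, FAILED, TIMED_OUT }

    /**
     * Waits at most the given time for all futures to finish.
     * Returns FAILED if every future finished and at least one failed,
     * and TIMED_OUT (after cancelling the rest) if any future is still pending.
     */
    public static Outcome await(List<? extends CompletableFuture<?>> futures,
                                long timeout, TimeUnit unit) throws InterruptedException {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
        try {
            all.get(timeout, unit);
            return Outcome.COMPLETED;
        } catch (ExecutionException e) {
            // All futures are done, but at least one completed exceptionally.
            return Outcome.FAILED;
        } catch (TimeoutException e) {
            // At least one future never completed: cancel instead of hanging.
            futures.forEach(f -> f.cancel(true));
            return Outcome.TIMED_OUT;
        }
    }
}
```

      With such a guard, a test whose remaining consumer never completes would report TIMED_OUT after the chosen delay rather than blocking until the build is aborted.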

    Time Tracking

    • Original Estimate: Not Specified
    • Remaining Estimate: 0m
    • Time Spent: 2d