Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-27559

Chronicle-Queue unit tests randomly failing in Jenkins X

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 11.3, 2021.0
    • Component/s: CI/CD
    • Team:
      PLATFORM
    • Sprint:
      nxplatform 11.1.15, nxplatform 11.1.23, nxplatform 11.1.25, nxplatform 11.1.24, nxplatform 11.1.26, nxplatform #16, nxplatform #17
    • Story Points:
      1

      Description

      See http://jenkins.platform.34.74.59.50.nip.io/job/nuxeo/job/nuxeo/job/task-NXP-27430-platorm-simple-pipeline/50/consoleFull

      [ERROR] Tests run: 25, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 55.791 s <<< FAILURE! - in org.nuxeo.lib.stream.tests.computation.TestLogStreamProcessorChronicle
      [ERROR] testComplexTopoManyRecords(org.nuxeo.lib.stream.tests.computation.TestLogStreamProcessorChronicle)  Time elapsed: 0.333 s  <<< ERROR!
      java.lang.IllegalStateException: NOT_REACHED index: 468a00000000
      	at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreTailer.originalToEnd(SingleChronicleQueueExcerpts.java:1642)
      	at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreTailer.toEnd(SingleChronicleQueueExcerpts.java:1548)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogOffsetTracker.readLastCommittedOffset(ChronicleLogOffsetTracker.java:123)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogOffsetTracker.getLastCommittedOffset(ChronicleLogOffsetTracker.java:109)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogTailer.toLastCommitted(ChronicleLogTailer.java:170)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogTailer.<init>(ChronicleLogTailer.java:82)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogAppender.createTailer(ChronicleLogAppender.java:315)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogManager.lambda$doCreateTailer$3(ChronicleLogManager.java:215)
      	at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogManager.doCreateTailer(ChronicleLogManager.java:214)
      	at org.nuxeo.lib.stream.log.internals.AbstractLogManager.createTailer(AbstractLogManager.java:96)
      	at org.nuxeo.lib.stream.computation.log.LogStreamManager.createTailer(LogStreamManager.java:117)
      	at org.nuxeo.lib.stream.computation.log.ComputationRunner.<init>(ComputationRunner.java:113)
      	at org.nuxeo.lib.stream.computation.log.ComputationPool.lambda$start$0(ComputationPool.java:88)
      	at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
      	at org.nuxeo.lib.stream.computation.log.ComputationPool.start(ComputationPool.java:87)
      	at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
      	at org.nuxeo.lib.stream.computation.log.LogStreamProcessor.start(LogStreamProcessor.java:97)
      	at org.nuxeo.lib.stream.tests.computation.TestStreamProcessor.testComplexTopo(TestStreamProcessor.java:239)
      	at org.nuxeo.lib.stream.tests.computation.TestStreamProcessor.testComplexTopoManyRecords(TestStreamProcessor.java:286)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
      	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      	at org.junit.runners.Suite.runChild(Suite.java:128)
      	at org.junit.runners.Suite.runChild(Suite.java:27)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      	at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
      	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
      	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
      	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
      	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
      	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158)
      	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
      	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
      	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
      	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
      
      [INFO] Running org.nuxeo.lib.stream.tests.computation.TestWatermark
      [WARNING] Tests run: 11, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.001 s - in org.nuxeo.lib.stream.tests.log.TestLogKafka
      [INFO] Running org.nuxeo.lib.stream.tests.log.TestLogChronicle
      23:04:27,757 [C4Pool-03,in:53,inCheckpoint:53,out:53,lastRead:1560294267657,lastTimer:0,wm:204510890103144449,loop:52,checkpoint] ERROR [ComputationRunner] C4: Exception in processLoop: The tailer has been closed.
      java.lang.IllegalStateException: The tailer has been closed.
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogTailer.read(ChronicleLogTailer.java:114) ~[classes/:?]
      	at org.nuxeo.lib.stream.log.chronicle.ChronicleLogTailer.read(ChronicleLogTailer.java:97) ~[classes/:?]
      	at org.nuxeo.lib.stream.computation.log.ComputationRunner.processRecord(ComputationRunner.java:276) ~[classes/:?]
      	at org.nuxeo.lib.stream.computation.log.ComputationRunner.processLoop(ComputationRunner.java:189) ~[classes/:?]
      	at org.nuxeo.lib.stream.computation.log.ComputationRunner.run(ComputationRunner.java:148) [classes/:?]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
      	at java.lang.Thread.run(Thread.java:834) [?:?]
      

      According to Benoit Delbosc:

      It's a Chronicle-Queue issue (SCQIndexing) that we have on QA from times to times, see NXP-26353.
      Happens when the slaves are loaded or the FS is in trouble.
      In a Docker environment, the /tmp directory should be on a volume.
      Currently for this Jenkins X build it is in the "maven" container itself.
      So it's using Docker overlay, see https://docs.docker.com/storage/storagedriver/select-storage-driver/, which doesn't get along well with Chronicle-Queue.
      Moreover, using a storage driver for /tmp is likely to slow down the whole process.

      I've tried to mount /tmp on an EmptyDir Kubernetes volume, seeĀ https://github.com/nuxeo/jx-gcp/blob/task-NXP-27503-configure-maven-java11-nuxeo-pod-template/teams/platform/myvalues.yaml#L18, but the build still randomly fails.

        Attachments

        1. image-2019-06-12-10-21-31-878.png
          51 kB
          Benoit Delbosc
        2. image-2019-06-12-10-23-44-008.png
          73 kB
          Benoit Delbosc
        3. image-2019-06-12-10-24-35-210.png
          73 kB
          Benoit Delbosc
        4. image-2019-06-12-10-35-02-911.png
          75 kB
          Benoit Delbosc
        5. TestBlobImportChronicle.log
          42 kB
          Antoine Taillefer
        6. TestLogStreamProcessorChronicle.testComplexTopoManyRecords.txt
          36 kB
          Antoine Taillefer

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - 0 minutes
                  0m
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 2 days, 2 hours, 30 minutes
                  2d 2h 30m