Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-31423

Improve timeout detection on Kafka topic creation

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2023.0, 2021.29
    • Component/s: Streams
    • Release Notes Summary:
      Properly catch kafka admin request timeout
    • Tags:
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #75
    • Story Points:
      2

      Description

      The Kafka admin client (used for creating topic) use timeouts provided by `request.timeout.ms` or `default.api.timeout.ms`.
      If the timeout occurs it is encapsulated in an ExecutionException, we also have an hardcoded of 5min on the nuxeo code.

      [2022-11-16T14:01:41.427Z] org.nuxeo.lib.stream.StreamRuntimeException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: The request timed out.
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.lib.stream.log.kafka.KafkaUtils.createTopic(KafkaUtils.java:181) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.lib.stream.log.kafka.KafkaLogManager.create(KafkaLogManager.java:103) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.lib.stream.log.internals.AbstractLogManager.createIfNotExists(AbstractLogManager.java:75) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.lib.stream.log.UnifiedLogManager.createIfNotExists(UnifiedLogManager.java:122) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.stream.StreamServiceImpl.lambda$createLogIfNotExists$0(StreamServiceImpl.java:103) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.stream.StreamServiceImpl.createLogIfNotExists(StreamServiceImpl.java:101) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.stream.StreamServiceImpl.start(StreamServiceImpl.java:114) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.model.impl.RegistrationInfoImpl.start(RegistrationInfoImpl.java:372) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.model.impl.ComponentManagerImpl.startComponent(ComponentManagerImpl.java:758) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.model.impl.ComponentManagerImpl.startComponents(ComponentManagerImpl.java:740) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.model.impl.ComponentManagerImpl.start(ComponentManagerImpl.java:841) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.osgi.OSGiRuntimeService.startComponents(OSGiRuntimeService.java:470) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.osgi.OSGiRuntimeService.frameworkEvent(OSGiRuntimeService.java:485) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.osgi.OSGiAdapter.fireFrameworkEvent(OSGiAdapter.java:223) ~[nuxeo-runtime-osgi-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.RuntimeHarnessImpl.fireFrameworkStarted(RuntimeHarnessImpl.java:215) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.RuntimeFeature.beforeRun(RuntimeFeature.java:144) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.FeaturesRunner.lambda$beforeRun$2(FeaturesRunner.java:250) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:300) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:286) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.FeaturesRunner.beforeRun(FeaturesRunner.java:250) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.runtime.test.runner.FeaturesRunner$BeforeClassStatement.evaluate(FeaturesRunner.java:384) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner.run(ParentRunner.java:413) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.Suite.runChild(Suite.java:128) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.Suite.runChild(Suite.java:27) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.junit.runners.ParentRunner.run(ParentRunner.java:413) [junit-4.13.1.jar:4.13.1]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158) [surefire-junit47-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) [surefire-booter-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) [surefire-booter-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) [surefire-booter-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) [surefire-booter-2.22.2.jar:2.22.2]
      [2022-11-16T14:01:41.427Z] Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: The request timed out.
      [2022-11-16T14:01:41.427Z] 	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) ~[kafka-clients-2.6.0.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) ~[kafka-clients-2.6.0.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:104) ~[kafka-clients-2.6.0.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272) ~[kafka-clients-2.6.0.jar:?]
      [2022-11-16T14:01:41.427Z] 	at org.nuxeo.lib.stream.log.kafka.KafkaUtils.createTopic(KafkaUtils.java:173) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
      

      Instead we should raise a StreamRuntimeException with proper message.

      Note that test on CI is using request.timeout.ms which is 10s and might not be enough depending on CI load:
      https://github.com/nuxeo/nuxeo-lts/blob/2021/modules/runtime/nuxeo-runtime-stream/src/test/resources/OSGI-INF/test-stream-kafka-contrib.xml#L8

            <admin>
              <property name="bootstrap.servers">${nuxeo.test.kafka.servers}</property>
              <property name="request.timeout.ms">10000</property>
            </admin>
      

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: