-
Type: Bug
-
Status: Resolved
-
Priority: Minor
-
Resolution: Fixed
-
Affects Version/s: None
-
Component/s: Streams
-
Release Notes Summary:Properly catch kafka admin request timeout
-
Tags:
-
Team:PLATFORM
-
Sprint:nxplatform #75
-
Story Points:2
The Kafka admin client (used for creating topic) use timeouts provided by `request.timeout.ms` or `default.api.timeout.ms`.
If the timeout occurs it is encapsulated in an ExecutionException, we also have an hardcoded of 5min on the nuxeo code.
[2022-11-16T14:01:41.427Z] org.nuxeo.lib.stream.StreamRuntimeException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: The request timed out. [2022-11-16T14:01:41.427Z] at org.nuxeo.lib.stream.log.kafka.KafkaUtils.createTopic(KafkaUtils.java:181) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.lib.stream.log.kafka.KafkaLogManager.create(KafkaLogManager.java:103) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.lib.stream.log.internals.AbstractLogManager.createIfNotExists(AbstractLogManager.java:75) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.lib.stream.log.UnifiedLogManager.createIfNotExists(UnifiedLogManager.java:122) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.stream.StreamServiceImpl.lambda$createLogIfNotExists$0(StreamServiceImpl.java:103) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.stream.StreamServiceImpl.createLogIfNotExists(StreamServiceImpl.java:101) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.stream.StreamServiceImpl.start(StreamServiceImpl.java:114) ~[nuxeo-runtime-stream-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.model.impl.RegistrationInfoImpl.start(RegistrationInfoImpl.java:372) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.model.impl.ComponentManagerImpl.startComponent(ComponentManagerImpl.java:758) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.model.impl.ComponentManagerImpl.startComponents(ComponentManagerImpl.java:740) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.model.impl.ComponentManagerImpl.start(ComponentManagerImpl.java:841) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.osgi.OSGiRuntimeService.startComponents(OSGiRuntimeService.java:470) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.osgi.OSGiRuntimeService.frameworkEvent(OSGiRuntimeService.java:485) ~[nuxeo-runtime-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.osgi.OSGiAdapter.fireFrameworkEvent(OSGiAdapter.java:223) ~[nuxeo-runtime-osgi-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.RuntimeHarnessImpl.fireFrameworkStarted(RuntimeHarnessImpl.java:215) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.RuntimeFeature.beforeRun(RuntimeFeature.java:144) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.FeaturesRunner.lambda$beforeRun$2(FeaturesRunner.java:250) ~[nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:300) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:286) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.FeaturesRunner.beforeRun(FeaturesRunner.java:250) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.runtime.test.runner.FeaturesRunner$BeforeClassStatement.evaluate(FeaturesRunner.java:384) [nuxeo-runtime-test-PR-608-2023.0-SNAPSHOT.jar:?] [2022-11-16T14:01:41.427Z] at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner.run(ParentRunner.java:413) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.Suite.runChild(Suite.java:128) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.Suite.runChild(Suite.java:27) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.junit.runners.ParentRunner.run(ParentRunner.java:413) [junit-4.13.1.jar:4.13.1] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158) [surefire-junit47-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) [surefire-booter-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) [surefire-booter-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) [surefire-booter-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) [surefire-booter-2.22.2.jar:2.22.2] [2022-11-16T14:01:41.427Z] Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: The request timed out. [2022-11-16T14:01:41.427Z] at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) ~[kafka-clients-2.6.0.jar:?] [2022-11-16T14:01:41.427Z] at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) ~[kafka-clients-2.6.0.jar:?] [2022-11-16T14:01:41.427Z] at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:104) ~[kafka-clients-2.6.0.jar:?] [2022-11-16T14:01:41.427Z] at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272) ~[kafka-clients-2.6.0.jar:?] [2022-11-16T14:01:41.427Z] at org.nuxeo.lib.stream.log.kafka.KafkaUtils.createTopic(KafkaUtils.java:173) ~[nuxeo-stream-PR-608-2023.0-SNAPSHOT.jar:?]
Instead we should raise a StreamRuntimeException with proper message.
Note that test on CI is using request.timeout.ms which is 10s and might not be enough depending on CI load:
https://github.com/nuxeo/nuxeo-lts/blob/2021/modules/runtime/nuxeo-runtime-stream/src/test/resources/OSGI-INF/test-stream-kafka-contrib.xml#L8
<admin> <property name="bootstrap.servers">${nuxeo.test.kafka.servers}</property> <property name="request.timeout.ms">10000</property> </admin>
- is related to
-
NXP-31444 Random Kafka timeout at topic creation or metadata propagation
- Resolved