- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 2.5.0, 3.0.0
- Component/s: CI/CD
- Tags:
- Team: AI
ES randomly fails the unit tests.
See https://jenkins.ai.dev.nuxeo.com/job/nuxeo/job/nuxeo-ai/job/sprint-3b/7/testReport/, for instance org.nuxeo.ai.bulk.BulkEnrichmentTest:
Error while invoking beforeRun on features: [org.nuxeo.ai.enrichment.EnrichmentTestFeature, org.nuxeo.runtime.test.runner.MDCFeature, org.nuxeo.runtime.test.runner.ConditionalIgnoreRule$Feature, org.nuxeo.runtime.test.runner.RandomBug$Feature, org.nuxeo.runtime.test.runner.WithFrameworkPropertyFeature, org.nuxeo.runtime.test.runner.RuntimeFeature, org.nuxeo.runtime.cluster.ClusterFeature, org.nuxeo.runtime.test.runner.TransactionalFeature, org.nuxeo.runtime.stream.RuntimeStreamFeature, org.nuxeo.ecm.core.api.local.DummyLoginFeature, org.nuxeo.ecm.core.work.WorkManagerFeature, org.nuxeo.ecm.core.bulk.CoreBulkFeature, org.nuxeo.ecm.core.test.CoreFeature, org.nuxeo.directory.test.DirectoryFeature, org.nuxeo.ecm.platform.test.UserManagerFeature, org.nuxeo.ecm.platform.test.PlatformFeature, org.nuxeo.ecm.automation.core.AutomationCoreFeature, org.nuxeo.ecm.automation.test.AutomationFeature, org.nuxeo.ecm.platform.test.NuxeoLoginFeature, org.nuxeo.runtime.test.runner.LogFeature, org.nuxeo.elasticsearch.test.RepositoryLightElasticSearchFeature, org.nuxeo.elasticsearch.test.RepositoryElasticSearchFeature]
Stack trace:
java.lang.AssertionError: Error while invoking beforeRun on features: [org.nuxeo.ai.enrichment.EnrichmentTestFeature, org.nuxeo.runtime.test.runner.MDCFeature, org.nuxeo.runtime.test.runner.ConditionalIgnoreRule$Feature, org.nuxeo.runtime.test.runner.RandomBug$Feature, org.nuxeo.runtime.test.runner.WithFrameworkPropertyFeature, org.nuxeo.runtime.test.runner.RuntimeFeature, org.nuxeo.runtime.cluster.ClusterFeature, org.nuxeo.runtime.test.runner.TransactionalFeature, org.nuxeo.runtime.stream.RuntimeStreamFeature, org.nuxeo.ecm.core.api.local.DummyLoginFeature, org.nuxeo.ecm.core.work.WorkManagerFeature, org.nuxeo.ecm.core.bulk.CoreBulkFeature, org.nuxeo.ecm.core.test.CoreFeature, org.nuxeo.directory.test.DirectoryFeature, org.nuxeo.ecm.platform.test.UserManagerFeature, org.nuxeo.ecm.platform.test.PlatformFeature, org.nuxeo.ecm.automation.core.AutomationCoreFeature, org.nuxeo.ecm.automation.test.AutomationFeature, org.nuxeo.ecm.platform.test.NuxeoLoginFeature, org.nuxeo.runtime.test.runner.LogFeature, org.nuxeo.elasticsearch.test.RepositoryLightElasticSearchFeature, org.nuxeo.elasticsearch.test.RepositoryElasticSearchFeature]
    at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:253)
    at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:225)
    at org.nuxeo.runtime.test.runner.FeaturesRunner.beforeRun(FeaturesRunner.java:189)
    at org.nuxeo.runtime.test.runner.FeaturesRunner$BeforeClassStatement.evaluate(FeaturesRunner.java:323)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.junit.runners.Suite.runChild(Suite.java:128)
    at org.junit.runners.Suite.runChild(Suite.java:27)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
    at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
    at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
    at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
    at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
    at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
    Suppressed: org.nuxeo.ecm.core.api.NuxeoException: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/nxutest/_refresh], status line [HTTP/1.1 429 Too Many Requests]
    {"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [909074512/866.9mb], which is larger than the limit of [906992025/864.9mb], real usage: [909074512/866.9mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=0/0b, accounting=7876/7.6kb]","bytes_wanted":909074512,"bytes_limit":906992025,"durability":"PERMANENT"}],"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [909074512/866.9mb], which is larger than the limit of [906992025/864.9mb], real usage: [909074512/866.9mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=0/0b, accounting=7876/7.6kb]","bytes_wanted":909074512,"bytes_limit":906992025,"durability":"PERMANENT"},"status":429}
        at org.nuxeo.elasticsearch.client.ESRestClient.performRequest(ESRestClient.java:214)
        at org.nuxeo.elasticsearch.client.ESRestClient.performRequestWithTracing(ESRestClient.java:220)
        at org.nuxeo.elasticsearch.client.ESRestClient.refresh(ESRestClient.java:129)
        at org.nuxeo.elasticsearch.core.ElasticSearchAdminImpl.refreshRepositoryIndex(ElasticSearchAdminImpl.java:210)
        at org.nuxeo.elasticsearch.core.ElasticSearchAdminImpl.refresh(ElasticSearchAdminImpl.java:273)
        at org.nuxeo.elasticsearch.ElasticSearchComponent.refresh(ElasticSearchComponent.java:370)
        at org.nuxeo.elasticsearch.test.RepositoryLightElasticSearchFeature.await(RepositoryLightElasticSearchFeature.java:76)
        at org.nuxeo.runtime.test.runner.TransactionalFeature.await(TransactionalFeature.java:124)
        at org.nuxeo.runtime.test.runner.TransactionalFeature.nextTransaction(TransactionalFeature.java:104)
        at org.nuxeo.ecm.core.test.CoreFeature.beforeRun(CoreFeature.java:191)
        at org.nuxeo.runtime.test.runner.FeaturesRunner.lambda$beforeRun$1(FeaturesRunner.java:189)
        at org.nuxeo.runtime.test.runner.FeaturesRunner.apply(FeaturesRunner.java:239)
        ... 25 more
    Caused by: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/nxutest/_refresh], status line [HTTP/1.1 429 Too Many Requests]
    {"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [909074512/866.9mb], which is larger than the limit of [906992025/864.9mb], real usage: [909074512/866.9mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=0/0b, accounting=7876/7.6kb]","bytes_wanted":909074512,"bytes_limit":906992025,"durability":"PERMANENT"}],"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [909074512/866.9mb], which is larger than the limit of [906992025/864.9mb], real usage: [909074512/866.9mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=0/0b, accounting=7876/7.6kb]","bytes_wanted":909074512,"bytes_limit":906992025,"durability":"PERMANENT"},"status":429}
        at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
        at org.nuxeo.elasticsearch.client.ESRestClient.performRequest(ESRestClient.java:212)
        ... 36 more
Observed on the pod during the tests:
CPU   MEM   CPU/R:L    MEM/R:L
2797  1690  2000:4000  2048:4096
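The parent circuit breaker rejects the request because real heap usage (866.9 MB) exceeds the breaker limit (864.9 MB). The breaker itself reports fielddata=0/0b and accounting=7.6 KB, so tracked fielddata accounts for almost none of that usage. The limit is tied to the JVM heap size.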
See https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
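The arithmetic behind the rejection can be checked from the numbers in the error above. This is only a minimal sketch: the two byte counts are copied from the trace, but the 95% parent-breaker ratio is an assumption (it is the usual default when real-memory accounting is enabled, and was not verified on this cluster), and the function names are illustrative.

```python
# Values copied from the circuit_breaking_exception in the trace above.
BYTES_WANTED = 909_074_512  # "bytes_wanted": heap usage the request would reach
BYTES_LIMIT = 906_992_025   # "bytes_limit": parent breaker limit
MIB = 1024 * 1024

# ASSUMPTION: the parent breaker limit is 95% of the JVM heap.
# Not confirmed for this cluster; adjust if the setting was overridden.
ASSUMED_PARENT_RATIO = 0.95


def overshoot_bytes(wanted: int, limit: int) -> int:
    """Bytes by which the request exceeds the breaker limit (0 if it fits)."""
    return max(0, wanted - limit)


def implied_heap_bytes(limit: int, ratio: float = ASSUMED_PARENT_RATIO) -> float:
    """Heap size that would produce this limit under the assumed ratio."""
    return limit / ratio


if __name__ == "__main__":
    over = overshoot_bytes(BYTES_WANTED, BYTES_LIMIT)
    heap = implied_heap_bytes(BYTES_LIMIT)
    print(f"over the limit by {over} bytes ({over / MIB:.2f} MiB)")
    print(f"implied heap under the assumed ratio: {heap / MIB:.1f} MiB")
```

If the assumption holds, the heap is under 1 GB and the breaker trips on an overshoot of about 2 MiB, which would be consistent with the failures being random.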
TODO:
- review the code: there may be more fields needlessly set as fielddata
- tune ES, Maven, or Surefire? (e.g. MAVEN_OPTS -Xmx)
- activate ES for the two previews; this is unrelated to the unit test issue but also worth improving
  see https://github.com/nuxeo/nuxeo-helm-chart/blob/master/nuxeo/values.yaml#L122 (current chart is 1.0.14)
- consider not using the embedded ES, even for the unit tests
  => for the functional tests, wait for the related Platform improvements in their next chart
- lower requested resources
- ...
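To support the code review and tuning items above, per-breaker usage can be read from the ES node stats API (GET /_nodes/stats/breaker). The sketch below is only a suggestion, not something that was run for this ticket: the host is the one seen in the trace, the helper names are hypothetical, and the JSON field names follow the documented node stats response.

```python
import json
from urllib.request import urlopen


def fetch_breaker_stats(base_url="http://localhost:9200"):
    """GET /_nodes/stats/breaker and return the parsed JSON body."""
    with urlopen(f"{base_url}/_nodes/stats/breaker", timeout=10) as resp:
        return json.load(resp)


def summarize_breakers(stats):
    """Return (node_id, breaker, used_bytes, limit_bytes, tripped) rows,
    sorted so the breakers closest to their limit come first."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        for name, breaker in node.get("breakers", {}).items():
            rows.append((
                node_id,
                name,
                breaker.get("estimated_size_in_bytes", 0),
                breaker.get("limit_size_in_bytes", 0),
                breaker.get("tripped", 0),
            ))

    def usage_ratio(row):
        used, limit = row[2], row[3]
        return used / limit if limit else 0.0

    return sorted(rows, key=usage_ratio, reverse=True)


if __name__ == "__main__":
    # Requires a reachable ES node; prints one line per breaker.
    for node_id, name, used, limit, tripped in summarize_breakers(fetch_breaker_stats()):
        print(f"{node_id} {name}: {used}/{limit} bytes, tripped {tripped} times")
```

A fielddata breaker that stays near zero while the parent breaker keeps tripping would point toward heap sizing rather than field mappings.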