-
Type: Bug
-
Status: Resolved
-
Priority: Major
-
Resolution: Duplicate
-
Affects Version/s: 10.2-SNAPSHOT
-
Fix Version/s: 10.3
-
Component/s: Audit, Elasticsearch
The following stack trace can appear in server logs:
2018-07-12 16:59:23,007 INFO [http-nio-0.0.0.0-8080-exec-18] [org.nuxeo.runtime.reload.ReloadComponent] Before stop/standby component manager 2018-07-12 16:59:24,196 WARN [http-nio-0.0.0.0-8080-exec-18] [org.nuxeo.lib.stream.computation.log.ComputationPool] AuditLogWriter: Timeout on wait for pool termination 2018-07-12 16:59:24,197 ERROR [AuditLogWriterPool-00,in:17,inCheckpoint:15,out:0,lastRead:1531414762251,lastTimer:1531414762027,wm:200725595649605633,loop:29,record] [org.nuxeo.lib.stream.computation.log.ComputationRunner] AuditLogWriter: Exception in processLoop: thread waiting for the response was interrupted java.lang.RuntimeException: thread waiting for the response was interrupted at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:687) at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235) at org.elasticsearch.client.RestClient.performRequest(RestClient.java:198) at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:522) at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508) at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:348) at org.nuxeo.elasticsearch.client.ESRestClient.index(ESRestClient.java:379) at org.nuxeo.elasticsearch.seqgen.ESUIDSequencer.getNextLong(ESUIDSequencer.java:93) at org.nuxeo.elasticsearch.audit.ESAuditBackend.addLogEntries(ESAuditBackend.java:483) at org.nuxeo.ecm.platform.audit.impl.StreamAuditWriter$AuditLogWriterComputation.writeEntriesToAudit(StreamAuditWriter.java:127) at org.nuxeo.ecm.platform.audit.impl.StreamAuditWriter$AuditLogWriterComputation.processTimer(StreamAuditWriter.java:96) at org.nuxeo.lib.stream.computation.log.ComputationRunner.lambda$processTimer$2(ComputationRunner.java:246) at java.util.LinkedHashMap.forEach(LinkedHashMap.java:676) at org.nuxeo.lib.stream.computation.log.ComputationRunner.processTimer(ComputationRunner.java:244) at org.nuxeo.lib.stream.computation.log.ComputationRunner.processLoop(ComputationRunner.java:193) at org.nuxeo.lib.stream.computation.log.ComputationRunner.run(ComputationRunner.java:152) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:683) ... 20 more
This is extracted from hot-reload server tests server.log
From Benoit Delbosc:
The situation here is that on hot reload the stream service is restarted, during stop we wait for computation to terminate their current processing, in this case we wait 1s for the AuditLogWriter computation but it is not enough and there is a timeout WARN the stop procedure continue and interrupt the computation raising an ERROR.
So there is no bug here, the behavior is correct. This happens probably because elasticsearch is slow and impacted byNXP-25355
This could be mitigated by remove error in logs by:
- wait few seconds after actions and doing hot reload
- log all interrupted exception as WARN