[NXP-25399] Fix random AuditLogWriter error in logs during functional tests - Nuxeo Issue Tracker

XML

Word

Printable

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 10.2-SNAPSHOT
Fix Version/s: 10.3
Component/s: Audit, Elasticsearch

Tags:
- nxcore
- qaduty

Description

The following stack trace can appear in server logs:

2018-07-12 16:59:23,007 INFO  [http-nio-0.0.0.0-8080-exec-18] [org.nuxeo.runtime.reload.ReloadComponent] Before stop/standby component manager
2018-07-12 16:59:24,196 WARN  [http-nio-0.0.0.0-8080-exec-18] [org.nuxeo.lib.stream.computation.log.ComputationPool] AuditLogWriter: Timeout on wait for pool termination
2018-07-12 16:59:24,197 ERROR [AuditLogWriterPool-00,in:17,inCheckpoint:15,out:0,lastRead:1531414762251,lastTimer:1531414762027,wm:200725595649605633,loop:29,record] [org.nuxeo.lib.stream.computation.log.ComputationRunner] AuditLogWriter: Exception in processLoop: thread waiting for the response was interrupted
java.lang.RuntimeException: thread waiting for the response was interrupted
	at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:687)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:198)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:522)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508)
	at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:348)
	at org.nuxeo.elasticsearch.client.ESRestClient.index(ESRestClient.java:379)
	at org.nuxeo.elasticsearch.seqgen.ESUIDSequencer.getNextLong(ESUIDSequencer.java:93)
	at org.nuxeo.elasticsearch.audit.ESAuditBackend.addLogEntries(ESAuditBackend.java:483)
	at org.nuxeo.ecm.platform.audit.impl.StreamAuditWriter$AuditLogWriterComputation.writeEntriesToAudit(StreamAuditWriter.java:127)
	at org.nuxeo.ecm.platform.audit.impl.StreamAuditWriter$AuditLogWriterComputation.processTimer(StreamAuditWriter.java:96)
	at org.nuxeo.lib.stream.computation.log.ComputationRunner.lambda$processTimer$2(ComputationRunner.java:246)
	at java.util.LinkedHashMap.forEach(LinkedHashMap.java:676)
	at org.nuxeo.lib.stream.computation.log.ComputationRunner.processTimer(ComputationRunner.java:244)
	at org.nuxeo.lib.stream.computation.log.ComputationRunner.processLoop(ComputationRunner.java:193)
	at org.nuxeo.lib.stream.computation.log.ComputationRunner.run(ComputationRunner.java:152)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
	at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:683)
	... 20 more

This is extracted from hot-reload server tests server.log

From Benoit Delbosc:

The situation here is that on hot reload the stream service is restarted, during stop we wait for computation to terminate their current processing, in this case we wait 1s for the AuditLogWriter computation but it is not enough and there is a timeout WARN the stop procedure continue and interrupt the computation raising an ERROR.
So there is no bug here, the behavior is correct. This happens probably because elasticsearch is slow and impacted by ~~NXP-25355~~
This could be mitigated by remove error in logs by:

wait few seconds after actions and doing hot reload

log all interrupted exception as WARN

Attachments

Options
- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

Attachments

server.log
161 kB
2018-07-13 06:16

Issue Links

duplicates

NXP-25369 Fix hotreload ftest that generates error log

Resolved

is related to

NXP-25355 AuditLogWriter should fetch entry ids by block

Resolved

Activity

People

Assignee:

Funsho David

Reporter:

Anahide Tchertchian

Participants:

Anahide Tchertchian, Funsho David

Votes:

0 Vote for this issue

Watchers:

1 Start watching this issue

Dates

Created:

2018-07-13 06:15

Updated:

2018-07-13 07:51

Resolved:

2018-07-13 07:50