Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-30239

Delay the start of Quartz scheduler to avoid database constraint violations

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.10
    • Fix Version/s: 10.10-HF44, 11.x, 2021.19
    • Component/s: Scheduler
    • Release Notes Summary:
      Delay Quartz scheduler start for 5s by default.
    • Backlog priority:
      800
    • Sprint:
      nxsupport 14, nxplatform #58, nxplatform #59
    • Story Points:
      1

      Description

      This problem has been found in a K8s environment with 6 worker nodes (actually 2 should be enough to reproduce) when the lag (response time) with the PostgreSQL database could provoke unique key constraint violation when Quartz scheduler is starting.

      The analysis of Nuxeo traces and PostgreSQL query logs showed that 2 concurrent threads are running queries during the initialization the Quartz data and trying to both acquire a TRIGGER_ACCESS lock on the table qrtz_LOCKS

      Caused by: org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
      	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.tranql.connector.jdbc.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:52) ~[tranql-connector-1.8.jar:1.8]
      	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:96) ~[quartz-2.3.0.jar:?]
      	... 59 more
      Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "qrtz_locks_pkey"
        Detail: Key (sched_name, lock_name)=(Quartz, TRIGGER_ACCESS) already exists.
      	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:120) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.tranql.connector.jdbc.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:186) ~[tranql-connector-1.8.jar:1.8]
      	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:108) ~[quartz-2.3.0.jar:?]
      	... 59 more 

      A simple solution is to delay the start of Quartz scheduler.

      Note that this issue is different from problems with concurrent startups where Nuxeo has to implement a cluster-wide lock. However it's possible that the cluster-wide lock will also be required .

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - 0 minutes
                  0m
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 1 hour, 40 minutes
                  1h 40m

                    PagerDuty

                    Error rendering 'com.pagerduty.jira-server-plugin:PagerDuty'. Please contact your Jira administrators.