Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-30239

Delay the start of Quartz scheduler to avoid database constraint violations

    Details

    • Type: Bug
    • Status: In Review
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.10
    • Fix Version/s: 10.10-HF44, 2021.x, 11.x
    • Component/s: Scheduler
    • Backlog priority:
      800
    • Sprint:
      nxsupport 14, nxsupport 15

      Description

      This problem has been found in a K8s environment with 6 worker nodes (actually 2 should be enough to reproduce) when the lag (response time) with the PostgreSQL database could provoke unique key constraint violation when Quartz scheduler is starting.

      The analysis of Nuxeo traces and PostgreSQL query logs showed that 2 concurrent threads are running queries during the initialization the Quartz data and trying to both acquire a TRIGGER_ACCESS lock on the table qrtz_LOCKS

      Caused by: org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
      	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.tranql.connector.jdbc.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:52) ~[tranql-connector-1.8.jar:1.8]
      	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:96) ~[quartz-2.3.0.jar:?]
      	... 59 more
      Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "qrtz_locks_pkey"
        Detail: Key (sched_name, lock_name)=(Quartz, TRIGGER_ACCESS) already exists.
      	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:120) ~[postgresql-42.2.5.jar:42.2.5]
      	at org.tranql.connector.jdbc.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:186) ~[tranql-connector-1.8.jar:1.8]
      	at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:108) ~[quartz-2.3.0.jar:?]
      	... 59 more 

      A simple solution is to delay the start of Quartz scheduler.

      Note that this issue is different from problems with concurrent startups where Nuxeo has to implement a cluster-wide lock. However it's possible that the cluster-wide lock will also be required .

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 1 hour, 40 minutes
                1h 40m

                  PagerDuty

                  Error rendering 'com.pagerduty.jira-server-plugin:PagerDuty'. Please contact your Jira administrators.