Nuxeo Platform / NXP-23016

Fix infinite cross-instance cache invalidations in cluster mode

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 9.3-SNAPSHOT
    • Fix Version/s: 9.3
    • Component/s: Cache, Redis

      Description

      In cluster mode, a node that receives a cache invalidation re-sends that invalidation to the other nodes, creating an infinite loop:

      2017-09-04 11:34:03,805 ERROR [Nuxeo-PubSub-Redis] [org.nuxeo.ecm.core.pubsub.AbstractPubSubProvider] Exception in subscriber for topic: cacheinval
      redis.clients.jedis.exceptions.JedisDataException: ERR only (P)SUBSCRIBE / (P)UNSUBSCRIBE / QUIT allowed in this context
      	at redis.clients.jedis.Protocol.processError(Protocol.java:127)
      	at redis.clients.jedis.Protocol.process(Protocol.java:161)
      	at redis.clients.jedis.Protocol.read(Protocol.java:215)
      	at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
      	at redis.clients.jedis.Connection.getIntegerReply(Connection.java:265)
      	at redis.clients.jedis.BinaryJedis.publish(BinaryJedis.java:3064)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider.lambda$publish$1(RedisPubSubProvider.java:255)
      	at org.nuxeo.ecm.core.redis.RedisPoolExecutor.execute(RedisPoolExecutor.java:49)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor$1.retry(RedisFailoverExecutor.java:62)
      	at org.nuxeo.ecm.core.redis.retry.Retry.retry(Retry.java:61)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor.executeWithRetryPolicy(RedisFailoverExecutor.java:57)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor.execute(RedisFailoverExecutor.java:43)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider.publish(RedisPubSubProvider.java:255)
      	at org.nuxeo.ecm.core.pubsub.PubSubServiceImpl.publish(PubSubServiceImpl.java:137)
      	at org.nuxeo.ecm.core.pubsub.AbstractPubSubBroker.sendMessage(AbstractPubSubBroker.java:113)
      	at org.nuxeo.ecm.core.cache.CacheServiceImpl$CachePubSubInvalidator.sendInvalidationsAll(CacheServiceImpl.java:190)
      	at org.nuxeo.ecm.core.cache.CacheInvalidator.invalidateAll(CacheInvalidator.java:54)
      	at org.nuxeo.ecm.core.cache.CacheServiceImpl$CachePubSubInvalidator.receivedMessage(CacheServiceImpl.java:199)
      	at org.nuxeo.ecm.core.cache.CacheServiceImpl$CachePubSubInvalidator.receivedMessage(CacheServiceImpl.java:176)
      	at org.nuxeo.ecm.core.pubsub.AbstractPubSubBroker.subscriber(AbstractPubSubBroker.java:140)
      	at org.nuxeo.ecm.core.pubsub.AbstractPubSubProvider.localPublish(AbstractPubSubProvider.java:63)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider$Dispatcher.onMessage(RedisPubSubProvider.java:170)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider$Dispatcher.onPMessage(RedisPubSubProvider.java:174)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider$Dispatcher.processBinary(RedisPubSubProvider.java:222)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider$Dispatcher.proceedWithPatterns(RedisPubSubProvider.java:188)
      	at redis.clients.jedis.Jedis.psubscribe(Jedis.java:2697)
      	at org.nuxeo.ecm.core.redis.RedisExecutor.lambda$psubscribe$1(RedisExecutor.java:87)
      	at org.nuxeo.ecm.core.redis.RedisPoolExecutor.execute(RedisPoolExecutor.java:63)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor$1.retry(RedisFailoverExecutor.java:62)
      	at org.nuxeo.ecm.core.redis.retry.Retry.retry(Retry.java:61)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor.executeWithRetryPolicy(RedisFailoverExecutor.java:57)
      	at org.nuxeo.ecm.core.redis.RedisFailoverExecutor.execute(RedisFailoverExecutor.java:43)
      	at org.nuxeo.ecm.core.redis.RedisExecutor.psubscribe(RedisExecutor.java:86)
      	at org.nuxeo.ecm.core.redis.contribs.RedisPubSubProvider$Dispatcher.run(RedisPubSubProvider.java:142)
      

      The error seen here is a consequence of that loop: while handling a received message (the receivedMessage callback invoked from the Jedis pub/sub mechanism), we re-use the same connection, in the same thread, to send the next (and incorrect) publish, but this connection is in subscribe mode and is reserved for receiving pub/sub messages.
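
      The intended behaviour can be illustrated with a minimal, self-contained sketch. It is not the actual Nuxeo fix: InvalidationBus and ClusterNode are hypothetical in-memory stand-ins for the "cacheinval" topic and for a cluster node. The point is only that an invalidation received from another node is applied locally and never published again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical synchronous stand-in for the shared invalidation topic. */
final class InvalidationBus {
    private final List<ClusterNode> nodes = new ArrayList<>();
    int published = 0; // number of messages sent over the bus

    void register(ClusterNode node) {
        nodes.add(node);
    }

    /** Delivers the invalidation to every node except the sender. */
    void publish(ClusterNode sender, String key) {
        published++;
        for (ClusterNode node : nodes) {
            if (node != sender) {
                node.onRemoteInvalidation(key);
            }
        }
    }
}

/** Hypothetical cluster node holding a local cache. */
final class ClusterNode {
    final Map<String, String> cache = new HashMap<>();
    private final InvalidationBus bus;

    ClusterNode(InvalidationBus bus) {
        this.bus = bus;
    }

    /** Locally initiated invalidation: drop the entry, then notify the other nodes. */
    void invalidate(String key) {
        cache.remove(key);
        bus.publish(this, key);
    }

    /** Invalidation received from another node: apply it locally, do NOT publish again. */
    void onRemoteInvalidation(String key) {
        cache.remove(key);
    }
}

final class InvalidationDemo {
    public static void main(String[] args) {
        InvalidationBus bus = new InvalidationBus();
        ClusterNode a = new ClusterNode(bus);
        ClusterNode b = new ClusterNode(bus);
        bus.register(a);
        bus.register(b);
        a.cache.put("k", "v");
        b.cache.put("k", "v");

        a.invalidate("k");

        System.out.println("messages published: " + bus.published);
        System.out.println("b still has key: " + b.cache.containsKey("k"));
    }
}
```

      If onRemoteInvalidation called invalidate instead, each delivery would trigger another publish and the nodes would keep invalidating each other, which is the loop described above.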

    Time Tracking

    • Estimated: Not Specified
    • Remaining: 0m
    • Logged: 3h