Affects Version/s: 10.10
Component/s: Core DBS
When updating an ACL on a folderish document (with more than 500 children) in a Nuxeo cluster,
we may propagate an incorrect read ACL to the children, depending on the state of the folderish document in the worker nodes' caches.
I've been able to identify a case where a work starts before all invalidations have been processed.
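The race can be sketched as follows. This is purely illustrative Java with made-up names, not Nuxeo code: a shared store stands in for MongoDB, a local map for a node's document cache, and a queue for the invalidation stream. If the work reads the folder before the invalidation is consumed, it sees the stale ACL.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class StaleAclRace {
    // shared backing store (stands in for MongoDB)
    static final Map<String, String> store = new ConcurrentHashMap<>();
    // worker node's local cache and its pending invalidations (stands in for Kafka)
    static final Map<String, String> cacheB = new ConcurrentHashMap<>();
    static final Queue<String> invalidationsB = new ConcurrentLinkedQueue<>();

    // the worker serves from its cache if present, else loads from the store
    static String readOnWorker(String id) {
        return cacheB.computeIfAbsent(id, store::get);
    }

    // returns {ACL seen by the work, ACL after invalidations are applied}
    static String[] runScenario() {
        store.put("folder", "acl:[admin]");
        readOnWorker("folder"); // worker has the folder cached

        // the portal node updates the ACL and emits an invalidation
        store.put("folder", "acl:[admin,thierry]");
        invalidationsB.add("folder");

        // the work starts on the worker BEFORE the invalidation is consumed
        String aclSeenByWork = readOnWorker("folder"); // stale

        // the invalidation is processed afterwards, too late for the work
        String id;
        while ((id = invalidationsB.poll()) != null) {
            cacheB.remove(id);
        }
        String aclAfterInvalidation = readOnWorker("folder"); // fresh
        return new String[] { aclSeenByWork, aclAfterInvalidation };
    }

    public static void main(String[] args) {
        String[] r = runScenario();
        System.out.println("work saw: " + r[0]);
        System.out.println("after invalidation: " + r[1]);
    }
}
```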
Setup: 2 Nuxeo nodes (1 portal and 1 worker) with MongoDB and Kafka where nuxeo.work.queue.common.enabled is set to true only on the worker node
Steps to observe the problem:
- on the portal node, update the permissions on a container which has more than 500 children, ideally with a tree structure. In the example below, I grant the READ permission to "thierry"
- on the worker node, the read ACLs are propagated to all children by FindReadAclsWork/UpdateReadAclsWork
- the following sequence is observed in the logs: one of the first documents updated with the propagated read ACLs does not contain "thierry", whereas the next document has the expected read ACL value
These logs correspond to messages I've added to track the problem.
The full name of the thread is "defaultPool-01,in:7,inCheckpoint:7,out:0,lastRead:1646859595463,lastTimer:0,wm:215857180896133121,loop:24196,checkpoint:238056104979578.1268777355"
Is it possible to guarantee that all invalidations are processed (or at least those covering the documents involved in the work) before the works are executed?
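One way such a guarantee could be approached, purely as a sketch with hypothetical names (this is not an existing Nuxeo API): record an invalidation-log offset when the work is scheduled, and have the worker block until its local invalidation consumer has caught up to that offset before the work reads any document.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical barrier between the invalidation consumer and the work pool.
public class InvalidationBarrier {
    private final AtomicLong processedOffset = new AtomicLong(0);

    // called by the invalidation consumer after applying a batch
    public synchronized void markProcessed(long offset) {
        processedOffset.accumulateAndGet(offset, Math::max);
        notifyAll();
    }

    // called by a work before it starts; scheduledOffset was captured
    // on the node that scheduled the work
    public synchronized void awaitCaughtUp(long scheduledOffset, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (processedOffset.get() < scheduledOffset) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                throw new IllegalStateException("invalidations not caught up");
            }
            wait(remaining);
        }
    }
}
```

A weaker variant would only wait for invalidations covering the document ids the work touches, which matches the "or at least the ones including the documents involved" part of the question.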