Nuxeo Platform / NXP-31893

Document move should scale and be asynchronous

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2021
    • Fix Version/s: 2023.0, 2021.40
    • Component/s: Core DBS
    • Release Notes Summary:
      Moving a large folder is now more scalable and asynchronous
    • Backlog priority:
      800
    • Sprint:
      nxplatform #89, nxplatform #90
    • Story Points:
      8

      Description

      When moving a folder, the ecm:ancestorIds field must be updated for all descendants. Today this is done atomically and synchronously, before materializing the new read ACL (updating ecm:racl on all descendants), which is done asynchronously for sub-folder descendants.
      Because read ACL computation depends on ancestors, the two cannot run concurrently.

      Also, it has been observed that the current implementation is limited by the number of documents being moved (around 800k). It loads all descendants, and the query filter that tries to ignore the document ids manipulated in the current transaction makes the MongoDB query larger than the 16MB limit:

      org.nuxeo.ecm.automation.OperationException Failed to invoke operation Document.Move
      BsonBinaryWriter.java#validateSize bson-4.7.2.jar org.bson.BsonMaximumSizeExceededException Document size of 18500467 is larger than maximum of 16793600.
      MongoDBConnection#stream:759 nuxeo-core-storage-mongodb-2021.39.2-PR-1214-BUILD-3.jar
      DBSSession#getVersionsIds
      DBSTransactionState#updateTreeReadAcls
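
To give a feel for why an id list of this size cannot fit in a single query, here is a rough estimate of the BSON size of an array of string ids, following the element layout in the BSON specification. It is only a sketch: the class and method names are made up for illustration, the ids are assumed to be 36-character UUID strings (the ticket does not say what the ids look like), and the result is not meant to reproduce the 18500467 figure in the trace, which depends on the actual ids and query shape.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: estimates the BSON size of an id list embedded in a query filter. */
public class IdFilterSizeEstimate {

    /** Maximum document size reported in the trace above. */
    public static final long MAX_DOCUMENT_SIZE = 16_793_600L;

    /**
     * Size in bytes of a BSON array of strings:
     * int32 length prefix + elements + 1 terminating byte, where each element is
     * 1 type byte + the decimal index as a NUL-terminated key
     * + int32 string length + UTF-8 bytes + 1 terminating byte.
     */
    public static long bsonStringArraySize(List<String> ids) {
        long size = 4 + 1; // array length prefix + array terminator
        for (int i = 0; i < ids.size(); i++) {
            int keyBytes = Integer.toString(i).length() + 1; // index key + NUL
            int valueBytes = 4 + ids.get(i).getBytes(StandardCharsets.UTF_8).length + 1;
            size += 1 + keyBytes + valueBytes;
        }
        return size;
    }

    public static void main(String[] args) {
        // Assumption: every id is a 36-character UUID string.
        String sampleId = "00000000-0000-0000-0000-000000000000";
        List<String> ids = Collections.nCopies(800_000, sampleId);

        long size = bsonStringArraySize(ids);
        System.out.println("estimated id array size: " + size + " bytes");
        System.out.println("exceeds limit: " + (size > MAX_DOCUMENT_SIZE));
    }
}
```

Under these assumptions each id costs a few dozen bytes, so the array alone passes the limit well before 800k entries, whatever the rest of the filter contains.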
      

      The materialized fields (ancestors + read ACL) could be merged into a single update, done asynchronously for sub-folders, fixing both the synchronous latency and the scaling limitation.
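
The merged update can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the Nuxeo implementation: the `Doc` class, the `children` index, the single-string read ACL and the `move` signature are all hypothetical. The moved document is refreshed immediately, then descendants are walked parent-first in fixed-size batches, each batch recomputing ancestors and read ACL in the same pass. In a real system each batch would be a unit of work handed to a background worker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of a merged ancestors + read ACL update. */
public class MergedTreeUpdateSketch {

    static final class Doc {
        final String id;
        final String localAcl; // null means "inherit from parent"
        String parentId;
        List<String> ancestorIds = new ArrayList<>(); // stand-in for ecm:ancestorIds
        String readAcl;                               // stand-in for ecm:racl

        Doc(String id, String parentId, String localAcl) {
            this.id = id;
            this.parentId = parentId;
            this.localAcl = localAcl;
        }
    }

    final Map<String, Doc> docs = new HashMap<>();
    final Map<String, List<String>> children = new HashMap<>();

    Doc add(String id, String parentId, String localAcl) {
        Doc doc = new Doc(id, parentId, localAcl);
        docs.put(id, doc);
        if (parentId != null) {
            children.computeIfAbsent(parentId, k -> new ArrayList<>()).add(id);
        }
        refresh(doc);
        return doc;
    }

    /** Recomputes both materialized fields of one document from its parent. */
    void refresh(Doc doc) {
        Doc parent = doc.parentId == null ? null : docs.get(doc.parentId);
        List<String> ancestors = new ArrayList<>();
        if (parent != null) {
            ancestors.addAll(parent.ancestorIds);
            ancestors.add(parent.id);
        }
        doc.ancestorIds = ancestors;
        doc.readAcl = doc.localAcl != null ? doc.localAcl
                : (parent == null ? null : parent.readAcl);
    }

    /** Moves a document, then refreshes its descendants batch by batch; returns the batch count. */
    int move(String id, String newParentId, int batchSize) {
        Doc doc = docs.get(id);
        if (doc.parentId != null) {
            children.get(doc.parentId).remove(id);
        }
        doc.parentId = newParentId;
        children.computeIfAbsent(newParentId, k -> new ArrayList<>()).add(id);
        refresh(doc); // the moved document itself is updated right away

        // FIFO queue: a parent is always refreshed before its children.
        Deque<String> pending = new ArrayDeque<>(children.getOrDefault(id, List.of()));
        int batches = 0;
        while (!pending.isEmpty()) {
            for (int i = 0; i < batchSize && !pending.isEmpty(); i++) {
                Doc child = docs.get(pending.poll());
                refresh(child); // ancestors and read ACL in the same pass
                pending.addAll(children.getOrDefault(child.id, List.of()));
            }
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        MergedTreeUpdateSketch tree = new MergedTreeUpdateSketch();
        tree.add("root", null, "everyone");
        tree.add("a", "root", "teamA");
        tree.add("b", "root", "teamB");
        tree.add("folder", "a", null);
        tree.add("child1", "folder", null);
        tree.add("child2", "folder", null);
        tree.add("grand", "child1", null);

        int batches = tree.move("folder", "b", 2);
        Doc grand = tree.docs.get("grand");
        System.out.println(batches + " batches, ancestors=" + grand.ancestorIds + ", racl=" + grand.readAcl);
    }
}
```

Because only one batch of ids is held at a time, nothing in this scheme needs to load or filter on the full descendant list, which is the property the proposal relies on.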

      The move operation should be part of the continuous integration benchmark.

      ----------

      A client reported slow processing when moving or copying folderish documents containing more than 100k objects. In a support team discussion, it was noted that ACLs are already updated by an asynchronous process; we expect it would be feasible to use a similar technique for updating document ancestors.

      https://github.com/nuxeo/nuxeo/blob/2021/modules/core/nuxeo-core-storage-dbs/src/main/java/org/nuxeo/ecm/core/storage/dbs/DBSTransactionState.java#L513
