  1. Nuxeo Platform
  2. NXP-19443

Drive: optimize ScrollDescendants operation by avoiding DocumentModel loading

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Postponed
    • Component/s: Nuxeo Drive, Performance

      Description

      Analysis:
      JVM sampling during a call to GetChildren shows that a lot of time is spent loading DocumentModel from VCS.
      Analyzing the results of the GetChildren bench also shows this, see NXP-19209.

      The attached VisualVM snapshot shows that, for a call with 1,000 children:

      • Approximately 1400 ms come from DefaultFileSystemItemFactory#isFileSystemItem, more precisely from #hasBlob.
      • Approximately 4400 ms come from PageProvider#getCurrentPage.

      We don't want to maintain GetChildren because:

      • It won't scale on a big hierarchy
      • It relies on a PageProvider with a limited page size (1000 by default) with no pagination on the Drive side

      => That's why we added the ScrollDescendants API, see NXP-19482.
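
      To illustrate the page size limitation described above, here is a minimal, self-contained sketch. It does not use the Nuxeo API: ScrollSketch, Batch, singlePage and scrollAll are hypothetical names, and a plain integer offset stands in for the server-side scroll cursor a real scroll API would hold. It only contrasts a single capped page with a loop that keeps fetching batches until none are left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: these names are not part of Nuxeo Drive.
final class ScrollSketch {

    /** One batch of results plus the position to resume from. */
    static final class Batch {
        final List<String> ids;
        final int nextOffset;

        Batch(List<String> ids, int nextOffset) {
            this.ids = ids;
            this.nextOffset = nextOffset;
        }
    }

    /** Single capped page, no follow-up call: anything beyond pageSize is never seen. */
    static List<String> singlePage(List<String> all, int pageSize) {
        return new ArrayList<>(all.subList(0, Math.min(pageSize, all.size())));
    }

    /** Returns the next batch starting at offset (an assumption standing in for a scroll id). */
    static Batch scroll(List<String> all, int offset, int batchSize) {
        int end = Math.min(offset + batchSize, all.size());
        return new Batch(new ArrayList<>(all.subList(offset, end)), end);
    }

    /** Client-side loop: keep scrolling until an empty batch is returned. */
    static List<String> scrollAll(List<String> all, int batchSize) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        while (true) {
            Batch batch = scroll(all, offset, batchSize);
            if (batch.ids.isEmpty()) {
                break;
            }
            out.addAll(batch.ids);
            offset = batch.nextOffset;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            children.add("doc-" + i);
        }
        System.out.println(singlePage(children, 1000).size()); // 1000
        System.out.println(scrollAll(children, 1000).size());  // 2500
    }
}
```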

      Yet ScrollDescendants also relies on CoreSession#query, which loads the whole DocumentModel from VCS.
      => We need to see how we can use CoreSession#queryAndFetch in DocumentBackedFolderItem#scrollDescendants (VCS and ES implementations) to retrieve all the document properties, including the blob, needed for:

      • DefaultFileSystemItemFactory#isFileSystemItem
      • DefaultFileSystemItemFactory#adaptDocument
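
      As a rough sketch of the idea, the check could run against a map of "query and fetched" properties rather than a loaded DocumentModel. Everything below is an assumption for illustration: QueryAndFetchSketch, the "folderish" flag, the selected column names and the simplified rules are hypothetical, and the actual checks in DefaultFileSystemItemFactory#isFileSystemItem cover more cases.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: not the actual Nuxeo Drive implementation.
final class QueryAndFetchSketch {

    // Assumed column names for one projected result row.
    static final String IS_VERSION = "ecm:isVersion";
    static final String LIFECYCLE_STATE = "ecm:currentLifeCycleState";
    static final String BLOB_NAME = "file:content/name";
    static final String FOLDERISH = "folderish"; // hypothetical flag, assumed to be filled by the caller

    /** Blob presence decided from the fetched blob name, without loading the document. */
    static boolean hasBlob(Map<String, Serializable> row) {
        Serializable name = row.get(BLOB_NAME);
        return name instanceof String && !((String) name).isEmpty();
    }

    /** Simplified rule: not a version, not deleted, and either folderish or holding a blob. */
    static boolean isFileSystemItem(Map<String, Serializable> row) {
        if (Boolean.TRUE.equals(row.get(IS_VERSION))) {
            return false;
        }
        if ("deleted".equals(row.get(LIFECYCLE_STATE))) {
            return false;
        }
        return Boolean.TRUE.equals(row.get(FOLDERISH)) || hasBlob(row);
    }

    /** Builds a sample row the way a projected query result might look. */
    static Map<String, Serializable> row(boolean folderish, String blobName, String state) {
        Map<String, Serializable> row = new HashMap<>();
        row.put(IS_VERSION, Boolean.FALSE);
        row.put(LIFECYCLE_STATE, state);
        row.put(FOLDERISH, folderish);
        row.put(BLOB_NAME, blobName);
        return row;
    }

    public static void main(String[] args) {
        System.out.println(isFileSystemItem(row(false, "report.pdf", "project"))); // true
        System.out.println(isFileSystemItem(row(false, null, "project")));         // false
        System.out.println(isFileSystemItem(row(true, null, "project")));          // true
        System.out.println(isFileSystemItem(row(false, "report.pdf", "deleted"))); // false
    }
}
```

      Lock info, lifecycle transitions and permissions are deliberately left out of this sketch, since they may still require going through the CoreSession.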

      Possible issues:

      • Fetching the blob.
      • Still needing to go through the CoreSession to get lock info, lifecycle state, permissions.

      Notes:

      • If ever we really needed to apply this optimization to GetChildren we would need to use a CoreQueryAndFetchPageProvider.
      • In the case of ESSyncRootFolderItem we could consider fetching the docs from the ES source directly, but we would need to check / handle NXP-16396. In this case we wouldn't need to refactor the FileSystemItemFactory API to rely on a Map of "query and fetched" properties instead of DocumentModel.
      • We will want to measure the difference between the current and the new queryAndFetch implementation with:
        • the Gatling bench
        • profiling
