- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: Postponed
- Component/s: Nuxeo Drive, Performance
- Epic Link:
- Story Points: 13
Analysis:
JVM sampling during a call to GetChildren shows that a lot of time is spent loading DocumentModel from VCS.
Analyzing the results of the GetChildren bench also shows this, see NXP-19209.
The attached VisualVM snapshot shows that, for a call with 1,000 children:
- Approximately 1400 ms come from DefaultFileSystemItemFactory#isFileSystemItem, more precisely from #hasBlob.
- Approximately 4400 ms come from PageProvider#getCurrentPage.
We don't want to maintain GetChildren because:
- It won't scale on a big hierarchy
- It relies on a PageProvider with a limited page size (1000 by default) with no pagination on the Drive side
=> That's why we added the ScrollDescendants API, see NXP-19482.
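To illustrate the limitation described above, here is a minimal sketch contrasting a single capped page with a scroll loop. It does not use the Nuxeo API: the class, the `Batch` type and the offset-based cursor are hypothetical stand-ins for a page provider call and for a scroll call with a scroll ID, and the in-memory list stands in for the repository.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: hypothetical types, not the Nuxeo Drive API. */
public class ScrollVersusSinglePage {

    /** One batch of results plus the cursor to pass to the next call (hypothetical). */
    public static final class Batch {
        public final List<String> ids;
        public final int nextOffset;

        Batch(List<String> ids, int nextOffset) {
            this.ids = ids;
            this.nextOffset = nextOffset;
        }
    }

    /** A single page capped at pageSize: anything beyond the cap is never returned. */
    public static List<String> singlePage(List<String> all, int pageSize) {
        return new ArrayList<>(all.subList(0, Math.min(pageSize, all.size())));
    }

    /** Stand-in for one scroll call: returns at most batchSize ids starting at the cursor. */
    public static Batch scroll(List<String> all, int offset, int batchSize) {
        int end = Math.min(offset + batchSize, all.size());
        return new Batch(new ArrayList<>(all.subList(offset, end)), end);
    }

    /** Client-side loop: keep scrolling until an empty batch signals the end. */
    public static List<String> scrollAll(List<String> all, int batchSize) {
        List<String> seen = new ArrayList<>();
        int offset = 0;
        while (true) {
            Batch batch = scroll(all, offset, batchSize);
            if (batch.ids.isEmpty()) {
                break;
            }
            seen.addAll(batch.ids);
            offset = batch.nextOffset;
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            children.add("doc-" + i);
        }
        System.out.println("single page: " + singlePage(children, 1000).size());
        System.out.println("scroll loop: " + scrollAll(children, 200).size());
    }
}
```

With more children than the page size, the single page silently stops at the cap, while the scroll loop walks the whole set in bounded batches.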
Yet ScrollDescendants also relies on a CoreSession#query loading the whole DocumentModel from VCS
=> We need to see how we can use CoreSession#queryAndFetch in DocumentBackedFolderItem#scrollDescendants (VCS and ES implementations) to retrieve all the document properties, including the blob, needed for:
- DefaultFileSystemItemFactory#isFileSystemItem
- DefaultFileSystemItemFactory#adaptDocument
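As a rough sketch of the idea, the two factory checks could work on a Map of projected properties (the kind of row a CoreSession#queryAndFetch call yields) instead of a fully loaded DocumentModel. Everything below is an assumption for illustration: the class, the `Item` type, the `folderish` key and the chosen property list are hypothetical, the eligibility rule is deliberately simplified, and the rows are built in memory rather than fetched from a repository.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: hypothetical types working on projected rows, not on DocumentModel. */
public class ProjectedRowAdapter {

    /** Hypothetical lightweight item built from one projected row. */
    public static final class Item {
        public final String id;
        public final String name;
        public final boolean folder;

        Item(String id, String name, boolean folder) {
            this.id = id;
            this.name = name;
            this.folder = folder;
        }
    }

    /** Blob presence is read from the projected digest, without loading the blob. */
    public static boolean hasBlob(Map<String, Serializable> row) {
        return row.get("file:content/digest") != null;
    }

    /** "folderish" is an assumed pre-computed flag, not a real query column. */
    public static boolean isFolderish(Map<String, Serializable> row) {
        return Boolean.TRUE.equals(row.get("folderish"));
    }

    /** Simplified rule: a row is adaptable if it is a folder or carries a blob. */
    public static boolean isFileSystemItem(Map<String, Serializable> row) {
        return isFolderish(row) || hasBlob(row);
    }

    /** Folders are named after their title, files after their blob file name. */
    public static Item adapt(Map<String, Serializable> row) {
        boolean folder = isFolderish(row);
        Serializable name = folder ? row.get("dc:title") : row.get("file:content/name");
        return new Item((String) row.get("ecm:uuid"), String.valueOf(name), folder);
    }

    /** Adapts every eligible row and skips the others. */
    public static List<Item> adaptAll(Iterable<Map<String, Serializable>> rows) {
        List<Item> items = new ArrayList<>();
        for (Map<String, Serializable> row : rows) {
            if (isFileSystemItem(row)) {
                items.add(adapt(row));
            }
        }
        return items;
    }

    /** Builds an in-memory row standing in for one projected query result. */
    public static Map<String, Serializable> row(String id, String title, String blobName,
            String digest, boolean folderish) {
        Map<String, Serializable> row = new HashMap<>();
        row.put("ecm:uuid", id);
        row.put("dc:title", title);
        row.put("file:content/name", blobName);
        row.put("file:content/digest", digest);
        row.put("folderish", Boolean.valueOf(folderish));
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, Serializable>> rows = new ArrayList<>();
        rows.add(row("1", "Reports", null, null, true));
        rows.add(row("2", "Budget", "budget.xlsx", "abc123", false));
        rows.add(row("3", "Note without blob", null, null, false));
        for (Item item : adaptAll(rows)) {
            System.out.println(item.id + " " + item.name + " folder=" + item.folder);
        }
    }
}
```

The sketch leaves out the points listed under "Possible issues": it only reads blob metadata, and lock info, lifecycle state and permissions would still have to come from the CoreSession.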
Possible issues:
- Fetching the blob.
- Still needing to go through the CoreSession to get lock info, lifecycle state, permissions.
Notes:
- If we ever really need to apply this optimization to GetChildren, we would have to use a CoreQueryAndFetchPageProvider.
- In the case of ESSyncRootFolderItem we could consider fetching the docs from the ES source directly, but we need to check / handle NXP-16396. In this case we wouldn't need to refactor the FileSystemItemFactory API to rely on a Map of "query and fetched" properties instead of DocumentModel.
- We will want to measure the difference between the current and the new queryAndFetch implementation with:
- the Gatling bench
- profiling
- depends on
-
NXP-16396 Loading from Elasticsearch should allow to easily scale out
- Resolved
-
NXP-19388 Setup a bench for the Nuxeo Drive optimized remote scan
- Resolved
-
NXP-19482 Drive: Optimize remote scan execution by using a scroll API
- Resolved
-
NXP-19586 Drive: Implement Elasticsearch based batched remote scan
- Resolved
-
NXP-19441 Drive: remove costly and unnecessary calls to hasPermission in FileSystemItem adaptation
- Resolved
-
NXP-19442 Drive: remove costly and unnecessary call to getLockInfo in FileSystemItem adaptation when calling GetChildren / ScrollDescendants
- Resolved
-
NXP-19209 Setup a bench for Nuxeo Drive remote scan
- Resolved
- is related to
-
NXP-23719 Reduce the number of requests sent to the audit
- Open
-
NXP-12632 Drive: optimize audit log query
- Open
-
NXDRIVE-632 Stability improvement
- Resolved
-
NXP-24232 Improve getUpperBound query when storing audit in Elasticsearch
- Resolved