[NXP-28563] Optimized chunked batch upload with S3 - Nuxeo Issue Tracker

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: QualifiedToSchedule
Component/s: Nuxeo Drive, S3

Team:
FG
Sprint:
nxFG 11.1.13, nxFG 11.3.1
Story Points:
5

Description

Today, when we do a chunked batch upload (typically from Drive) on a Nuxeo instance configured with S3 transient stores (which is the common case), there are several inefficiencies:

First, each chunk is uploaded from the client to Nuxeo, then saved to S3.
Then, in BatchFileEntry.getBlob, we download all chunks to the filesystem and concatenate them into a blob.
Finally, when this blob is saved (typically in a document), it is uploaded to S3 again.

This should be optimized to avoid doing the concatenation on the Nuxeo side, and instead use S3 Multipart Upload to do the concatenation entirely S3-side.
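
One constraint worth checking before concatenating S3-side: S3 Multipart Upload requires every part except the last to be at least 5 MiB, and allows at most 10,000 parts per upload. The following is a minimal sketch, assuming each stored chunk would become exactly one part; the class and method names are hypothetical and not part of Nuxeo.

```java
import java.util.List;

/** Hypothetical helper, not Nuxeo code: checks whether chunks can map 1:1 to S3 multipart parts. */
class MultipartPlan {

    // S3 Multipart Upload limits.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // 5 MiB, applies to every part except the last
    static final int MAX_PARTS = 10_000;

    /** Returns true if each chunk can be used directly as one part of a multipart upload. */
    public static boolean canConcatenateServerSide(List<Long> chunkSizes) {
        if (chunkSizes.isEmpty() || chunkSizes.size() > MAX_PARTS) {
            return false;
        }
        // Only the last part may be smaller than the minimum part size.
        for (int i = 0; i < chunkSizes.size() - 1; i++) {
            if (chunkSizes.get(i) < MIN_PART_SIZE) {
                return false;
            }
        }
        return true;
    }

    /** Number of chunks needed for a file of the given size with a fixed chunk size. */
    public static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize; // ceiling division
    }
}
```

If the check fails (for example because the client uses chunks smaller than 5 MiB), the existing Nuxeo-side concatenation would presumably have to remain as a fallback, or small chunks would need to be grouped before becoming a part.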

To do that we'd need to add a notion of partial upload to the S3BlobStore (leveraging the BlobWriteContext), then add the same notion to transient stores, and leverage that from the Batch infrastructure.
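
As a rough illustration of what a "partial upload" notion could look like, here is a sketch with an in-memory stand-in for the store. Everything in it is an assumption made for illustration: the `PartialUpload` interface, its method names, and the in-memory implementation do not exist in Nuxeo, and a real S3-backed version would delegate to S3 Multipart Upload instead of buffering bytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical shape of a partial upload; none of these names exist in Nuxeo. */
interface PartialUpload {
    /** Registers one chunk at the given position; chunks may arrive in any order. */
    void addPart(int index, byte[] data);

    /** Assembles all parts in index order and returns the resulting content. */
    byte[] complete();
}

/** In-memory stand-in used only to show the call sequence. */
class InMemoryPartialUpload implements PartialUpload {
    // TreeMap keeps parts sorted by index, so completion concatenates them in order.
    private final Map<Integer, byte[]> parts = new TreeMap<>();
    private boolean completed;

    @Override
    public void addPart(int index, byte[] data) {
        if (completed) {
            throw new IllegalStateException("upload already completed");
        }
        parts.put(index, data.clone());
    }

    @Override
    public byte[] complete() {
        completed = true;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts.values()) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}
```

The point of the design is that the Batch infrastructure would only call `addPart` per chunk and `complete` once, leaving the store free to perform the concatenation wherever it is cheapest.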

Issue Links

is required by

NXDRIVE-2383 error on complete for a 78GB file

Resolved

People

Assignee:

Florent Guillaume

Reporter:

Florent Guillaume

Participants:

Florent Guillaume

Votes:

1

Watchers:

3

Dates

Created:

2020-01-23 16:34

Updated:

2020-12-03 13:42