[NXP-16952] Allow to directly upload into the backend storage - Nuxeo Issue Tracker

Details

Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: QualifiedToSchedule
Component/s: File Upload , Nuxeo Drive

Tags:
- toCheck

Description

When uploading a large file, the upload from the client side takes a long time because of the client's bandwidth limitations.
Once uploaded on the server side, the file is pushed to the BlobManager: this can take time too.

The whole process would be more efficient if:

the BlobManager provided an API to upload chunks
the BatchManager used this API directly when available

Some clarifications:

How it works today

The BatchManager stores the stream in temporary files.

These temporary files are used to create Blobs that are associated with a Document and are finally moved to the BinaryManager.

This final step involves a copy of the file to write it inside the BinaryManager: storeAndDigest reads the stream, computes the digest, and writes the stream to a file.

With the S3 BinaryManager, this read/digest/write occurs on a temporary store, and the file is then copied to S3.
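
As an illustration of the read / digest / write step described above, here is a minimal sketch in plain Java. It is not the Nuxeo implementation: the class name `StoreAndDigestSketch`, the method signature, the `algorithm` parameter and the temp-file naming are all assumptions made for the example. It only shows why the content is copied once more: the file cannot be given its final, digest-based name until the whole stream has been read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the actual Nuxeo BinaryManager code.
class StoreAndDigestSketch {

    /**
     * Reads the stream once, updating the digest while copying the bytes to a
     * temporary file, then renames that file to its hex digest.
     */
    static String storeAndDigest(InputStream in, Path storageDir, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        Path tmp = Files.createTempFile(storageDir, "upload_", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n); // digest
                out.write(buffer, 0, n); // write
            }
        }
        // The final name is only known here, after the last byte was read.
        String digest = toHex(md.digest());
        Files.move(tmp, storageDir.resolve(digest), StandardCopyOption.REPLACE_EXISTING);
        return digest;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

With a remote backend such as S3, the rename at the end would presumably become a full network copy, which is the slow step this issue is about.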

What we may want to improve

Client-side upload is expected to be limited by client-side connectivity: S3/NAS access is likely to be faster than the HTTP channel the client uses to upload the file.

We could leverage this by having the BatchManager write the content directly into the BinaryManager:

no more file duplication
no more slow write to S3 at the end of the transaction

Induced problems

Doing so would create several issues:

The BinaryManager would contain temporary streams
- this is not really an issue: we have GC for that
Chunking and upload resume are a problem
- the BinaryManager resolves streams by their digest: you cannot find a stream without its digest
- you do not have the digest until the upload is finished
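
To make the digest problem concrete, here is a small sketch of a digest-keyed store. The class `DigestKeyedStoreSketch` and its methods are hypothetical and only mimic the lookup-by-digest behaviour described above; SHA-256 is an arbitrary choice for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store where the digest of the content is the only key.
class DigestKeyedStoreSketch {

    private final Map<String, byte[]> byDigest = new HashMap<>();

    /** Hex digest of the given bytes (SHA-256 chosen arbitrarily for the sketch). */
    static String digestOf(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stores the content under its own digest and returns that key. */
    String put(byte[] content) {
        String key = digestOf(content);
        byDigest.put(key, content.clone());
        return key;
    }

    /** True if some content was stored under this digest. */
    boolean contains(String digest) {
        return byDigest.containsKey(digest);
    }
}
```

If a client stores only the first chunks of a file with `put`, the returned key is the digest of the partial content. Looking up `digestOf(fullContent)` then finds nothing, so a resumed upload has no stable key to attach to until the last chunk has arrived.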

There are basically 2 approaches to solve this:

Change the BinaryManager API so that it can manage chunks and temporary files
Rely on client-side logic
- make the client upload directly to the backend by its own means (e.g. using the S3 API)
- allow the Batch API to simply reference an existing Blob
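
A rough sketch of what the first approach could look like: a store that accepts chunks under a temporary upload id and only assigns the digest key once the upload is completed. This is an in-memory illustration under assumed names (`ChunkedBlobStoreSketch`, `begin`, `append`, `complete`); none of them exist in the current BinaryManager API.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical chunk-aware store: temporary ids while uploading, digest keys once complete.
class ChunkedBlobStoreSketch {

    private final Map<String, ByteArrayOutputStream> pending = new HashMap<>();
    private final Map<String, byte[]> committed = new HashMap<>();

    /** Starts an upload and returns a temporary id usable before any digest exists. */
    String begin() {
        String uploadId = UUID.randomUUID().toString();
        pending.put(uploadId, new ByteArrayOutputStream());
        return uploadId;
    }

    /** Appends a chunk and returns the bytes received so far (the resume offset). */
    long append(String uploadId, byte[] chunk) {
        ByteArrayOutputStream buffer = pending.get(uploadId);
        if (buffer == null) {
            throw new IllegalArgumentException("Unknown upload: " + uploadId);
        }
        buffer.write(chunk, 0, chunk.length);
        return buffer.size();
    }

    /** Finishes the upload: computes the digest and re-keys the content under it. */
    String complete(String uploadId) {
        ByteArrayOutputStream buffer = pending.remove(uploadId);
        if (buffer == null) {
            throw new IllegalArgumentException("Unknown upload: " + uploadId);
        }
        byte[] content = buffer.toByteArray();
        String digest = digestOf(content);
        committed.put(digest, content);
        return digest;
    }

    /** Returns the committed content for this digest, or null if there is none. */
    byte[] get(String digest) {
        return committed.get(digest);
    }

    static String digestOf(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, uploads that are never completed stay in `pending`, which matches the remark above that temporary streams would have to be cleaned up by GC.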

Issue Links

is required by

NXP-16950 Improve and Extend Upload API

Open

People

Assignee:

Unassigned

Reporter:

Thierry Delprat

Participants:

Thierry Delprat

Votes:

1 Vote for this issue

Watchers:

2

Dates

Created:

2015-04-16 20:56

Updated:

2024-09-17 19:57