- Type: New Feature
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: None
- Fix Version/s: 7.4
- Component/s: File Upload, Nuxeo Drive, Rest API
- Impact type: API change
- Upgrade notes:
- Sprint: drive-7.4
- Story Points: 8
The new batch upload API described in NXP-16953 makes resumable uploads possible.
Such an upload relies on chunking, which allows:
- A simple resume process that does not require access to a specific byte offset (e.g. S3).
- Multiplexing / parallelizing the upload.
This seems to be a standard approach, as very well described in the Google Drive API documentation about Resumable Upload.
Also see http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html.
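Because each chunk travels in its own request, chunks can be sent independently of one another. A minimal Python sketch of the parallel case follows. `send_chunk` is a hypothetical callable standing in for the actual HTTP request, and the function name and the `max_workers` default are assumptions made for illustration, not part of the API.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_chunks_parallel(chunks, send_chunk, max_workers=4):
    """Send every (index, data) pair through send_chunk using a thread pool.

    send_chunk is a hypothetical callable (index, data) -> result that
    would perform the HTTP request for one chunk.
    Returns a dict mapping each chunk index to the result of send_chunk.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {index: pool.submit(send_chunk, index, data)
                   for index, data in chunks}
        # result() re-raises any exception raised inside send_chunk
        return {index: future.result() for index, future in futures.items()}
```

A failed chunk raises from `result()`, so the caller can decide whether to resume rather than silently losing a chunk.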
Here is an example of a resumable upload.
Step 1: Initialize a resumable upload
Request:
POST /api/v1/upload/
Response:
200 OK
{"batchId": batchId}
Step 2: Save the batch id returned by the previous request
It will be used in subsequent requests as a resumable upload session id.
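Steps 1 and 2 could be sketched in Python as below. The request is only built, not sent, so the sketch runs without a server. `base_url` and both function names are assumptions. The endpoint path and the `batchId` key are taken from the example above.

```python
import json
from urllib.request import Request


def build_init_request(base_url):
    """Build (without sending) the POST request that opens an upload batch."""
    return Request(base_url.rstrip("/") + "/api/v1/upload/", method="POST")


def parse_batch_id(response_body):
    """Extract the batch id from the JSON body of the Step 1 response."""
    return json.loads(response_body)["batchId"]
```

The value returned by `parse_batch_id` is what the client keeps and reuses as `{batchId}` in the following steps.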
Step 3: Upload a file by chunks
Request:
POST /api/v1/upload/{batchId}/0
X-Upload-Type: "chunked"
Content-Length: chunkSize
X-Upload-Chunk-Index: x
X-Upload-Chunk-Count: 5
X-File-Name: fileName
X-File-Size: fileSize
X-File-Type: fileMimeType
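As an illustration, the headers of one chunk request could be assembled as follows. The function name and its parameters are hypothetical. The sketch also assumes the quotes around "chunked" above are notation only and sends the bare value.

```python
def chunk_headers(file_name, file_size, mime_type,
                  chunk_index, chunk_count, chunk_length):
    """Return the Step 3 request headers for one chunk, all values as strings."""
    return {
        "X-Upload-Type": "chunked",  # assumption: sent without quotes
        "Content-Length": str(chunk_length),
        "X-Upload-Chunk-Index": str(chunk_index),
        "X-Upload-Chunk-Count": str(chunk_count),
        "X-File-Name": file_name,
        "X-File-Size": str(file_size),
        "X-File-Type": mime_type,
    }
```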
About the chunk size, the Google Drive API documentation gives the following advice:
"Chunk size restriction: All chunks must be a multiple of 256 KB (256 x 1024 bytes) in size, except for the final chunk that completes the upload. If you use chunking, it is important to keep the chunk size as large as possible to keep the upload efficient."
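The quoted restriction is Google Drive's. Whether this API enforces it is not stated here. A client that chooses to follow it anyway could compute chunk boundaries along these lines (all names are hypothetical):

```python
CHUNK_UNIT = 256 * 1024  # 256 KB, the granularity quoted above


def round_chunk_size(requested_size):
    """Round requested_size down to a multiple of 256 KB (at least one unit)."""
    return max(CHUNK_UNIT, (requested_size // CHUNK_UNIT) * CHUNK_UNIT)


def chunk_count(file_size, chunk_size):
    """Number of chunks needed; the final chunk may be shorter than chunk_size."""
    return -(-file_size // chunk_size)  # integer ceiling division


def chunk_range(index, file_size, chunk_size):
    """Return the (start, end) byte offsets of chunk `index`, end exclusive."""
    start = index * chunk_size
    return start, min(start + chunk_size, file_size)
```

For a 1,000,000-byte file and a requested size of 1,000,000 bytes, this gives a chunk size of 786,432 bytes and two chunks, the second one shorter.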
Response: there are 3 cases here.
1. The chunk has been uploaded but the file is incomplete.
308 Resume Incomplete {"batchId": batchId, "fileIdx": "0", "uploadType": "chunked", "uploadedSize": chunkSize, "uploadedChunkId": x, "chunkCount": 5}
=> Go to Step 3 with X-Upload-Chunk-Index = index of the next chunk to upload, in most cases x+1.
At this point a request to know the chunk completion and determine the next chunk to upload can be made, see Step 4.
2. The chunk has been uploaded and the file is now complete.
201 Created {"batchId": batchId, "fileIdx": "0", "uploadType": "chunked", "uploadedSize": chunkSize, "uploadedChunkId": x, "chunkCount": 5}
=> End of upload.
3. The request is interrupted or you receive HTTP 503 Service Unavailable or any other 5xx response from the server: go to Step 4.
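The three cases can be summarized as a small decision function. This is a sketch only: the action names are hypothetical, and the final `"fail"` branch covers statuses the description above does not mention.

```python
def next_action_after_chunk(status):
    """Map the outcome of a Step 3 request to the next client action.

    status is the HTTP status code, or None if the request was
    interrupted before a response arrived.
    """
    if status == 201:
        return "done"               # case 2: file complete
    if status == 308:
        return "upload_next_chunk"  # case 1: back to Step 3
    if status is None or 500 <= status <= 599:
        return "query_status"       # case 3: go to Step 4
    return "fail"                   # not covered by the description above
```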
Step 4: Resume an interrupted upload
Request:
GET /api/v1/upload/{batchId}/0
Response: again there are 3 cases here.
1. The file is incomplete.
308 Resume Incomplete {"name": fileName, "size": fileSize, "uploadType": "chunked", "uploadedChunkIds": [0, 1, 2, 4], "chunkCount": 5}
=> Go to Step 3 with X-Upload-Chunk-Index = index of the next chunk to upload, in this case 3.
2. The file is now complete, meaning all chunks have been uploaded.
This could happen if the connection broke after all bytes were uploaded but before the client received a response from the server.
200 OK {"name": fileName, "size": fileSize, "uploadType": "chunked", "uploadedChunkIds": [0, 1, 2, 3, 4], "chunkCount": 5}
=> End of upload.
3. The request is interrupted or you receive HTTP 503 Service Unavailable or any other 5xx response from the server: repeat Step 4.
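Determining which chunks remain is a set difference over the `uploadedChunkIds` and `chunkCount` fields of the Step 4 response. A minimal sketch, assuming the response body has already been parsed into a dict (the function name is hypothetical):

```python
def missing_chunks(status):
    """Return the sorted chunk indexes not yet uploaded, per the Step 4 response."""
    uploaded = set(status["uploadedChunkIds"])
    return [i for i in range(status["chunkCount"]) if i not in uploaded]
```

With the incomplete example above (`[0, 1, 2, 4]` out of 5) this returns `[3]`. An empty list means the upload is complete.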
Best practices
You should follow the Best Practices advised in the Google Drive API documentation about File Upload, especially the Exponential backoff strategy.
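A minimal sketch of an exponential backoff delay, in the spirit of that guidance. The `base` and `cap` defaults and the one-second jitter range are assumptions, not values prescribed by this API.

```python
import random


def backoff_delay(attempt, base=1.0, cap=64.0, jitter=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    The delay doubles on each attempt up to `cap`, plus up to one second
    of random jitter so that clients do not all retry at the same moment.
    """
    return min(cap, base * (2 ** attempt)) + jitter()
```

A client would sleep for `backoff_delay(n)` before the n-th retry of Step 4 and give up after a chosen number of attempts.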
depends on:
- NXP-16953 Polish upload API (Resolved)
- NXP-17885 Use TransientStore for batch upload (Resolved)
is required by:
- NXP-18257 Allow to test GET and POST requests in BatchUploadFixture (Resolved)
- NXDRIVE-433 Use new upload API (Resolved)
- NXP-16950 Improve and Extend Upload API (Open)
- NXP-18247 A few improvements for the batch upload in chunks (Open)
- NXPY-40 Use chunks to allow resumable upload (Resolved)