Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 4.0.4
Fix Version/s: 4.1.4
Component/s: Synchronizer
Epic Link:
Sprint: nxDrive 11.1.13
Story Points: 2
Status & Problem
The current upload process is:
- upload all chunks;
- then call the NuxeoDrive.CreateFile operation to link the uploaded blob to a given document.
If the 2nd part fails, the entire upload is restarted from scratch. This is problematic with big files, and even worse on a bad network.
We have seen a case where the 2nd part reported a failure even though the operation had actually succeeded, so Drive restarted the upload. For the record, this was the error:
Traceback (most recent call last):
  File "nxdrive/engine/processor.py", line 280, in _execute
  File "nxdrive/engine/processor.py", line 749, in _synchronize_locally_created
  File "nxdrive/client/remote_client.py", line 553, in stream_file
  File "nxdrive/client/remote_client.py", line 379, in upload
  File "nxdrive/client/remote_client.py", line 149, in execute
  File "nxdrive/client/remote_client.py", line 145, in execute
  File "site-packages/nuxeo/operations.py", line 201, in execute
  File "site-packages/nuxeo/client.py", line 209, in request
nuxeo.exceptions.HTTPError: HTTPError(502), error: b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>502 Proxy Error</title>\n</head><body>\n<h1>Proxy Error</h1>\n<p>The proxy server received an invalid\r\nresponse from an upstream server.<br />\r\nThe proxy server could not handle the request <em><a href="/nuxeo/api/v1/upload/batchId-BATCHID/0/execute/NuxeoDrive.CreateFile">POST /nuxeo/api/v1/upload/batchId-BATCHID/0/execute/NuxeoDrive.CreateFile</a></em>.<p>\nReason: <strong>Error reading from remote server</strong></p></p></body></html>\n', server trace: None
2019-07-05 13:44:37 700 123145313792000 DEBUG nxdrive.engine.processor Postpone action on document(Server unavailable): <DocPair...>
Improvement 1
The idea is to handle the two parts of the upload completely separately:
- If the 1st part succeeds, continue; otherwise retry, as is the current behavior.
- If an error happens during the 2nd part:
  - If the file exists on the server, the error occurred only after the operation was done, so the upload can be considered successful.
  - If the file does not exist on the server, the error is real: put the document in error, as is the current behavior. If the error was temporary (network, ...), the upload will be retried; if the chunk TTL is large enough, no chunk has to be uploaded again. Otherwise a new upload will be done.
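The decision flow described for Improvement 1 can be sketched as follows. This is only an illustration, not the actual Drive code: `upload_file`, `UploadError`, and the three callables (`upload_chunks`, `link_blob`, `exists_on_server`) are hypothetical names standing in for the chunk upload, the NuxeoDrive.CreateFile call, and a check that the document is present remotely.

```python
from typing import Callable


class UploadError(Exception):
    """Hypothetical error raised when one step of the upload fails."""


def upload_file(
    upload_chunks: Callable[[], str],
    link_blob: Callable[[str], None],
    exists_on_server: Callable[[], bool],
) -> str:
    """Run the two upload parts separately and report how the upload ended."""
    # Part 1: upload all chunks. Any error propagates to the caller,
    # which retries as it does today.
    batch_id = upload_chunks()

    # Part 2: link the uploaded blob to the document.
    try:
        link_blob(batch_id)
    except UploadError:
        if exists_on_server():
            # The operation was done; only the response was lost.
            return "linked-after-error"
        # Real failure: let the caller put the document in error.
        raise
    return "linked"
```

With this split, a failure reported by part 2 (such as the 502 above) no longer triggers a new chunk upload when the document was in fact created.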
Improvement 2
Currently, part 2 uses the upload duration of part 1 to compute the Nuxeo-Transaction-Timeout for the operation (duration * 2 seconds). This does not work well with resumable uploads: one may pause a big upload (20 GiB) near the end, and after resuming, the measured upload duration will be far from the real (total) value.
Another way of computing this timeout is needed.
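To make the problem concrete, here is a minimal sketch that contrasts a timeout derived from the last session only with one derived from the duration accumulated across pause/resume sessions. The ticket does not prescribe this approach; `UploadTimer` and its methods are hypothetical, and the durations are made-up numbers.

```python
from dataclasses import dataclass


@dataclass
class UploadTimer:
    """Hypothetical helper tracking upload time across pause/resume sessions."""

    total_s: float = 0.0  # time spent uploading over all sessions
    last_s: float = 0.0   # time spent in the most recent session only

    def add_session(self, duration_s: float) -> None:
        self.total_s += duration_s
        self.last_s = duration_s

    def timeout_from_last_session(self) -> int:
        # Rule described above (duration * 2), applied to the resumed session only.
        return int(self.last_s * 2)

    def timeout_from_total(self) -> int:
        # Same rule, applied to the accumulated duration (an assumption).
        return int(self.total_s * 2)


timer = UploadTimer()
timer.add_session(7000.0)  # first session, paused near the end
timer.add_session(30.0)    # short session after resuming
print(timer.timeout_from_last_session(), timer.timeout_from_total())
```

In this example the last-session rule yields 60 seconds while the accumulated rule yields 14060 seconds, which shows how far apart the two values can be for a large resumed upload.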
is related to:
- NXDRIVE-1743 Fail to upload large files through Drive (Resolved)
- NXPY-112 Update uploadedSize on each and every upload iteration (Resolved)
- NXDRIVE-1714 Handle files transfers above 100 GB (Resolved)