This is a continuation of NXP-29404, but we face a different barrier.
When you upload a file to S3, heuristics in the client determine whether the upload is done in one go or the file is split into chunks (a multipart upload). The ETag that S3 assigns to a multipart upload is not the MD5 of the object's content, so any such upload lacks the MD5 hash that Nuxeo normally uses as the key to access blobs and avoid duplication.
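To illustrate why a chunked upload leaves us without a usable MD5, here is a minimal Python sketch of the commonly observed way S3 derives the ETag of a multipart object: an MD5 over the concatenated binary MD5 digests of the parts, followed by a dash and the part count. This scheme is an assumption based on observed behaviour rather than a documented contract, and the function names are hypothetical, not part of Nuxeo or any AWS SDK.

```python
import hashlib


def whole_md5(data: bytes) -> str:
    """MD5 of the full content, i.e. the digest Nuxeo wants as the blob key."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, part_size: int) -> str:
    """Sketch of the ETag S3 is observed to produce for a multipart upload.

    Assumption: the ETag is md5(md5(part1) + md5(part2) + ...) in hex,
    followed by "-<number of parts>".
    """
    if part_size <= 0 or not data:
        raise ValueError("need non-empty data and a positive part size")
    # Binary MD5 digest of each part, in upload order.
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

For the same bytes, `multipart_etag(data, part_size)` and `whole_md5(data)` yield different values, and the first depends on the part size chosen by the client, so it cannot serve as a content key.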
Since the multipart threshold is lower for an upload than for a copy, we currently do an extra copy once we know a file has been uploaded in chunks, to force S3 to compute the MD5. However, this does not work for files larger than 5 GB: in that case even an S3-to-S3 copy uses chunks, so we still won't have a ready-to-use MD5. For files larger than 5 GB, there is no way to make S3 compute the digest of the file as a whole.
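The three cases described above can be summarised in a short Python sketch. It is only an illustration of the decision, not the actual Nuxeo code: the function names and returned labels are hypothetical, the 5 GB limit is taken here as 5 GiB (an assumption), and the ETag is assumed to be a plain MD5 whenever it contains no dash, which does not hold for some server-side encryption modes.

```python
# Assumed size limit for a single-operation S3 copy (5 GB, read here as 5 GiB).
SINGLE_COPY_LIMIT = 5 * 1024 ** 3


def is_multipart_etag(etag: str) -> bool:
    """A multipart ETag looks like "<hex>-<part count>"; S3 returns it quoted."""
    return "-" in etag.strip('"')


def md5_strategy(etag: str, size: int) -> str:
    """Return a hypothetical label for how an MD5 key could be obtained."""
    if not is_multipart_etag(etag):
        # Single-part upload: the ETag is assumed to be the content MD5.
        return "use-etag"
    if size <= SINGLE_COPY_LIMIT:
        # Chunked upload, but a single-operation copy yields an MD5 ETag.
        return "copy-to-recompute"
    # The copy itself would be chunked: S3 cannot give us a whole-file MD5.
    return "no-s3-md5"
```

The last branch is the barrier this ticket is about: for those objects the digest has to come from somewhere other than S3.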