  1. Nuxeo Platform
  2. NXP-24384

multipart/form-data file upload with zero copy

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 9.10
    • Fix Version/s: 10.2
    • Component/s: File Upload
    • Release Notes Description:
      Optimized multipart/form upload

      When uploading content to Nuxeo using multipart/form-data, no unnecessary copy is made along the way, which drastically improves upload performance for large videos sent with this upload method.

    • Sprint:
      nxcore 10.1.4
    • Story Points:
      5
      Description

      When uploading a file with the REST API, after getting a batch ID, the curl command from the documentation uses curl -F file=@myfile.doc. This option performs a multipart/form-data upload that creates multiple copies of the uploaded file; this should be avoided.

      Note that the non-multipart upload is already optimized to create a single file in the transient store (this is what the Web UI upload uses).
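
For comparison, here is a minimal sketch of what a non-multipart upload request could look like from a Java client. The endpoint layout (/api/v1/upload/{batchId}/{fileIdx}), the X-File-Name and X-File-Type headers, the server URL, and the batch ID are assumptions taken from the general shape of the Nuxeo batch upload API, not from this issue; the class and method names are hypothetical. The request is only built and printed, never sent.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.file.Files;
import java.nio.file.Path;

class RawBatchUpload {

    /**
     * Builds a request whose body is the raw file content instead of a
     * multipart/form-data envelope. Endpoint path and X-File-* header
     * names are assumptions, not confirmed by this issue.
     */
    static HttpRequest rawUploadRequest(String baseUrl, String batchId, int fileIdx,
                                        Path file, String mimeType) throws IOException {
        URI uri = URI.create(baseUrl + "/api/v1/upload/" + batchId + "/" + fileIdx);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/octet-stream")
                .header("X-File-Name", file.getFileName().toString())
                .header("X-File-Type", mimeType)
                // The file content is read from disk only when the request is sent.
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("myfile", ".doc");
        Files.writeString(file, "dummy content");
        HttpRequest request = rawUploadRequest(
                "http://localhost:8080/nuxeo", "batchId-placeholder", 0, file, "application/msword");
        System.out.println(request.method() + " " + request.uri());
        request.headers().map().forEach((name, values) -> System.out.println(name + ": " + values));
        Files.delete(file);
    }
}
```

With a raw body the server can stream the request straight into one file, whereas a multipart body first has to be parsed into parts, which is where the first temporary file in the steps below comes from.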

      1. The uploaded file is saved in the Nuxeo tmp directory (tmp/upload_c606abe1_b412_4570_a1d3_91caa5197c71_00000000.tmp)

      "http-nio-0.0.0.0-8080-exec-1" #139 daemon prio=5 os_prio=0 tid=0x00007fed146b6000 nid=0x3ae9 waiting on condition [0x00007fec81252000]
         java.lang.Thread.State: TIMED_WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x0000000606714200> (a java.util.concurrent.CountDownLatch$Sync)
              at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
              at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
              at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.awaitLatch(NioEndpoint.java:1114)
              at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.awaitReadLatch(NioEndpoint.java:1116)
              at org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:184)
              at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:235)
              at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:216)
              at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.fillReadBuffer(NioEndpoint.java:1241)
              at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.read(NioEndpoint.java:1190)
              at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:717)
              at org.apache.coyote.http11.Http11InputBuffer.access$300(Http11InputBuffer.java:40)
              at org.apache.coyote.http11.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1072)
              at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:140)
              at org.apache.coyote.http11.Http11InputBuffer.doRead(Http11InputBuffer.java:261)
              at org.apache.coyote.Request.doRead(Request.java:581)
              at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:326)
              at org.apache.catalina.connector.InputBuffer.checkByteBufferEof(InputBuffer.java:642)
              at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:349)
              at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:183)
              at org.apache.commons.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:999)
              at org.apache.commons.fileupload.MultipartStream$ItemInputStream.read(MultipartStream.java:903)
              at java.io.InputStream.read(InputStream.java:101)
              at org.apache.commons.fileupload.util.Streams.copy(Streams.java:100)
              at org.apache.commons.fileupload.util.Streams.copy(Streams.java:70)
              at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:347)
              at org.nuxeo.ecm.webengine.forms.FormData.getMultiPartItems(FormData.java:160)
              at org.nuxeo.ecm.webengine.forms.FormData.getFirstBlob(FormData.java:211)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.uploadNoTransaction(BatchUploadObject.java:175)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.upload(BatchUploadObject.java:121)
      

      2. The file is then copied as tmp/nxblob-6726070556633230192.tmp

      "http-nio-0.0.0.0-8080-exec-1" #139 daemon prio=5 os_prio=0 tid=0x00007fed146b6000 nid=0x3ae9 runnable [0x00007fec81252000]
         java.lang.Thread.State: RUNNABLE
              at java.io.FileInputStream.readBytes(Native Method)
              at java.io.FileInputStream.read(FileInputStream.java:233)
              at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2146)
              at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
              at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
              at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
              at org.nuxeo.ecm.core.api.impl.blob.FileBlob.<init>(FileBlob.java:136)
              at org.nuxeo.ecm.core.api.Blobs.createBlob(Blobs.java:111)
              at org.nuxeo.ecm.webengine.forms.FormData.getBlob(FormData.java:231)
              at org.nuxeo.ecm.webengine.forms.FormData.getFirstBlob(FormData.java:215)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.uploadNoTransaction(BatchUploadObject.java:175)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.upload(BatchUploadObject.java:121)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      

      3. A second copy is then created as tmp/nxblob-8914819807325678022.tmp

      "http-nio-0.0.0.0-8080-exec-1" #139 daemon prio=5 os_prio=0 tid=0x00007fed146b6000 nid=0x3ae9 runnable [0x00007fec81252000]
         java.lang.Thread.State: RUNNABLE
              at java.io.FileInputStream.readBytes(Native Method)
              at java.io.FileInputStream.read(FileInputStream.java:233)
              at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2146)
              at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
              at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
              at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
              at org.nuxeo.ecm.core.api.impl.blob.FileBlob.<init>(FileBlob.java:136)
              at org.nuxeo.ecm.core.api.Blobs.createBlob(Blobs.java:111)
              at org.nuxeo.ecm.automation.server.jaxrs.batch.Batch.addFile(Batch.java:168)
              at org.nuxeo.ecm.automation.server.jaxrs.batch.BatchManagerComponent.addStream(BatchManagerComponent.java:127)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.addStream(BatchUploadObject.java:244)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.uploadNoTransaction(BatchUploadObject.java:188)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.upload(BatchUploadObject.java:121)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      

      4. The file is then moved to the transient store as data/transientstores/default/YmF0Y2hJZC00NmQyZDNjNS00YThmLTQwMzYtOWQyOC1mNTQyMGM2YjEzYzZfMA==/6250030040361346133.tmp (the trace shows FileBlob.moveTo going through Files.copy)

      "http-nio-0.0.0.0-8080-exec-1" #139 daemon prio=5 os_prio=0 tid=0x00007fed146b6000 nid=0x3ae9 runnable [0x00007fec81252000]
         java.lang.Thread.State: RUNNABLE
              at sun.nio.fs.UnixCopyFile.transfer(Native Method)
              at sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:251)
              at sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:581)
              at sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
              at java.nio.file.Files.copy(Files.java:1274)
              at org.nuxeo.ecm.core.api.impl.blob.FileBlob.moveTo(FileBlob.java:201)
              at org.nuxeo.ecm.core.transientstore.AbstractTransientStore.storeBlobs(AbstractTransientStore.java:158)
              at org.nuxeo.ecm.core.transientstore.AbstractTransientStore.putBlobs(AbstractTransientStore.java:137)
              at org.nuxeo.ecm.automation.server.jaxrs.batch.Batch.addFile(Batch.java:174)
              at org.nuxeo.ecm.automation.server.jaxrs.batch.BatchManagerComponent.addStream(BatchManagerComponent.java:127)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.addStream(BatchUploadObject.java:244)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.uploadNoTransaction(BatchUploadObject.java:188)
              at org.nuxeo.ecm.restapi.server.jaxrs.BatchUploadObject.upload(BatchUploadObject.java:121)
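
Steps 2 and 3 both go through a blob constructor that reads a stream into a new temp file. The difference between that and simply wrapping the file that already exists on disk can be sketched as follows. SketchFileBlob is a hypothetical stand-in written for this illustration; it is not the Nuxeo FileBlob class and does not claim to match its API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical file-backed blob used only to contrast copying with wrapping. */
final class SketchFileBlob {
    final Path file;

    private SketchFileBlob(Path file) {
        this.file = file;
    }

    /** Copying path: every call writes the whole stream into a new temp file. */
    static SketchFileBlob copyOf(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("sketchblob-", ".tmp");
        // The temp file already exists (empty), so it has to be replaced.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return new SketchFileBlob(tmp);
    }

    /** Zero-copy path: reuse the file that the upload already produced. */
    static SketchFileBlob wrap(Path existing) {
        return new SketchFileBlob(existing);
    }

    public static void main(String[] args) throws IOException {
        Path uploaded = Files.createTempFile("upload_", ".tmp");
        Files.writeString(uploaded, "payload");

        SketchFileBlob copied;
        try (InputStream in = Files.newInputStream(uploaded)) {
            copied = copyOf(in);
        }
        SketchFileBlob wrapped = wrap(uploaded);

        System.out.println("copy uses a new file: " + !copied.file.equals(uploaded));
        System.out.println("wrap reuses the file: " + wrapped.file.equals(uploaded));

        Files.delete(copied.file);
        Files.delete(uploaded);
    }
}
```

Each copyOf call costs a full read and write of the file, which is what makes the chain above expensive for large uploads.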
      

      When calling the create document API:
      1. The 2 tmp files are copied to tmp/nxbincache.4471158360387862083/ and the source files are deleted

      "http-nio-0.0.0.0-8080-exec-5" #143 daemon prio=5 os_prio=0 tid=0x00007fed146ab800 nid=0x3aed runnable [0x00007fec80e4e000]
         java.lang.Thread.State: RUNNABLE
              at java.io.FileInputStream.readBytes(Native Method)
              at java.io.FileInputStream.read(FileInputStream.java:233)
              at org.nuxeo.ecm.core.blob.binary.AbstractBinaryManager.storeAndDigest(AbstractBinaryManager.java:154)
              at org.nuxeo.ecm.core.blob.binary.CachingBinaryManager.getBinary(CachingBinaryManager.java:138)
              at org.nuxeo.ecm.core.blob.binary.AbstractBinaryManager.getBinary(AbstractBinaryManager.java:97)
              at org.nuxeo.ecm.core.blob.binary.BinaryBlobProvider.writeBlob(BinaryBlobProvider.java:109)
              at org.nuxeo.ecm.blob.AbstractCloudBinaryManager.writeBlob(AbstractCloudBinaryManager.java:139)
              at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.writeBlob(DocumentBlobManagerComponent.java:179)
              at org.nuxeo.ecm.core.storage.BaseDocument.setValueBlob(BaseDocument.java:644)
              at org.nuxeo.ecm.core.storage.BaseDocument$BlobWriteContext.flush(BaseDocument.java:756)
              at org.nuxeo.ecm.core.api.DocumentModelFactory.writeDocumentModel(DocumentModelFactory.java:234)
              at org.nuxeo.ecm.core.api.AbstractSession.writeModel(AbstractSession.java:371)
              at org.nuxeo.ecm.core.api.AbstractSession.createDocument(AbstractSession.java:718)
              at org.nuxeo.ecm.restapi.server.jaxrs.JSONDocumentObject.doPost(JSONDocumentObject.java:105)
      

      2. The files are then transferred to S3

      "http-nio-0.0.0.0-8080-exec-5" #143 daemon prio=5 os_prio=0 tid=0x00007fed146ab800 nid=0x3aed waiting on condition [0x00007fec80e4e000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000007b4396f68> (a java.util.concurrent.FutureTask)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
              at java.util.concurrent.FutureTask.get(FutureTask.java:191)
              at com.amazonaws.services.s3.transfer.internal.UploadImpl.waitForUploadResult(UploadImpl.java:62)
              at org.nuxeo.ecm.core.storage.sql.S3BinaryManager$S3FileStorage.storeFile(S3BinaryManager.java:458)
              at org.nuxeo.ecm.core.blob.binary.CachingBinaryManager.getBinary(CachingBinaryManager.java:154)
              at org.nuxeo.ecm.core.blob.binary.AbstractBinaryManager.getBinary(AbstractBinaryManager.java:97)
              at org.nuxeo.ecm.core.blob.binary.BinaryBlobProvider.writeBlob(BinaryBlobProvider.java:109)
              at org.nuxeo.ecm.blob.AbstractCloudBinaryManager.writeBlob(AbstractCloudBinaryManager.java:139)
              at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.writeBlob(DocumentBlobManagerComponent.java:179)
              at org.nuxeo.ecm.core.storage.BaseDocument.setValueBlob(BaseDocument.java:644)
              at org.nuxeo.ecm.core.storage.BaseDocument$BlobWriteContext.flush(BaseDocument.java:756)
              at org.nuxeo.ecm.core.api.DocumentModelFactory.writeDocumentModel(DocumentModelFactory.java:234)
              at org.nuxeo.ecm.core.api.AbstractSession.writeModel(AbstractSession.java:371)
              at org.nuxeo.ecm.core.api.AbstractSession.createDocument(AbstractSession.java:718)
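
The copy into the binary cache happens inside a method named storeAndDigest, which suggests the digest is computed while the bytes are written. A minimal sketch of that single-pass pattern is shown below. It is not the Nuxeo implementation; the class name is hypothetical and the choice of MD5 is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StoreAndDigestSketch {

    /** Writes the stream to target and returns the hex digest, reading the data once. */
    static String storeAndDigest(InputStream in, Path target)
            throws IOException, NoSuchAlgorithmException {
        // MD5 is an assumption for this sketch.
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream din = new DigestInputStream(in, md);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int n;
            // Each read updates the digest as a side effect.
            while ((n = din.read(buffer, 0, buffer.length)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("bincache-", ".tmp");
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(storeAndDigest(in, target));
        Files.delete(target);
    }
}
```

Because the digest is needed to key the binary, this last read of the file is hard to avoid; the savings targeted by this issue are in the copies that happen before it.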
      

      Ideally, only one temp file should be created at upload, with zero copies.
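
One way to approach that goal is to rename the uploaded temp file into its destination instead of copying its content. The sketch below illustrates the idea with java.nio.file only; it is not the fix that was applied for this issue, and the class name, method name, and directory names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ZeroCopyStoreSketch {

    /** Moves the uploaded file into storeDir and returns its new location. */
    static Path moveInto(Path uploaded, Path storeDir) throws IOException {
        Files.createDirectories(storeDir);
        Path target = storeDir.resolve(uploaded.getFileName());
        try {
            // On the same file system this is a rename: no file content is copied.
            return Files.move(uploaded, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Across file systems the move degrades to copy plus delete.
            return Files.move(uploaded, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path storeRoot = Files.createTempDirectory("transientstore-");
        Path uploaded = Files.createTempFile("upload_", ".tmp");
        Files.writeString(uploaded, "payload");

        Path stored = moveInto(uploaded, storeRoot.resolve("batch-0"));
        System.out.println("source still exists: " + Files.exists(uploaded));
        System.out.println("stored at: " + stored);

        Files.delete(stored);
        Files.delete(stored.getParent());
        Files.delete(storeRoot);
    }
}
```

The rename only stays free when the tmp directory and the destination live on the same file system, so the placement of those directories matters as much as the code.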

    People

    • Votes: 0
    • Watchers: 3

    Time Tracking

    • Original Estimate: 0m
    • Remaining Estimate: 0m
    • Time Spent: 2h