Nuxeo Platform / NXP-17885

Use TransientStore for batch upload


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.4
    • Impact type: API change
    • Upgrade notes:

      Added:

      • BatchManager#getTransientStore()
      • BatchManager#initBatch()
      • BatchManager#addStream(String batchId, String idx, InputStream is, int chunkCount, int chunkIdx, String name, String mime, long fileSize)
      • Batch#addChunk(String idx, InputStream is, int chunkCount, int chunkIdx, String name, String mime, long fileSize)
      • BatchFileEntry
      • BatchChunkEntry

      Changed:

      • Batch extends AbstractStorageEntry
      • Batch#clear() renamed to Batch#clean()

      Description

      Main principle

      The BatchManager now relies on the TransientStore, which allows for several implementations, among them the Redis-based one, which is cluster-aware.
      Upload data (structure and streams) must be shared across Nuxeo nodes if we want the upload system to work across the cluster without having to enforce node affinity. This is required by NXP-17780.

      That is why the Batch object now fits in a StorageEntry, the main object manipulated by the TransientStore.
      The storage implementation also takes chunking into account in the data structure maintained by the transient store, as required by NXP-16951.

      Example: storage of a batch with 2 files, one of which is made up of 2 chunks

      1. The Batch is stored in the "default" transient store with its id as a key.
        It has no blobs but references in its parameters the keys of each file in the batch, stored as BatchFileEntry objects in the same transient store.
        TransientStore("default") -> {"batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08": batch (StorageEntry)}
        
        batch ->
            - blobs = []
            - params = {"0": "batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_0", "1": "batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1"}
        
      2. Each BatchFileEntry is in turn stored in the "default" transient store, with the file index concatenated to the batch id as its key.
        A file that is not chunked directly references its blob in the blob list of the StorageEntry; a chunked file instead references in its parameters the keys of its chunks, stored as BatchChunkEntry objects in the same transient store.
        TransientStore("default") -> {"batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_0": batchFileEntry0 (StorageEntry)}
        TransientStore("default") -> {"batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1": batchFileEntry1 (StorageEntry)}
        
        batchFileEntry0 ->
            - blobs = [blob]
            - params = {"chunked": false}
        
        batchFileEntry1 ->
            - blobs = []
            - params = {"chunked": true, "fileName": "My file.txt", "mimeType": "text/plain", "fileSize": 1024, "chunkCount": 2, "chunks": {0: "batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1_0", 1: "batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1_1"}}
        
      3. Each BatchChunkEntry is in turn stored in the "default" transient store, with the chunk index concatenated to the file key as its key.
        A chunk directly references its blob in the blob list of the StorageEntry and has no parameters.
        TransientStore("default") -> {"batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1_0": batchChunkEntry0 (StorageEntry)}
        TransientStore("default") -> {"batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08_1_1": batchChunkEntry1 (StorageEntry)}
        
        batchChunkEntry0 ->
            - blobs = [chunk0]
            - params = {}
        
        batchChunkEntry1 ->
            - blobs = [chunk1]
            - params = {}
        
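      The key layout above boils down to plain string concatenation. The following self-contained sketch only mirrors the scheme described in this issue; the class and helper names are hypothetical, not actual Nuxeo code:

```java
public class BatchKeyScheme {

    // BatchFileEntry key: file index appended to the batch id
    public static String fileKey(String batchId, int fileIdx) {
        return batchId + "_" + fileIdx;
    }

    // BatchChunkEntry key: chunk index appended to the file key
    public static String chunkKey(String batchId, int fileIdx, int chunkIdx) {
        return fileKey(batchId, fileIdx) + "_" + chunkIdx;
    }

    public static void main(String[] args) {
        String batchId = "batchId-a0dbccda-a36c-436d-8de6-09fe96f14e08";
        // Second file of the batch, then the first of its two chunks
        System.out.println(fileKey(batchId, 1));
        System.out.println(chunkKey(batchId, 1, 0));
    }
}
```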

      Adding a file or a chunk to a batch with the BatchManager

      1. First you need to initialize a batch by calling

      BatchManager#initBatch()
      

      which returns the batch id.
      2. Then add a whole file to the batch:

      BatchManager#addStream(String batchId, String idx, InputStream is, String name, String mime)
      

      or add a chunk to the given batch file:

      BatchManager#addStream(String batchId, String idx, InputStream is, int chunkCount, int chunkIdx, String name, String mime, long fileSize)
      

      3. To get the blob of a given file in the batch just call

      BatchManager#getBlob(String batchId, String fileId)
      

      This returns the file blob, concatenating the file chunks first if the file was uploaded in chunks.
      4. The batch can be cleaned by calling

      BatchManager#clean(String batchId)
      
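      Putting the four steps together, the call sequence can be sketched as follows. This is a minimal in-memory stand-in so the example runs standalone, not Nuxeo code: the method signatures mirror those listed above, but the backing storage is a plain map rather than a TransientStore, and getBlob returns raw bytes instead of a Blob.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

public class BatchUploadSketch {

    // batchId -> fileIdx -> chunkIdx -> bytes (whole files use chunkIdx 0)
    private final Map<String, Map<String, TreeMap<Integer, byte[]>>> store = new HashMap<>();

    // Step 1: initialize a batch; the returned id keys all further calls
    public String initBatch() {
        String batchId = "batchId-" + UUID.randomUUID();
        store.put(batchId, new HashMap<>());
        return batchId;
    }

    // Step 2: whole-file variant
    public void addStream(String batchId, String idx, InputStream is,
                          String name, String mime) throws IOException {
        addStream(batchId, idx, is, 1, 0, name, mime, -1);
    }

    // Step 2: chunked variant
    public void addStream(String batchId, String idx, InputStream is,
                          int chunkCount, int chunkIdx, String name,
                          String mime, long fileSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = is.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
        store.get(batchId)
             .computeIfAbsent(idx, k -> new TreeMap<>())
             .put(chunkIdx, out.toByteArray());
    }

    // Step 3: return the file content, concatenating chunks in index order
    public byte[] getBlob(String batchId, String fileId) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : store.get(batchId).get(fileId).values()) {
            out.write(chunk);
        }
        return out.toByteArray();
    }

    // Step 4: drop all entries of the batch
    public void clean(String batchId) {
        store.remove(batchId);
    }

    public static void main(String[] args) throws IOException {
        BatchUploadSketch bm = new BatchUploadSketch();
        String batchId = bm.initBatch();
        // File "0": uploaded whole
        bm.addStream(batchId, "0",
                new ByteArrayInputStream("hello".getBytes()), "hello.txt", "text/plain");
        // File "1": uploaded as 2 chunks
        byte[] p0 = "first half ".getBytes();
        byte[] p1 = "second half".getBytes();
        bm.addStream(batchId, "1", new ByteArrayInputStream(p0),
                2, 0, "big.txt", "text/plain", p0.length + p1.length);
        bm.addStream(batchId, "1", new ByteArrayInputStream(p1),
                2, 1, "big.txt", "text/plain", p0.length + p1.length);
        // Prints the reassembled content of file "1"
        System.out.println(new String(bm.getBlob(batchId, "1")));
        bm.clean(batchId);
    }
}
```

      In the real service the same sequence goes through the BatchManager obtained from the runtime, with the chunks persisted in the transient store under the keys described above.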
