  1. Nuxeo Platform
  2. NXP-30624

Make s3SetBlobLength (previously s3SetContentLength) bulk action more reliable


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.10-HF53, 2021.9
    • Component/s: BlobManager, Bulk
    • Release Notes Summary:
      A new bulk action s3SetBlobLength is provided.
    • Tags:
    • Upgrade notes:

      A new bulk action, s3SetBlobLength, is provided. It supersedes the one introduced in HF52 (s3SetContentLength).
      The bulk action is not activated by default; it must be enabled explicitly in nuxeo.conf with:

      binarymanager.bulk.s3SetBlobLength.enabled=true
      

      The number of partitions (maximum concurrency at cluster level) and the concurrency (per worker node) can be tuned with:

      binarymanager.bulk.s3SetBlobLength.partitions=4
      binarymanager.bulk.s3SetBlobLength.concurrency=2
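
The interaction between the two settings can be sketched as below. This is an illustration rather than Nuxeo code: it assumes that the partition count caps parallelism across the cluster (as stated above) and that each worker node contributes at most `concurrency` threads. The function name and the `worker_nodes` parameter are hypothetical.

```python
def effective_parallelism(partitions, concurrency, worker_nodes):
    """Upper bound on threads processing the bulk action at the same time.

    Assumption: partitions cap the cluster-wide parallelism, and each
    worker node runs at most `concurrency` threads for this action.
    """
    if min(partitions, concurrency, worker_nodes) < 1:
        raise ValueError("all values must be >= 1")
    return min(partitions, concurrency * worker_nodes)


# With the values shown above (partitions=4, concurrency=2):
for nodes in (1, 2, 3):
    print(nodes, effective_parallelism(partitions=4, concurrency=2, worker_nodes=nodes))
```

Under these assumptions, adding a third worker node brings no extra parallelism unless the number of partitions is raised as well.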
      

      The action processes all blobs of a document unless an XPath filter is provided.
      With the force flag, the length is always fetched from S3, and the document is updated if the length differs.

      Here is an example invocation with the force flag, restricted to the main blob (file:content), for all documents of type File:

      curl -v -X POST -H "Content-Type: application/json" "localhost:8080/nuxeo/api/v1/search/bulk/s3SetBlobLength?query=SELECT%20*%20FROM%20File" -u Administrator:Administrator -d '{"force": true, "xpath": "content"}'
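
The same request can be built programmatically. The following is a minimal sketch using only the Python standard library; the helper name is hypothetical, and only the endpoint, the `force` and `xpath` parameters, and the credentials from the curl example above are taken from this issue. It assumes that omitting a parameter from the JSON body leaves the action's default behavior in place.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request


def build_set_blob_length_request(base_url, nxql, user, password, force=False, xpath=None):
    """Build (but do not send) the POST request for the s3SetBlobLength bulk action."""
    # Percent-encode the NXQL query, as the curl example does by hand.
    url = f"{base_url}/api/v1/search/bulk/s3SetBlobLength?query={quote(nxql, safe='*')}"
    # Only the two parameters shown in the upgrade notes are supported here.
    params = {}
    if force:
        params["force"] = True
    if xpath is not None:
        params["xpath"] = xpath
    # Equivalent of curl's -u option: HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Content-Type": "application/json", "Authorization": f"Basic {token}"}
    return Request(url, data=json.dumps(params).encode(), headers=headers, method="POST")


request = build_set_blob_length_request(
    "http://localhost:8080/nuxeo", "SELECT * FROM File",
    "Administrator", "Administrator", force=True, xpath="content",
)
print(request.full_url)
# To actually submit it: urllib.request.urlopen(request)
```

Leaving out `xpath` corresponds to processing all blobs of each matching document, as described above.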
      
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #45
    • Story Points:
      3

      Description

      NXP-30573 introduced a bulk action to set the content length according to S3 metadata.
      But there are still a few problems:

      • it doesn't work if the blob key already contains a blob provider prefix, and in that case it makes the blob inaccessible
      • custom listeners are invoked, which slows everything down
      • some properties, such as mixin:type, are updated
      • depending on the MIME type (if unknown or binary), a thumbnail may be generated

      To avoid this we need to update the document at a lower level.

      Also, we don't want to activate this fixup action for all instances; it should be enabled explicitly.
