  1. Nuxeo Platform
  2. NXP-27388

Record management - Turn off deduplication for records

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: BlobManager, Retention

      Description

      Context

      SEC 17a-4 (17 CFR § 240.17a-4 - Records to be preserved by certain exchange members, brokers and dealers) is a US regulation on records preservation.

      Its main areas are secure storage, retention management, change and deletion prevention, legal hold, and audit trail.

       

      Prerequisite

      To store record documents, we will use Amazon S3 Object Lock, with a bucket configured as follows:

      • Versioning turned on
      • Compliance mode turned on
      • No default retention in the bucket (or default retention as 0)

      cf. https://github.com/awsdocs/amazon-s3-developer-guide/blob/master/doc_source/object-lock-overview.md

      cf. https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lock.html
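
      Because the bucket has no default retention, each record blob would need its own retention settings at upload time. The sketch below only builds the request parameters. It assumes the upload goes through boto3's `put_object`, whose `ObjectLockMode` and `ObjectLockRetainUntilDate` parameters carry the per-object lock. The helper name, bucket name, and key are hypothetical, and this is not Nuxeo's actual implementation.

```python
from datetime import datetime, timezone


def locked_put_params(bucket, key, body, retain_until):
    """Build put_object parameters for a blob locked in compliance mode.

    Hypothetical helper: the timezone check is a choice made for this
    sketch, to keep the retain-until date unambiguous.
    """
    if retain_until.tzinfo is None:
        raise ValueError("retain_until must be timezone-aware")
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until,
    }


# Hypothetical bucket and key names, for illustration only.
params = locked_put_params(
    "records-bucket",
    "some-blob-key",
    b"record content",
    datetime(2030, 1, 1, tzinfo=timezone.utc),
)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("s3").put_object(**params)
```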

       

      User stories

      • As a broker-dealer, I want to guarantee that a record is deleted once an authorized user requests its deletion, so that I comply with the regulation

       

      Description

      Once a document becomes a record, we propose that Nuxeo no longer deduplicates its content, for the following reasons:

      • Preventing the same document from being recorded several times is the customer's responsibility.
      • Handling deduplication would require tracking the longest retention period among the different documents referring to the same content, and automatically updating the retain-until date on S3 accordingly.
      • With this logic, we could not guarantee the deletion of a record when several documents refer to the same content with different expiration times. This would be complex to explain for the certification and, later on, to our prospects and customers.
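
      The retention problem described above can be illustrated with a small sketch. The document names and dates are made up, and the helper is hypothetical; it only models the rule that a shared blob must stay locked until the latest retention date among the documents that reference it.

```python
from datetime import date


def shared_retain_until(retain_until_by_doc):
    """With deduplication, one blob backs every document with the same content,
    so its lock must last until the latest retention date among them."""
    return max(retain_until_by_doc.values())


# Hypothetical records pointing at the same content (same MD5 digest).
records = {
    "doc-A": date(2025, 1, 1),
    "doc-B": date(2030, 1, 1),
}

blob_locked_until = shared_retain_until(records)

# doc-A expires in 2025, but the shared blob stays locked until 2030,
# so deleting doc-A at its expiry cannot remove the underlying content.
can_delete_blob_with_doc_a = blob_locked_until <= records["doc-A"]
print(blob_locked_until, can_delete_blob_with_doc_a)
```

      Without deduplication, each record owns its blob, so the blob's retain-until date is simply the record's own.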

       

      Improvements:

      • Generate a blob UID based on the MD5 digest, the version ID (based on the version series), and a timestamp
      • Add a configuration option to turn deduplication on or off
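
      A minimal sketch of the two improvements above, under stated assumptions: the key layout (`digest-versionSeriesId-timestamp`), the function name, and the `deduplicate` flag are hypothetical and are not taken from the Nuxeo code base. A plain dictionary stands in for the S3 bucket.

```python
import hashlib
import time
from typing import Optional


def blob_key(content: bytes, version_series_id: str, deduplicate: bool,
             timestamp_ms: Optional[int] = None) -> str:
    """Return the storage key for a blob.

    With deduplication on, the key is the MD5 digest alone, so identical
    content maps to a single blob. With deduplication off, the version
    series ID and a timestamp are appended to make the key unique.
    """
    digest = hashlib.md5(content).hexdigest()
    if deduplicate:
        return digest
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    return f"{digest}-{version_series_id}-{timestamp_ms}"


content = b"same file content"

# Deduplication off: two documents with identical content get two blobs.
store = {}
store[blob_key(content, "series-1", deduplicate=False, timestamp_ms=1000)] = content
store[blob_key(content, "series-2", deduplicate=False, timestamp_ms=1000)] = content

# Deduplication on: both documents share a single blob.
dedup_store = {}
dedup_store[blob_key(content, "series-1", deduplicate=True)] = content
dedup_store[blob_key(content, "series-2", deduplicate=True)] = content

print(len(store), len(dedup_store))
```

      Including the version series ID means two documents created in the same millisecond with the same content still receive distinct keys.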

        

      Acceptance criteria

      • When I create 2 documents with the exact same content file (same MD5 digest), 2 different blobs are stored on S3

       

              People

              • Assignee: Julien Aubenque (jaubenque)
              • Reporter: Julien Aubenque (jaubenque)