  1. Nuxeo Platform
  2. NXP-32308

Fix Garbage Collection when default blob provider blob keys can be both un/prefixed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2021.35, 2023.7
    • Fix Version/s: 2021.50, 2023.8
    • Component/s: BlobManager, Importer
    • Release Notes Summary:
      Document Blob Garbage Collection always checks for both prefixed and unprefixed default provider blob keys.
    • Tags:
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #106
    • Story Points:
      3

      Description

      With a basic repository blob provider configuration, we only have one default blob provider. All blob key references persisted in the database backend consist of the digest only (e.g. "25625314fd5a92388b9de8abc786b1c0").

      Nevertheless, when importing documents using the CoreSession#importDocuments method, it is possible to persist the blob key with the blob provider prefix (e.g. "default:25625314fd5a92388b9de8abc786b1c0"), and everything works fine up to that point.

      However, this affects the new GC introduced by NXP-31594: when deleting a document referencing the blob "25625314fd5a92388b9de8abc786b1c0", the algorithm does not detect that another document references the same blob under the "default:25625314fd5a92388b9de8abc786b1c0" key, and the blob is deleted.
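
      The failure mode can be illustrated with a minimal sketch. This is not the Nuxeo GC code: the class and the isReferencedNaive helper are hypothetical, and the sketch only assumes that the reference check compares blob keys as exact strings.

```java
import java.util.Set;

public class NaiveBlobReferenceCheck {

    // Hypothetical helper: treats a blob as referenced only on an exact key match.
    static boolean isReferencedNaive(Set<String> referencedKeys, String key) {
        return referencedKeys.contains(key);
    }

    public static void main(String[] args) {
        // Keys still referenced by the remaining documents after the deletion.
        Set<String> remaining = Set.of("default:25625314fd5a92388b9de8abc786b1c0");
        // Key carried by the document being deleted.
        String deletedKey = "25625314fd5a92388b9de8abc786b1c0";
        // The exact match misses the prefixed reference, so the blob looks orphaned.
        System.out.println(isReferencedNaive(remaining, deletedKey)); // prints "false"
    }
}
```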

      See this unit test that highlights the issue.

      We should probably consider:

      • blob key sanitizing when creating documents through CoreSession#importDocuments
        and/or
      • evolving the Full GC algorithm to detect both forms of blob keys. (Maybe we can keep the current behavior available through a system property that tells the GC to be as fast as possible.)
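
      The second option could look roughly like the sketch below. It is not the fix that was committed: the class, the DEFAULT_PROVIDER_ID constant and the helper names are hypothetical. It also assumes that the default provider id is "default" and that any other key containing ':' belongs to a different provider.

```java
import java.util.List;
import java.util.Set;

public class BlobKeyForms {

    // Assumption: the default blob provider id is "default".
    static final String DEFAULT_PROVIDER_ID = "default";

    // Returns every form under which the same default-provider blob may be persisted.
    static List<String> candidateForms(String key) {
        String prefix = DEFAULT_PROVIDER_ID + ":";
        if (key.startsWith(prefix)) {
            // Prefixed form given: also look for the bare digest.
            return List.of(key.substring(prefix.length()), key);
        }
        if (key.contains(":")) {
            // Assumed to belong to another provider: no alternative form.
            return List.of(key);
        }
        // Bare digest given: also look for the prefixed form.
        return List.of(key, prefix + key);
    }

    // A blob is kept as soon as any of its forms is still referenced.
    static boolean isReferenced(Set<String> referencedKeys, String key) {
        return candidateForms(key).stream().anyMatch(referencedKeys::contains);
    }
}
```

      With this check, deleting a document that holds "25625314fd5a92388b9de8abc786b1c0" no longer orphans the blob while another document still holds "default:25625314fd5a92388b9de8abc786b1c0". The cost is one extra lookup per key, which is the overhead a "fast" switch would avoid.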

      EDIT

      If a deployment configured with a single blob provider went to production and was populated with data (blobs), and an additional blob provider was added later (with a blob dispatcher contribution), both prefixed and unprefixed blob references can exist in the repository, leading to the problem described above.

            Activity

            grenard Guillaume Renard added a comment - - edited

            Let's evolve GC algo to check both prefixed and unprefixed references for default provider

            hudson Jenkins added a comment -

            NOT_BUILT: Integrated in nuxeo » lts » nuxeo » 2021 #895
            NXP-32308: Fix GC when default provider blob keys are both (un)prefixed (guirenard: 2e30b090b01f17c28261b461716e5ce3fb2ffa53)

            hudson Jenkins added a comment -

            NOT_BUILT: Integrated in nuxeo » lts » nuxeo » 2023 #796
            NXP-32308: Fix GC when default provider blob keys are both (un)prefixed (guirenard: 815d3d4692d67a9897b58a1a3225e1ed886d30bb)

