- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
- Affects Version/s: 2021.35, 2023.7
- Component/s: BlobManager, Importer
- Release Notes Summary: Document Blob Garbage Collection always checks for both prefixed and unprefixed default provider blob keys.
- Tags:
- Team: PLATFORM
- Sprint: nxplatform #106
- Story Points: 3
With a basic repository blob provider configuration, there is only one blob provider, the default one. All blob key references persisted in the DB backend are formed with the digest only (e.g. "25625314fd5a92388b9de8abc786b1c0").
Nevertheless, when importing documents using the CoreSession#importDocuments method, it is possible to persist the blob key with the blob provider prefix (e.g. "default:25625314fd5a92388b9de8abc786b1c0"), and everything works fine so far.
However, this has an impact on the new GC introduced by NXP-31594. When deleting a document that references the blob "25625314fd5a92388b9de8abc786b1c0", the algorithm does not detect that another document references the same blob under the "default:25625314fd5a92388b9de8abc786b1c0" key, so the blob is deleted while it is still in use.
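To make the failure mode concrete, here is a minimal standalone model of it. It is not the actual GC code or the unit test mentioned in this ticket: the class name, the method and the in-memory set of references are assumptions made for illustration only. It only shows that an exact string comparison on keys misses the prefixed form.

```java
import java.util.Set;

// Hypothetical model, not Nuxeo code: the GC is reduced to an exact-match lookup.
public class NaiveBlobGcSketch {

    // Returns true only if the exact key string is still referenced by some document.
    public static boolean isReferenced(String blobKey, Set<String> referencedKeys) {
        return referencedKeys.contains(blobKey);
    }

    public static void main(String[] args) {
        String digest = "25625314fd5a92388b9de8abc786b1c0";
        // The only remaining document stores the prefixed form of the same blob key.
        Set<String> remainingReferences = Set.of("default:" + digest);
        // The lookup uses the bare digest, so it misses the prefixed reference.
        System.out.println(isReferenced(digest, remainingReferences)); // prints false
    }
}
```

In this model the blob would be considered unreferenced and eligible for deletion, although one document still points at it.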
See this unit test that highlights the issue.
We should probably consider one or both of the following:
- sanitizing blob keys when creating documents through CoreSession#importDocuments
- evolving the Full GC algorithm to detect both forms of blob keys (we could keep the current behavior available behind a system property, for cases where the GC must be as fast as possible)
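The first option could look roughly like the sketch below. It assumes the default provider id is "default" and that keys use a "providerId:digest" form, as in the examples above. The class and method names are hypothetical, and where such a call would be hooked into the import path is left open.

```java
// Hypothetical helper, not an existing Nuxeo class.
public class BlobKeySanitizerSketch {

    // Assumption: the default blob provider id is "default".
    static final String DEFAULT_PROVIDER_ID = "default";

    // Strips the default provider prefix so that the digest-only form is persisted.
    // Keys without a prefix, or prefixed with another provider id, are returned unchanged.
    public static String sanitize(String blobKey) {
        String prefix = DEFAULT_PROVIDER_ID + ":";
        if (blobKey != null && blobKey.startsWith(prefix)) {
            return blobKey.substring(prefix.length());
        }
        return blobKey;
    }
}
```

Sanitizing at import time only protects newly imported documents. References that are already persisted with the prefix would still need to be handled on the GC side.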
EDIT
If a deployment went to production configured with a single blob provider and was populated with data (blobs), and an additional blob provider was added later (with a blob dispatcher contribution), both prefixed and unprefixed blob references can exist in the repository, which leads to the problem described above.
Let's evolve the GC algorithm to check both prefixed and unprefixed references for the default provider.
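The check described here can be sketched as follows. This is an illustration under assumptions, not the implemented fix: the default provider id is taken to be "default", the references are modeled as an in-memory set, and the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical model of a GC check that accepts both key forms for the default provider.
public class DualFormBlobGcSketch {

    // Assumption: keys for the default provider may be stored as "default:<digest>".
    static final String DEFAULT_PREFIX = "default:";

    // A blob of the default provider is referenced if either the bare digest
    // or its prefixed form appears among the persisted references.
    public static boolean isReferenced(String digest, Set<String> referencedKeys) {
        return referencedKeys.contains(digest)
                || referencedKeys.contains(DEFAULT_PREFIX + digest);
    }

    // The blob may be deleted only when neither form is referenced.
    public static boolean canDelete(String digest, Set<String> referencedKeys) {
        return !isReferenced(digest, referencedKeys);
    }
}
```

The cost is a second lookup per candidate blob, which is the trade-off behind the idea of keeping the faster single-form behavior available as an option.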