- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: None
- Release Notes Summary: A new Full Garbage Collector is available to clean up orphaned document blobs and is exposed in the management REST API
- Release Notes Description:
- Epic Link:
- Team: PLATFORM
- Sprint: nxplatform #87
- Story Points: 5
The orphan binaries GC can take a long time (days) when the Nuxeo Platform stores a large number of binaries (millions) in its file storage.
Even though the orphan binaries GC is considered a maintenance operation, some users need to run it often, e.g. when the Nuxeo Platform is used as temporary storage and therefore does not need a large file storage.
Several improvements can be provided depending on the user's needs:
- split the initial list of digests retrieved from the database and process each part in its own thread,
- make the orphan binaries GC synchronous (see NXP-28523), which is interesting when the Nuxeo Platform is used as temporary storage,
- maintain a reverse index referencing the document(s) using each binary.
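The first improvement above (splitting the digest list and processing each part in its own thread) could look roughly like the following. This is a minimal sketch in plain Java, not Nuxeo code: the class `PartitionedDigestProcessor`, its methods, and the `action` callback are hypothetical names introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper, not part of the Nuxeo API.
class PartitionedDigestProcessor {

    /** Splits the list into at most {@code parts} contiguous slices of near-equal size. */
    static <T> List<List<T>> partition(List<T> items, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<List<T>> slices = new ArrayList<>();
        int size = items.size();
        int chunk = (size + parts - 1) / parts; // ceiling division
        for (int start = 0; start < size; start += chunk) {
            slices.add(items.subList(start, Math.min(size, start + chunk)));
        }
        return slices;
    }

    /** Applies {@code action} to every item, one slice per submitted task. */
    static <T> void processInParallel(List<T> items, int threads, Consumer<T> action) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<T> slice : partition(items, threads)) {
                futures.add(pool.submit(() -> slice.forEach(action)));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for the slice and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing digests", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A slice failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The `action` passed in must be thread-safe, since slices run concurrently. A failure in one slice is reported only after the preceding slices have been awaited.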
EDIT
With NXP-31737, we'll be able to scroll the blob stores of each provider of each repository. Adding a Bulk Action Framework (BAF) action on top of this that leverages the new APIs available since NXP-31594 will provide a scalable and resilient Full GC.
For the record, the `defaultConcurrency` and `defaultPartitions` of the BAF processor can be customized to speed up the Full GC process.
Note that instances populated with data prior to NXP-29516 will need NXP-30070 (i.e. the ecm:blobKeys capability).
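The idea behind the Full GC described above (scroll every blob key of a store, then delete the keys no document references) can be illustrated with a small sketch. It is plain Java under stated assumptions, not the actual Nuxeo implementation: `FullGcSketch`, `sweep`, `isReferenced` and `delete` are hypothetical names. In a real deployment the reference check would presumably rely on the `ecm:blobKeys` property mentioned above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration, not part of the Nuxeo API.
class FullGcSketch {

    /**
     * Walks all blob keys of a store and deletes those that no document references.
     *
     * @param blobKeys     iterator standing in for a scroll over a blob store
     * @param isReferenced stand-in for a lookup of documents holding the key
     * @param delete       stand-in for the blob store deletion call
     * @return the keys that were identified as orphans and passed to {@code delete}
     */
    static List<String> sweep(Iterator<String> blobKeys,
                              Predicate<String> isReferenced,
                              Consumer<String> delete) {
        List<String> orphans = new ArrayList<>();
        while (blobKeys.hasNext()) {
            String key = blobKeys.next();
            if (!isReferenced.test(key)) {
                orphans.add(key);
                delete.accept(key);
            }
        }
        return orphans;
    }
}
```

Because each key is checked independently, the scroll can be cut into partitions and processed concurrently, which is where settings such as `defaultConcurrency` and `defaultPartitions` come into play.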
- causes
  - NXP-31871 Rest endpoints now return 501 HTTP status code (Not Implemented) on UnsupportedOperationException (Resolved)
- depends on
  - NXP-30070 Migration to new denormalized ecm:blobKeys (Resolved)
- is related to
  - NXDOC-2565 Document new Orphaned Blobs GC endpoint (Resolved)
  - NXP-28523 Make it possible to delete the binary as soon as the associated document(s) are permanently deleted (Resolved)
  - NXP-30975 Don't swallow exception when transaction timeout in GC (Resolved)
  - NXP-31630 Garbage Collector on Demand (Resolved)
  - NXP-32112 Fix BulkStatus result map merge overflow on numbers (Resolved)
  - NXP-32204 Provide a migration to re-apply custom blob dispatcher rules (Open)
  - NXP-31594 Clean up orphan binaries after document removal, blob property edition and dispatch (Resolved)
  - NXP-28679 Improve blob GC to do cleanup for selected blobs (Resolved)
  - NXP-31876 Make LocalBlobProvider the default blob provider implementation (Resolved)
  - NXP-31904 Provide better default partitions for bulk actions so they can scale (Resolved)
  - NXP-32302 Add Full GC unit test on S3 configured with a sub directory depth (Resolved)
- is required by
  - NXP-31805 Orphan file deletion in Nuxeo (Open)