Affects Version/s: 10.10
The current algorithm performs the following steps:
- it runs a query to get all the children of a given section (excluding hidden documents and publish spaces)
- all the DocumentModels matching that query are loaded into memory: if a section contains 10,000 or more published documents, this can consume all of the available heap with as few as 3 or 4 concurrent threads
- it loops over all the (published) documents to find which ones are a published version of a given document
- the matching documents are then deleted/unpublished
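To make the cost of these steps concrete, here is a minimal in-memory sketch of the same shape of algorithm. It is not the actual implementation: the `Proxy` class is a hypothetical stand-in for a published DocumentModel, and `findPublished` is an illustrative name. The point is only that every non-hidden child is materialized before any filtering on the source document happens.

```java
import java.util.ArrayList;
import java.util.List;

final class CurrentUnpublishSketch {

    /** Hypothetical stand-in for a published document (proxy) in a section. */
    static final class Proxy {
        final String id;
        final String versionableId; // id of the source document that was published
        final boolean hidden;

        Proxy(String id, String versionableId, boolean hidden) {
            this.id = id;
            this.versionableId = versionableId;
            this.hidden = hidden;
        }
    }

    /** Mirrors the steps above: load all visible children, then filter in memory. */
    static List<Proxy> findPublished(List<Proxy> sectionChildren, String sourceId) {
        // Steps 1 and 2: every non-hidden child of the section ends up in memory.
        List<Proxy> loaded = new ArrayList<>();
        for (Proxy p : sectionChildren) {
            if (!p.hidden) {
                loaded.add(p);
            }
        }
        // Step 3: linear scan to find the proxies published from sourceId.
        List<Proxy> matches = new ArrayList<>();
        for (Proxy p : loaded) {
            if (sourceId.equals(p.versionableId)) {
                matches.add(p);
            }
        }
        // Step 4 (delete/unpublish) would operate on the returned matches.
        return matches;
    }
}
```

In this model the memory held is proportional to the size of the section, not to the number of matches, and it is multiplied by the number of threads running the operation concurrently.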
Iterating over all the children of a section could be avoided by leveraging an existing property, ecm:proxyVersionableId, which stores the id of the source document which was published to this section.
Then a query like
would be more efficient and would load only the required DocumentModels into memory.
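As a rough illustration of the kind of query meant here, the sketch below builds an NXQL string that restricts the result to proxies under one section whose `ecm:proxyVersionableId` matches the source document. This is an assumption, not the query from the original report: the class and method names are hypothetical, the extra `ecm:isProxy` and `ecm:parentId` conditions are one plausible way to scope the search to a section, and the quoting helper is a simplified stand-in for a proper escaping utility.

```java
final class PublishedProxyQuery {

    private PublishedProxyQuery() {
    }

    /** Wraps a value as an NXQL string literal, escaping backslashes and single quotes. */
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /**
     * Builds a query selecting only the proxies in the given section that were
     * published from the given source document (both ids are assumed inputs).
     */
    static String build(String sectionId, String sourceDocId) {
        return "SELECT * FROM Document"
                + " WHERE ecm:isProxy = 1"
                + " AND ecm:parentId = " + quote(sectionId)
                + " AND ecm:proxyVersionableId = " + quote(sourceDocId);
    }

    public static void main(String[] args) {
        // Example ids are placeholders, not real repository ids.
        System.out.println(build("section-id", "source-doc-id"));
    }
}
```

With a query of this shape the filtering happens in the repository, so only the matching proxies need to be loaded before being deleted/unpublished.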
NB: it is not yet clear which property is the better choice here, ecm:proxyVersionableId or ecm:versionVersionableId.