  1. Nuxeo Platform
  2. NXP-31276

Create a Management API to extract binary fulltext

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.10-HF71, 2023.0, 2021.33
    • Component/s: Bulk
    • Release Notes Summary:
      There is a new Management endpoint to run binary fulltext extraction
    • Sprint:
      nxplatform #80
    • Story Points:
      3

      Description

      Currently, if a customer has fulltext extraction disabled and later decides to enable it, there is no way to run fulltext extraction on documents (blobs) already in the database. This ticket requests a fulltext bulk action to allow this.

      This ticket introduces a new Management API endpoint. See the documentation page for more information:
      https://doc.nuxeo.com/rest-api/1/fulltext-endpoint/
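
      As a rough illustration of how a client might trigger the extraction, the sketch below builds (without sending) a POST request. The path `/api/v1/management/fulltext/extract`, the `query` and `force` form parameters, and the local base URL are assumptions made for this example and are not confirmed by this ticket; the documentation page above is the reference for the actual contract. A real call would also need administrator credentials.

      ```python
      from urllib.parse import urlencode
      from urllib.request import Request


      def build_extract_request(base_url, query=None, force=False):
          """Build (but do not send) a POST request asking for fulltext extraction.

          ASSUMPTION: the endpoint path and the `query` / `force` parameter
          names are illustrative only; check the official documentation.
          """
          url = base_url.rstrip("/") + "/api/v1/management/fulltext/extract"
          params = {"force": "true" if force else "false"}
          if query is not None:
              # Hypothetical NXQL filter restricting which documents are processed.
              params["query"] = query
          body = urlencode(params).encode("utf-8")
          return Request(
              url,
              data=body,
              method="POST",
              headers={"Content-Type": "application/x-www-form-urlencoded"},
          )


      # Hypothetical local server URL, used only to show the resulting request.
      req = build_extract_request(
          "http://localhost:8080/nuxeo",
          query="SELECT * FROM Document WHERE ecm:primaryType = 'File'",
          force=True,
      )
      print(req.get_method(), req.full_url)
      ```

      Building the request separately from sending it keeps the sketch free of network side effects; passing the object to `urllib.request.urlopen` with suitable authentication would perform the call.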

      Note that there used to be a plugin for this, https://github.com/nuxeo/nuxeo-reindex-fulltext/tree/master (the repository is now marked as deprecated),
      but it would be much better to create a bulk action that simply runs the FulltextExtractorWork inside the computation thread.

            Activity

            bdelbosc Benoit Delbosc added a comment - - edited

            A force option is available to nullify the existing binary fulltext on document types that have been excluded by a new fulltext configuration.

            Indexing is done through the WorkManager in an optimized way, with one indexing work per transaction (the bulk command batch size, 10 by default). Related proxies are also re-indexed.
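
            The batching described above can be modelled with a small sketch. This is not Nuxeo code: `batches`, `plan_indexing_works`, and the dictionary layout are hypothetical names chosen for illustration. It only shows how a stream of document ids splits into groups of 10, each group standing for one transaction followed by a single indexing work.

            ```python
            from itertools import islice


            def batches(doc_ids, batch_size=10):
                """Yield successive lists of at most batch_size document ids."""
                it = iter(doc_ids)
                while True:
                    batch = list(islice(it, batch_size))
                    if not batch:
                        return
                    yield batch


            def plan_indexing_works(doc_ids, batch_size=10):
                """Return one hypothetical indexing-work descriptor per batch.

                Each descriptor stands for a single transaction: the documents in
                the batch are extracted together, then indexed by one work.
                """
                return [
                    {"transaction": n, "doc_ids": batch}
                    for n, batch in enumerate(batches(doc_ids, batch_size), start=1)
                ]


            works = plan_indexing_works([f"doc-{i}" for i in range(25)])
            print([len(w["doc_ids"]) for w in works])  # prints [10, 10, 5]
            ```

            With 25 documents and the default size of 10, the model yields three indexing works instead of 25, which is the saving the per-transaction grouping is meant to provide.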

            The expected execution trace shows that extraction is batched inside a transaction and that indexing is batched as well.

            hudson Jenkins added a comment -

            SUCCESS: Integrated in nuxeo » lts » nuxeo » 2023 #282
            NXP-31276: Bulk action to extract binary fulltext (bdelbosc: 364a9e4d620ed5bf1cb13ca9cdfb56621286816d)

            hudson Jenkins added a comment -

            SUCCESS: Integrated in nuxeo » lts » nuxeo » 2021 #576
            NXP-31276: Bulk action to extract binary fulltext (bdelbosc: fe37eb49392f86aa9aa2bddd73fad87074b649e1)


              People

              • Votes:
                2
              • Watchers:
                6
