  1. Nuxeo Platform
  2. NXP-30473

Ability to use a Blacklist to control when Binary Searchable Text Extraction occurs

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Binary Metadata

      Description

      The Binary Searchable Text Extraction is a sub-process of the overall Full Text Extraction process that is initiated during data import. 

       

      A blacklist/whitelist already exists for the overall Full Text Extraction process, keyed on DocType or Facet values, but there is none specifically for the Binary Searchable Text Extraction sub-process.

       

      Goal: have a document full-text indexed (metadata, technical data, etc.) but NOT have its Binary Searchable Text extracted, for specific documents selected by Nuxeo DocType or Facet.

       

      Purpose: increase import speed, because an additional process (Binary Searchable Text Extraction) is not performed, and reduce the space used in the database and Elasticsearch, since the Binary Searchable Text is never gathered or stored for the specified Nuxeo DocTypes/Facet values.

       

      AC:

      Be able to provide a configured blacklist that allows Full Text Extraction but NOT Binary Searchable Text Extraction for the specified list of Nuxeo DocTypes or Facet values.
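
      To make the acceptance criterion concrete, here is a minimal sketch of the decision logic it implies. It is written in Python for brevity (Nuxeo itself is Java) and is not an existing Nuxeo API: the `ExtractionPolicy` class, its field names, and the sample DocType/Facet values are all hypothetical. It assumes the new blacklist is evaluated in addition to the existing full-text blacklist, and that binary extraction never runs for a document that is excluded from full-text indexing altogether.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ExtractionPolicy:
    """Hypothetical policy object; names are illustrative, not Nuxeo APIs."""

    # Existing behaviour: excluded from ALL full-text indexing.
    fulltext_excluded_types: FrozenSet[str] = frozenset()
    fulltext_excluded_facets: FrozenSet[str] = frozenset()
    # Requested behaviour: excluded from binary text extraction only.
    binary_excluded_types: FrozenSet[str] = frozenset()
    binary_excluded_facets: FrozenSet[str] = frozenset()

    def should_index_fulltext(self, doc_type: str, facets: Iterable[str]) -> bool:
        """True if metadata and other properties should be full-text indexed."""
        facet_set = frozenset(facets)
        return (
            doc_type not in self.fulltext_excluded_types
            and self.fulltext_excluded_facets.isdisjoint(facet_set)
        )

    def should_extract_binary_text(self, doc_type: str, facets: Iterable[str]) -> bool:
        """True if the binary (blob) text extraction sub-process should run."""
        facet_set = frozenset(facets)
        # Binary extraction is a sub-process of full-text extraction,
        # so it is skipped whenever full-text indexing is skipped.
        if not self.should_index_fulltext(doc_type, facet_set):
            return False
        return (
            doc_type not in self.binary_excluded_types
            and self.binary_excluded_facets.isdisjoint(facet_set)
        )


# Sample configuration; the type and facet names are placeholders.
policy = ExtractionPolicy(
    binary_excluded_types=frozenset({"Video"}),
    binary_excluded_facets=frozenset({"Picture"}),
)
```

      With this sample configuration, a `Video` document would still be full-text indexed on its metadata, but its binary content would not be sent through text extraction; the same holds for any document carrying the `Picture` facet.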

       

       

       

            People

            • Assignee: Unassigned
            • Reporter: rrowles Randy Rowles