  1. Nuxeo Platform
  2. NXP-28997

Fulltext extractor should not NPE on missing blob

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.10-HF25
    • Fix Version/s: None
    • Component/s: BlobManager
    • Environment:
      MongoDB + Minio (S3-compliant) + Elasticsearch (not embedded)
    • Team:
      PLATFORM

      Description

      On a Nuxeo instance using S3 and MongoDB, if I create a BlobInfo from scratch and attach it to a document while the binary is not yet stored in the binary manager, the fulltext extractor starts and throws a NullPointerException.

      Expected behaviour: if there is no blob, it should simply be ignored in FulltextExtractorWork.blobToText.
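
      The expected behaviour could be sketched roughly as follows. This is only an illustration of the null-guard idea, not the actual Nuxeo code: BlobLike, blobOf and MissingBlobGuard are hypothetical stand-ins, and the assumption that a missing binary shows up as a null stream is mine, not something confirmed by the logs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class MissingBlobGuard {

    /** Hypothetical stand-in for a managed blob; NOT the Nuxeo Blob interface. */
    public interface BlobLike {
        String getFilename();

        /** Assumption: returns null when the binary is not present in the store. */
        InputStream getStream() throws IOException;
    }

    /** Hypothetical helper: builds a blob whose binary is missing when content is null. */
    public static BlobLike blobOf(String filename, byte[] content) {
        return new BlobLike() {
            @Override
            public String getFilename() {
                return filename;
            }

            @Override
            public InputStream getStream() {
                return content == null ? null : new ByteArrayInputStream(content);
            }
        };
    }

    /** Returns the blob's text, or empty when the blob or its binary is missing. */
    public static Optional<String> blobToText(BlobLike blob) {
        if (blob == null) {
            return Optional.empty(); // no blob attached: nothing to extract
        }
        // try-with-resources accepts a null resource and skips close() for it
        try (InputStream in = blob.getStream()) {
            if (in == null) {
                return Optional.empty(); // binary not stored yet: ignore instead of failing
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return Optional.of(new String(out.toByteArray(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            return Optional.empty(); // unreadable binary is treated like a missing one
        }
    }

    public static void main(String[] args) {
        System.out.println(blobToText(blobOf("my-pdf-0.pdf", null)));
        System.out.println(blobToText(blobOf("note.txt", "hello".getBytes(StandardCharsets.UTF_8))));
    }
}
```

      Returning an empty result rather than throwing lets the caller skip the blob and continue with the remaining ones, which is the behaviour requested here.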

      See the attached logs.
      For reference, the blob provider in use is:

        <extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
          <blobprovider name="default">
            <class>org.nuxeo.ecm.core.storage.sql.S3BinaryManager</class>
      

      As a workaround, change the default binary manager in nuxeo.conf:

      nuxeo.core.binarymanager=org.nuxeo.ecm.core.blob.LocalBlobProvider
      

      With this setting, the error message is lighter:

      2020-04-24T17:07:54,807 ERROR [Nuxeo-Work-default-3:45529565829287.1037730276] [org.nuxeo.ecm.core.blob.ManagedBlob] Failed to access file: default:543e04e2af1a42cdfed52ba8ab614956
      2020-04-24T17:07:54,807 ERROR [Nuxeo-Work-default-2:45529565872414.443371692] [org.nuxeo.ecm.core.blob.ManagedBlob] Failed to access file: default:f8b816d0fc23890a05ce543d85e6b695
      2020-04-24T17:07:54,807 ERROR [Nuxeo-Work-default-4:45529565821522.1831279274] [org.nuxeo.ecm.core.blob.ManagedBlob] Failed to access file: default:e2c2ebe69c294cda688aaec2b01acfba
      2020-04-24T17:07:54,856 WARN  [Nuxeo-Work-default-4:45529565821522.1831279274] [org.nuxeo.ecm.core.storage.FulltextExtractorWork] Could not extract fulltext of file 'my-pdf-0.pdf' for document: f8b4d96f-4364-4fae-b2b2-25a413083a20: org.nuxeo.ecm.core.convert.api.ConversionException: Error while converting via CommandLineService
      2020-04-24T17:07:54,856 WARN  [Nuxeo-Work-default-2:45529565872414.443371692] [org.nuxeo.ecm.core.storage.FulltextExtractorWork] Could not extract fulltext of file 'my-pdf-1.pdf' for document: 78d3c709-2bbb-434e-a46c-203d7d8db308: org.nuxeo.ecm.core.convert.api.ConversionException: Error while converting via CommandLineService
      2020-04-24T17:07:54,856 WARN  [Nuxeo-Work-default-3:45529565829287.1037730276] [org.nuxeo.ecm.core.storage.FulltextExtractorWork] Could not extract fulltext of file 'my-pdf-2.pdf' for document: bb659c17-c9a9-4466-96d6-afe2a8984b8f: org.nuxeo.ecm.core.convert.api.ConversionException: Error while converting via CommandLineService
      
