Nuxeo Platform / NXP-30543

Have an option to prevent blob fetching during indexing

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 10.10
    • Fix Version/s: None
    • Component/s: Elasticsearch
    • Upgrade notes:
      Note that the number of partitions is only taken into account when creating the Kafka topic.
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #42, nxplatform next
    • Story Points:
      3

      Description

      To keep indexing fast, we don't want to access the binary store when building the Elasticsearch JSON representation of a document.
      This is normally the case, except when blob metadata (such as the blob length) is missing, which is an existing use case with custom mass imports.

      When this happens, the binary store becomes a bottleneck that limits indexing throughput.
      For instance, with an S3 binary store we see the following stacks:

      "indexPool-05,in:50,inCheckpoint:50,out:314,lastRead:1628196554016,lastTimer:0,wm:213410977800650753,loop:112368,checkpoint" #175 prio=5 os_prio=0 tid=0x00007fbc42598800 nid=0x1a3 waiting on condition [0x00007fbb09ddb000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00000006fec02a80> (a java.util.concurrent.FutureTask)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:191)
      	at com.amazonaws.services.s3.transfer.internal.AbstractTransfer.waitForCompletion(AbstractTransfer.java:100)
      	at org.nuxeo.ecm.core.storage.sql.S3BinaryManager$S3FileStorage.fetchFile(S3BinaryManager.java:722)
      	at org.nuxeo.ecm.core.blob.binary.CachingBinaryManager.getFile(CachingBinaryManager.java:182)
      	at org.nuxeo.ecm.core.blob.binary.LazyBinary.getFile(LazyBinary.java:67)
      	at org.nuxeo.ecm.core.blob.binary.BinaryBlobProvider.readBlob(BinaryBlobProvider.java:111)
      	at org.nuxeo.ecm.blob.AbstractCloudBinaryManager.readBlob(AbstractCloudBinaryManager.java:140)
      	at org.nuxeo.ecm.core.blob.BlobProvider.readBlob(BlobProvider.java:118)
      	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.readBlob(DocumentBlobManagerComponent.java:147)
      	at org.nuxeo.ecm.core.blob.DocumentBlobManagerComponent.readBlob(DocumentBlobManagerComponent.java:134)
      	at org.nuxeo.ecm.core.storage.BaseDocument.getValueBlob(BaseDocument.java:518)
      	at org.nuxeo.ecm.core.storage.BaseDocument.readComplexProperty(BaseDocument.java:719)
      	at org.nuxeo.ecm.core.storage.BaseDocument.readComplexProperty(BaseDocument.java:735)
      	at org.nuxeo.ecm.core.storage.BaseDocument.readComplexProperty(BaseDocument.java:709)
      	at org.nuxeo.ecm.core.storage.dbs.DBSDocument.readDocumentPart(DBSDocument.java:1069)
      	at org.nuxeo.ecm.core.api.DocumentModelFactory.createDataModel(DocumentModelFactory.java:198)
      	at org.nuxeo.ecm.core.api.AbstractSession.getDataModel(AbstractSession.java:2040)
      	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.loadDataModel(DocumentModelImpl.java:439)
      	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getDataModel(DocumentModelImpl.java:449)
      	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getPart(DocumentModelImpl.java:1215)
      	at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getPropertyObjects(DocumentModelImpl.java:1241)
      	at org.nuxeo.elasticsearch.io.JsonESDocumentWriter.writeProperties(JsonESDocumentWriter.java:203)
      	at org.nuxeo.elasticsearch.io.JsonESDocumentWriter.writeSchemas(JsonESDocumentWriter.java:175)
      	at org.nuxeo.elasticsearch.io.JsonESDocumentWriter.writeESDocument(JsonESDocumentWriter.java:195)
      	at org.nuxeo.elasticsearch.core.ElasticSearchIndexingImpl.source(ElasticSearchIndexingImpl.java:433)
      	at org.nuxeo.elasticsearch.ElasticSearchComponent.source(ElasticSearchComponent.java:514)
      	at org.nuxeo.elasticsearch.bulk.IndexRequestComputation.compute(IndexRequestComputation.java:92)
      ....
      "s3-transfer-manager-worker-9" #324 prio=5 os_prio=0 tid=0x00007fbbe001d000 nid=0x237 runnable [0x00007fbaff778000]
         java.lang.Thread.State: RUNNABLE
      	at java.net.SocketInputStream.socketRead0(Native Method)
      	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
      	at java.net.SocketInputStream.read(SocketInputStream.java:171)
      	at java.net.SocketInputStream.read(SocketInputStream.java:141)
      	at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457)
      	at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
      	at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1332)
      ....
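
      A minimal sketch of the kind of option requested here, under stated assumptions: the names `BlobInfo`, `allowBlobFetch` and `toIndexFields` are hypothetical and are not part of the Nuxeo API, and the `LongSupplier` only stands in for the expensive binary store access seen in the stacks above. The idea is that when the stored metadata lacks the blob length, the indexing side either fetches the blob (current behavior) or leaves the field out, depending on a flag.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

public class BlobFetchGuardSketch {

    /** Hypothetical stand-in for the blob metadata stored with a document. */
    static final class BlobInfo {
        final String digest;
        final Long length; // null when an import did not record the length

        BlobInfo(String digest, Long length) {
            this.digest = digest;
            this.length = length;
        }
    }

    /**
     * Builds the blob fields of the indexed representation as a map.
     * fetchLength represents the slow path: reading the blob from the
     * binary store to compute its length.
     */
    static Map<String, Object> toIndexFields(BlobInfo info, boolean allowBlobFetch,
            LongSupplier fetchLength) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("digest", info.digest);
        if (info.length != null) {
            // Fast path: metadata is complete, no binary store access.
            fields.put("length", info.length);
        } else if (allowBlobFetch) {
            // Slow path: metadata is missing and fetching is allowed.
            fields.put("length", fetchLength.getAsLong());
        }
        // Otherwise the length is simply left out of the indexed fields.
        return fields;
    }

    public static void main(String[] args) {
        int[] fetches = {0};
        LongSupplier slowStore = () -> {
            fetches[0]++; // count how often the store would be hit
            return 1024L;
        };
        BlobInfo incomplete = new BlobInfo("abc123", null);

        System.out.println(toIndexFields(incomplete, false, slowStore));
        System.out.println(toIndexFields(incomplete, true, slowStore));
        System.out.println("store accesses: " + fetches[0]);
    }
}
```

      With the flag off, the indexed document would lack the blob length for incompletely imported documents, which is the trade-off such an option would have to accept.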
      

              People

              • Assignee: Benoit Delbosc (bdelbosc)
              • Reporter: Benoit Delbosc (bdelbosc)
              • Votes: 0
              • Watchers: 2