Nuxeo Platform / NXP-30368

Take into account fulltext maxSize when computing text in FulltextExtractorWork

    Details

      Description

      In some cases, fulltext extraction fails with an out-of-memory error (OOM) in this part of the code:

                  List<String> strings = new ArrayList<>();
                  for (Blob blob : blobsExtractor.getBlobs(document)) {
                      String string = blobsText.computeIfAbsent(blob, this::blobToText);
                      strings.add(string);
                  }
                  // add space at beginning and end for simulated phrase search using LIKE "% foo bar %"
                  String text = " " + String.join(" ", strings) + " ";
      

      Immediately after this computation, the text is truncated to maxSize so that the value can be stored in the DB.

      To avoid the OOM, we should take maxSize into account while computing this field, instead of building the complete text first and truncating it afterwards.
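
      A minimal sketch of the idea, not the actual fix: the text is accumulated in a StringBuilder that never grows past maxSize, and the remaining sources are not converted once the budget is used up. The class BoundedFulltext, the method joinBounded and the generic toText function are hypothetical stand-ins for FulltextExtractorWork, the loop above and this::blobToText. Treating maxSize <= 0 as "no limit" is also an assumption. Note that each individual conversion still produces its whole string, so a single very large blob would need a bound inside the conversion itself.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, for illustration only (not part of Nuxeo).
final class BoundedFulltext {

    private BoundedFulltext() {
    }

    /**
     * Builds " " + join(" ", texts) + " " but never keeps more than maxSize characters,
     * and stops converting sources as soon as the budget is exhausted.
     * Assumption: maxSize <= 0 means "no limit".
     */
    static <T> String joinBounded(List<T> sources, Function<T, String> toText, int maxSize) {
        StringBuilder sb = new StringBuilder();
        // leading space, as in the original code (simulated phrase search with LIKE)
        append(sb, " ", maxSize);
        boolean first = true;
        for (T source : sources) {
            if (!first) {
                append(sb, " ", maxSize); // separator between two texts
            }
            first = false;
            if (isFull(sb, maxSize)) {
                break; // remaining sources are never converted to text
            }
            append(sb, toText.apply(source), maxSize);
        }
        // trailing space, dropped automatically if there is no room left
        append(sb, " ", maxSize);
        return sb.toString();
    }

    private static boolean isFull(StringBuilder sb, int maxSize) {
        return maxSize > 0 && sb.length() >= maxSize;
    }

    // Appends only the part of s that still fits in the budget.
    private static void append(StringBuilder sb, String s, int maxSize) {
        if (s == null) {
            return; // assumption: a failed extraction contributes nothing
        }
        int room = maxSize > 0 ? Math.max(maxSize - sb.length(), 0) : Integer.MAX_VALUE;
        sb.append(s, 0, Math.min(s.length(), room));
    }
}
```

      With no limit, joinBounded(List.of("foo", "bar"), s -> s, 0) gives the same " foo bar " as the current code. With a limit, the result is the same prefix that the existing truncation would keep, but the discarded tail is never built.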
