  1. Nuxeo Platform
  2. NXP-30326

Fix long-running thumbnail generation for large office files

    Details

    • Release Notes Summary:
      LibreOffice is used to render Office file thumbnails.
    • Backlog priority:
      700
    • Sprint:
      nxplatform #36, nxplatform #37
    • Story Points:
      2

      Description

      When importing large Office documents, the thumbnail generation process consists of two steps:
      1. convert the office document to a PDF
      2. generate an image from the first page of the PDF

      This can be cumbersome for large files: a long time is spent converting the complete file to PDF just to extract the first page and convert it to an image. The conversion can even reach the transaction timeout.

      This process could be improved by, at least, extracting only the first page in the first step.

      The soffice command line allows exactly that with the following command:

      soffice --headless --norestore --writer --convert-to png large-test-ppt.ppt --outdir /tmp
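
      The command above can be wrapped in a small script for testing outside Nuxeo. This is a minimal sketch, not the plugin's implementation: the function names build_thumbnail_command and render_first_page, and the timeout_s parameter, are hypothetical. It assumes soffice is on the PATH and that soffice names the output file after the input file, with a .png extension, in the given output directory.

```python
import shutil
import subprocess
from pathlib import Path


def build_thumbnail_command(source, outdir, soffice="soffice"):
    """Return the argument list for the soffice command shown in this issue."""
    return [
        soffice,
        "--headless",
        "--norestore",
        "--writer",
        "--convert-to", "png",
        str(source),
        "--outdir", str(outdir),
    ]


def render_first_page(source, outdir, timeout_s=60):
    """Run soffice and return the path where the PNG is expected.

    timeout_s is a hypothetical safeguard so that a slow conversion fails
    fast instead of running into a transaction timeout.
    """
    if shutil.which("soffice") is None:
        raise FileNotFoundError("soffice not found on PATH")
    cmd = build_thumbnail_command(source, outdir)
    # check=True raises CalledProcessError on a non-zero exit code;
    # timeout raises TimeoutExpired if soffice runs too long.
    subprocess.run(cmd, check=True, timeout=timeout_s)
    # Assumption: the output is named after the input file's stem.
    return Path(outdir) / (Path(source).stem + ".png")
```

      Calling render_first_page("large-test-ppt.ppt", "/tmp") would run the same command as above and return the expected path of the PNG.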
      

      I built a plugin that replaces the existing anyToThumbnail with a new implementation, which runs an alternate conversion chain that directly extracts an image of the first page without converting the whole document to PDF. Few changes are needed. See my plugin here: https://github.com/vdutat/nuxeo-soffice-thumbnail-converters.
