Uploaded image for project: 'Nuxeo Platform'
  1. Nuxeo Platform
  2. NXP-29766

Inter-process interaction leads to launcher spinning on Windows

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.10
    • Fix Version/s: HOTFIX_10.10, 11.x
    • Component/s: Launcher
    • Environment:
      Windows only
    • Tags:
    • Backlog priority:
      100

      Description

      The Launcher and the Bootstrap processes are strongly interacting in a writer-reader model.
      By default, the logs files suveyed by the Launcher are rotated at midnight by Log4J2 of the Bootstrap process.
      However, on Windows, the Launcher holds a handle on the file. and this provokes the following messages:

      020-10-10 00:00:00,128 Log4j2-TF-3-Scheduled-5 ERROR Unable to delete file C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: java.nio.file.FileSystemException C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: Le processus ne peut pas acc├®der au fichier car ce fichier est utilis├® par un autre processus.
      
      2020-10-10 00:00:00,128 Log4j2-TF-3-Scheduled-5 ERROR Unable to overwrite file C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: java.io.FileNotFoundException C:\CASES\nuxeo-server-10.10-tomcat\log\server.log (Le processus ne peut pas acc├®der au fichier car ce fichier est utilis├® par un autre processus)
      2020-10-10 00:00:00,143 Log4j2-TF-4-Scheduled-2 ERROR Unable to delete file C:\CASES\nuxeo-server-10.10-tomcat\log\console.log: java.nio.file.FileSystemException C:\CASES\nuxeo-server-10.10-tomcat\log\console.log: Le processus ne peut pas accéder au fichier car ce fichier est utilisé par un autre processus.
      
      2020-10-10 00:00:00,143 Log4j2-TF-7-Scheduled-4 ERROR Unable to delete file C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: java.nio.file.FileSystemException C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: Le processus ne peut pas accéder au fichier car ce fichier est utilisé par un autre processus.
      

      From this point on, the Launcher process starts spinning on:

      "NuxeoLauncherGUITask-1" #43 daemon prio=6 os_prio=0 cpu=46078093.75ms elapsed=89687.83s tid=0x000001e018c29000 nid=0x237c runnable [0x000000cdc91ff000]
      java.lang.Thread.State: RUNNABLE
      at java.io.FileInputStream.open0(java.base@11.0.7/Native Method)
      at java.io.FileInputStream.open(java.base@11.0.7/FileInputStream.java:219)
      at java.io.FileInputStream.<init>(java.base@11.0.7/FileInputStream.java:157)
      at java.io.FileReader.<init>(java.base@11.0.7/FileReader.java:75)
      at org.nuxeo.launcher.gui.logs.LogsSource.run(LogsSource.java:86)
      at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7/ThreadPoolExecutor.java:1128)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7/ThreadPoolExecutor.java:628)
      at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
      

      where the 46000 seconds CPU consumption is gotten at about 12hours and 50minutes after midnight, date of the preceding messages

      In https://github.com/nuxeo/nuxeo/blob/10.10/nuxeo-distribution/nuxeo-launcher/src/main/java/org/nuxeo/launcher/gui/logs/LogsSource.java#L86 , the FileReader call happens in a while(true) loop leading to questioning the Nuxeo implementation when on Windows.

      It can eventually happen after several days that the spinning thread dies out of an unhandled exception, thus leading to an apparent resolution, while in fact there is then a loss in functionality:

      2020-10-11 00:00:00,237 Log4j2-TF-3-Scheduled-1 ERROR Unable to delete file C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: java.nio.file.FileSystemException C:\CASES\nuxeo-server-10.10-tomcat\log\server.log: Le processus ne peut pas acc├®der au fichier car ce fichier est utilis├® par un autre processus.
      
      java.io.FileNotFoundException: C:\CASES\nuxeo-server-10.10-tomcat\log\console.log (Le fichier spÚcifiÚ est introuvable)
      

      (the spinning thread then disappears from the thread dump)
      This tragic outcome might be due to no messages having been logged for an entire day.

      A possible workaround is to change in lib/log4j2.xml e.g. for server.log

          <RollingFile name="FILE" fileName="${sys:nuxeo.log.dir}/server.log"
                       filePattern="${sys:nuxeo.log.dir}/server-%d{yyyy-MM-dd}.log.gz" append="true">
            <PatternLayout pattern="%d{ISO8601} %-5p [%t] [%c] %m%n" />
            <CronTriggeringPolicy schedule="0 0 0 * * ?" evaluateOnStartup="true" /> <!-- Rollover at midnight every day -->
            <DefaultRolloverStrategy />
          </RollingFile>
      

      by

          <RollingFile name="FILE" append="true" 
                    fileName="${sys:nuxeo.log.dir}/server.log" filePattern="${sys:nuxeo.log.dir}/server-%d{yyyy-MM-dd-hh-mm}.log.gz" >
             <PatternLayout pattern="%d{ISO8601} %-5p [%t] [%c] %m%n" />
             <Policies>
               <OnStartupTriggeringPolicy minSize="0"/>
             </Policies>      
          </RollingFile>
      

      which rotates the log file only on startup of the JVM.

      How to reproduce

      • start a Nuxeo 10.10 (eventually hotfixed) on Windows using nuxeoctl console and let it run without using Nuxeo
      • take a thread dump every day and watch for CPU consumption of the Java processes

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              pabgrall Patrick Abgrall
              Participants:
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: