Nuxeo ECM Build/Test Environment / NXBT-2599

Repair Docker swarm and Consul (defunct processes)

    Details

      Description

      Original ticket title: Analyze NoHttpResponseException at maven.nuxeo.org

      The Maven issues were actually caused by zombie processes leaked by the progrium/consul container (https://hub.docker.com/_/progrium/consul/):

       1282 ?        Ss     6:09 runsvdir -P /etc/service log: 26152b64607b16b873eb6b3cecb44c09bd677a27ced4fe20ee853308. Stop the container before attempting removal or use -f docker: Error response from daemon: Conflict. The container name "/swarm-manager" is already in use by container 025e290a26152b64607b16b873eb6b3cecb44c09bd677a27ced4fe20ee853308. You have to remove (or rename) that container to be able to reuse that name.. See 'docker run --help'. 
      
      19021 ?        Ssl  418:52  |   |   \_ /bin/consul agent -config-dir=/config -server -bootstrap
      24011 ?        Ss     0:00  |   |       \_ -bash       
      28266 ?        Zs     0:37  |   |       \_ [-bash] <defunct>
      29297 ?        Zs   41724:54  |   |       \_ [iN5Pqv] <defunct>
       1453 ?        Z      0:00  |   |       \_ [openssl] <defunct>
      
      e37a154b2ef5        progrium/consul     "/bin/start -serve..."   5 months ago        Up 5 months         53/tcp, 53/udp, 8300-8302/tcp, 8400/tcp, 8301-8302/udp, 0.0.0.0:8500->8500/tcp   consul
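
The defunct children above can be picked out of a process listing automatically. The following is a minimal sketch, assuming the five-column layout shown in the listing (PID, TTY, STAT, TIME, command, as printed by something like `ps axf`); the function name `find_zombies` and the `SAMPLE` constant are illustrative and not part of any existing tooling.

```python
import re

# Leading columns of the listing above: PID, TTY, STAT, TIME, then the command.
# The column layout is an assumption based on the output pasted in this ticket.
PS_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+:\d+)\s+(.*)$")


def find_zombies(ps_output):
    """Return (pid, stat, cpu_time, command) for every process in state Z."""
    zombies = []
    for line in ps_output.splitlines():
        match = PS_LINE.match(line)
        if not match:
            continue  # header line, blank line, or unrelated output
        pid, _tty, stat, cpu_time, command = match.groups()
        if stat.startswith("Z"):
            # Drop the forest-view decoration ("|", "\_", spaces) before the name.
            zombies.append((int(pid), stat, cpu_time, command.lstrip(" |\\_")))
    return zombies


# Sample copied from the listing in this ticket.
SAMPLE = """
19021 ?        Ssl  418:52  |   |   \\_ /bin/consul agent -config-dir=/config -server -bootstrap
24011 ?        Ss     0:00  |   |       \\_ -bash
28266 ?        Zs     0:37  |   |       \\_ [-bash] <defunct>
29297 ?        Zs   41724:54  |   |       \\_ [iN5Pqv] <defunct>
 1453 ?        Z      0:00  |   |       \\_ [openssl] <defunct>
"""

if __name__ == "__main__":
    for pid, stat, cpu_time, command in find_zombies(SAMPLE):
        print(pid, stat, cpu_time, command)
```

On a live host the same function could be fed the captured output of the process listing instead of `SAMPLE`. A zombie only disappears once its parent reaps it or exits, which is consistent with repairing the consul container rather than trying to kill the defunct processes directly.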
      

      The Docker swarm was also redirecting everything to qa-ovh02:

       qa-ovh01: 51.254.42.78:4243
        └ Status: Pending
       qa-ovh02.nuxeo.com: 51.254.197.210:4243
        └ Status: Healthy
       qa-ovh03: 151.80.31.37:4243
        └ Status: Pending
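
The node listing above can be checked mechanically for nodes that are not Healthy. The sketch below assumes the two-line-per-node text layout shown here (a `name: ip:port` line followed by a `└ Status:` line); the helper names `parse_node_status` and `unhealthy_nodes` are hypothetical, and the parser is not tied to any particular Docker CLI version.

```python
import re

# "name: ip:port" line, as in the listing above (layout is an assumption).
NODE_LINE = re.compile(r"^\s*(\S+):\s+(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\s*$")
# "└ Status: <value>" line that follows each node line.
STATUS_LINE = re.compile(r"^\s*└\s*Status:\s*(\S+)\s*$")


def parse_node_status(listing):
    """Map each node name to the status reported on the line below it."""
    statuses = {}
    current = None
    for line in listing.splitlines():
        node = NODE_LINE.match(line)
        if node:
            current = node.group(1)
            continue
        status = STATUS_LINE.match(line)
        if status and current is not None:
            statuses[current] = status.group(1)
    return statuses


def unhealthy_nodes(statuses):
    """Return the sorted names of nodes whose status is not 'Healthy'."""
    return sorted(name for name, value in statuses.items() if value != "Healthy")


# Sample copied from the listing in this ticket.
SAMPLE = """
 qa-ovh01: 51.254.42.78:4243
  └ Status: Pending
 qa-ovh02.nuxeo.com: 51.254.197.210:4243
  └ Status: Healthy
 qa-ovh03: 151.80.31.37:4243
  └ Status: Pending
"""

if __name__ == "__main__":
    print(unhealthy_nodes(parse_node_status(SAMPLE)))
```

With only qa-ovh02 reported as Healthy, every container would be scheduled there, which matches the redirection described above.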

      Anahide Tchertchian

      A series of builds failed with "java.io.IOException: Backing channel is disconnected.":
      https://qa.nuxeo.org/jenkins/job/master/job/addons_nuxeo-liveconnect-master/1420/
      https://qa.nuxeo.org/jenkins/job/master/job/addons_nuxeo-audit-storage-directory-master/535/
      https://qa.nuxeo.org/jenkins/job/master/job/addons_nuxeo-diff-master/1103/

      The next build, https://qa.nuxeo.org/jenkins/job/master/job/addons_nuxeo-diff-master/1104/, was back to blue.

      This is a new error, and all the failures happened at the exact same time: 13:13.

      13:13:20 [INFO] I/O exception (org.apache.maven.wagon.providers.http.httpclient.NoHttpResponseException) caught when processing request to {}->http://maven.nuxeo.org:80: The target server failed to respond
      13:13:20 [INFO] Retrying request to {}->http://maven.nuxeo.org:80
      (...)
      13:13:22 [INFO] Downloading: http://maven.nuxeo.org/nexus/content/groups/public/junit/junit/4.12/junit-4.12.pom
      13:13:24 ERROR: Failed to parse POMs
      13:13:24 java.io.IOException: Backing channel is disconnected.
      13:13:24 	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:184)
      13:13:24 	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:249)
      13:13:24 	at com.sun.proxy.$Proxy108.isAlive(Unknown Source)
      13:13:24 	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:996)
      13:13:24 	at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:166)
      13:13:24 	at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:873)
      13:13:24 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
      13:13:24 	at hudson.model.Run.execute(Run.java:1738)
      13:13:24 	at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:544)
      13:13:24 	at hudson.model.ResourceController.execute(ResourceController.java:98)
      13:13:24 	at hudson.model.Executor.run(Executor.java:410)
      

              People

              • Assignee:
                jcarsique Julien Carsique
                Reporter:
                jcarsique Julien Carsique