  1. Nuxeo ECM Build/Test Environment
  2. NXBT-2719

Fix swarm slave registration

    Details

    • Type: Problem
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: NXP-10.x
    • Fix Version/s: NXP-10.x
    • Component/s: Continuous Integration
    • Sprint: DevTools-07

      Description

      1. Manually start a swarm slave like this:
        docker -H tcp://swarm-qa.nuxeo.org:4000 run --restart always --privileged -itd \
        -e JENKINS_USERNAME=nuxeojenkins -e JENKINS_API_TOKEN=XXXXXX \
        -l swarm -l SLAVE1010 -l DYNAMIC --name swarm-pub-1010-0 -e JENKINS_NAME=swarm-pub-1010-0 \
        -e JENKINS_LABELS='swarm SLAVE1010 DYNAMIC ondemand1010 ondemandcheck1010' \
        -e JENKINS_MASTER=https://qa2.nuxeo.org/jenkins \
        -w /opt/jenkins -v /var/run/docker.sock:/var/run/docker.sock:rw -v /opt/jenkins/workspace:/opt/jenkins/workspace:rw \
            dockerpriv.nuxeo.com:443/nuxeo/jenkins-slave-10.10-swarm:latest
        
      2. Double-check that it is started:
        PASUP-tocard:~ ffischer$ docker -H tcp://swarm-qa.nuxeo.org:4000 ps -f name=1010
        CONTAINER ID        IMAGE                                                             COMMAND             CREATED             STATUS              PORTS                         NAMES
        ce9b6ee47047        dockerpriv.nuxeo.com:443/nuxeo/jenkins-slave-10.10-swarm:latest   "/sbin/my_init"     19 minutes ago      Up 19 minutes       22/tcp                        qa-ovh03/swarm-pub-1010-0
        a51bf3f2f59b        dockerpriv.nuxeo.com:443/nuxeo/jenkins-slavepriv-10.10            "/sbin/my_init"     16 hours ago        Up 15 hours         51.254.197.210:5801->22/tcp   qa-ovh02.nuxeo.com/slavepriv2-1010-1
        07f0cb0f9d4d        dockerpriv.nuxeo.com:443/nuxeo/jenkins-it-10.10                   "/sbin/my_init"     16 hours ago        Up 15 hours         51.254.197.210:2305->22/tcp   qa-ovh02.nuxeo.com/itslave1010
        5a5316f9e063        dockerpriv.nuxeo.com:443/nuxeo/jenkins-slave-10.10                "/sbin/my_init"     16 hours ago        Up 15 hours         51.254.197.210:2204->22/tcp   qa-ovh02.nuxeo.com/static1010
      3. Observe in Jenkins that the slave is not found:
        https://qa2.nuxeo.org/jenkins/computer/
      4. Log in to the slave and check the swarm client log:
        PASUP-tocard:~ ffischer$ docker -H tcp://swarm-qa.nuxeo.org:4000 exec -it ce9b6ee47047 bash
        root@ce9b6ee47047:/opt/jenkins# sudo su - jenkins
        jenkins@ce9b6ee47047:~$ cd swarm-client/
        jenkins@ce9b6ee47047:~/swarm-client$ tail swarm-0.log.0 
        SEVERE: Failed to fetch slave info from Jenkins, HTTP response code: 401
        Jan 23, 2019 10:15:20 AM hudson.plugins.swarm.Client run
        SEVERE: RetryException occurred
        hudson.plugins.swarm.RetryException: Failed to fetch slave info from Jenkins, HTTP response code: 401
        	at hudson.plugins.swarm.SwarmClient.discoverFromMasterUrl(SwarmClient.java:223)
        	at hudson.plugins.swarm.Client.run(Client.java:141)
        	at hudson.plugins.swarm.Client.main(Client.java:114)
        
        Jan 23, 2019 10:15:20 AM hudson.plugins.swarm.Client run
        SEVERE: Retry limit reached, exiting...
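
The 401 in the log above suggests Jenkins rejected the credentials passed to the container. A minimal sketch for checking them outside the swarm client follows. It assumes JENKINS_MASTER, JENKINS_USERNAME and JENKINS_API_TOKEN hold the same values passed with -e in step 1. The /api/json path is the generic Jenkins remote API endpoint and may not be the exact URL the swarm client requests. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: checks whether Jenkins accepts the credentials given to the slave.
# Assumption: JENKINS_MASTER, JENKINS_USERNAME and JENKINS_API_TOKEN are exported
# with the same values used in the "docker run" command of step 1.

# Map an HTTP status code to a short diagnosis (hypothetical helper).
explain_status() {
  case "$1" in
    200) echo "credentials accepted" ;;
    401) echo "authentication rejected: check JENKINS_USERNAME and JENKINS_API_TOKEN" ;;
    403) echo "authenticated but not authorized" ;;
    000) echo "no HTTP response: check JENKINS_MASTER and network access" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

# Query the Jenkins remote API with basic auth and print only the status code.
# /api/json is an assumption; the swarm client may call a different URL.
jenkins_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -u "${JENKINS_USERNAME}:${JENKINS_API_TOKEN}" \
    "${JENKINS_MASTER}/api/json"
}

# Only contact Jenkins when a master URL has actually been provided.
if [ -n "${JENKINS_MASTER:-}" ]; then
  explain_status "$(jenkins_status)"
fi
```

If this also reports a 401 with the values from step 1, the problem is likely the username or API token rather than the swarm client itself. A 200 would point to something specific to how the client authenticates.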
        
