  1. Nuxeo Platform
  2. NXP-30874

Fix cold storage and retention addons redis helm values indent

    Details

      Description

      CI is typically failing with:

      [2022-02-10T19:54:34.444Z] Installed chart platform/nuxeo with name test-release into namespace nuxeo-coldstorage-10-10-dev
      
      [2022-02-10T19:54:34.989Z] + kubectl rollout status statefulset test-release-redis-master --timeout=5m --namespace=nuxeo-coldstorage-10-10-dev
      
      [2022-02-10T19:54:34.990Z] Waiting for 1 pods to be ready...
      
      [2022-02-10T19:59:41.584Z] error: timed out waiting for the condition
      

      https://jenkins.napps.dev.nuxeo.com/blue/organizations/jenkins/nuxeo%2Fnuxeo-coldstorage/detail/10.10/19/pipeline/

      Looking at the pod during the execution:

      kubectl -n nuxeo-coldstorage-pr-189-dev describe pod test-release-redis-master-0
      

      We can see the following events:

      Events:
        Type     Reason             Age                  From                Message
        ----     ------             ----                 ----                -------
        Normal   NotTriggerScaleUp  3m27s                cluster-autoscaler  pod didn't trigger scale-up: 1 node(s) had taint {team: napps}, that the pod didn't tolerate, 1 node(s) had taint {dedicated: nodes-startup}, that the pod didn't tolerate, 1 max node group size reached, 1 node(s) had taint {team: nos}, that the pod didn't tolerate, 1 node(s) had taint {team: platform}, that the pod didn't tolerate, 1 node(s) had taint {team: ui}, that the pod didn't tolerate
        Warning  FailedScheduling   46s (x5 over 3m29s)  default-scheduler   0/28 nodes are available: 10 node(s) had taint {team: ai}, that the pod didn't tolerate, 2 node(s) had taint {team: ui}, that the pod didn't tolerate, 5 node(s) had taint {team: napps}, that the pod didn't tolerate, 5 node(s) had taint {team: platform}, that the pod didn't tolerate, 6 node(s) had taint {dedicated: nodes-startup}, that the pod didn't tolerate.
      

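      This happens because our nuxeo-test-base-values.yaml is badly indented, so we do not properly override the toleration value from https://github.com/bitnami/charts/blob/master/bitnami/redis/values.yaml. As a result, the Redis master pod has no toleration for the tainted nodes and cannot be scheduled.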
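
      To illustrate the kind of indentation problem described here, the following is a minimal sketch and not the actual content of nuxeo-test-base-values.yaml. It assumes that Redis is configured under a top-level `redis` key, that the Bitnami chart reads tolerations from `master.tolerations`, and that the node taint is `team=platform` with effect `NoSchedule`. All of these are assumptions made for illustration; the real key names, taint value, and effect should be checked against the chart version and cluster in use.

      ```yaml
      # Hypothetical mis-indented form: `tolerations` sits one level too
      # shallow, so it is not applied to the Redis master pod.
      redis:
        master: {}
        tolerations:
          - key: team
            operator: Equal
            value: platform      # assumed taint value
            effect: NoSchedule   # assumed taint effect
      ---
      # Hypothetical intended form: `tolerations` is nested under `master`,
      # where the Redis chart is assumed to read it.
      redis:
        master:
          tolerations:
            - key: team
              operator: Equal
              value: platform      # assumed taint value
              effect: NoSchedule   # assumed taint effect
      ```

      Both documents are valid YAML, which is why a mistake like this does not fail at install time and only shows up later as a FailedScheduling event.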

      I am not yet sure why this is only causing an issue now and not before.
