Nuxeo Platform / NXP-30100

Nuxeo health check should trace failure

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.10
    • Fix Version/s: 10.10-HF43, 11.x, 2021.2
    • Component/s: Monitoring
    • Release Notes Summary:
      Nuxeo health check traces failures.
    • Upgrade notes:

      To avoid flooding the log when every /runningstatus check fails, add this logger to your log4j2 configuration:

          <Logger name="org.nuxeo.ecm.core.management.statuses.HealthCheckResult">
            <!-- this filter lets through 1 log event (maxBurst), then about 1 per minute (rate is in events per second: 1/60 ≈ 0.0166) -->
            <BurstFilter level="warn" rate="0.0166" maxBurst="1"/>
          </Logger>
      
    • Team:
      PLATFORM
    • Sprint:
      nxplatform #28, nxplatform #29
    • Story Points:
      3

      Description

      Today, the health check is mainly used by the ALB to direct traffic to healthy front nodes.
      If the running status endpoint returns an error, the ALB blacklists the node, but the postmortem gives us no way to know why.

      An unhealthy response from the running status endpoint should be traced in the log at WARN (ERROR?) level. The tracing should be smart enough to do this at a very low frequency (about once per minute).
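
      The "very low frequency" requirement can be pictured as a small token bucket. The sketch below is only an illustration of that idea, not the Nuxeo implementation and not Log4j's BurstFilter code: the class WarnThrottle, its constructor parameters, and the injected clock are all hypothetical names introduced here. It uses integer nanosecond arithmetic so that "one event per 60 seconds" is exact.

```java
import java.util.function.LongSupplier;

/** Hypothetical throttle for illustration only; not part of Nuxeo or Log4j. */
class WarnThrottle {
    private final long intervalNanos;      // time needed to earn one permit (60 s ~ rate 0.0166/s)
    private final long maxCreditNanos;     // cap on saved-up credit: maxBurst permits
    private final LongSupplier clockNanos; // injected clock so the behavior can be checked
    private long creditNanos;
    private long lastNanos;

    WarnThrottle(long intervalNanos, int maxBurst, LongSupplier clockNanos) {
        this.intervalNanos = intervalNanos;
        this.maxCreditNanos = intervalNanos * maxBurst;
        this.clockNanos = clockNanos;
        this.creditNanos = maxCreditNanos; // start full, so the first failure is always logged
        this.lastNanos = clockNanos.getAsLong();
    }

    /** Returns true if the caller may log now, false if the event should be dropped. */
    synchronized boolean tryAcquire() {
        long now = clockNanos.getAsLong();
        creditNanos = Math.min(maxCreditNanos, creditNanos + (now - lastNanos));
        lastNanos = now;
        if (creditNanos >= intervalNanos) {
            creditNanos -= intervalNanos;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L}; // simulated time in nanoseconds
        WarnThrottle throttle = new WarnThrottle(60_000_000_000L, 1, () -> fakeNow[0]);
        // Assume the load balancer polls a failing node every 15 seconds for 2 minutes.
        for (int s = 0; s <= 120; s += 15) {
            fakeNow[0] = s * 1_000_000_000L;
            if (throttle.tryAcquire()) {
                System.out.println("t=" + s + "s WARN health check failed");
            }
        }
    }
}
```

      With a failing check polled every 15 seconds, this sketch lets a warning through at t=0s, t=60s and t=120s and drops the rest, which is the behavior the ticket asks for.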

      Time Tracking

      • Estimated: 0m
      • Remaining: 0m
      • Logged: 5h
