Problem:
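When the Sumo Logic Kubernetes solution is configured with Prometheus for metrics collection using the Helm CLI, and prometheus-operator.prometheus.prometheusSpec.scrapeInterval is set to 1m or greater, the node:node_cpu_utilisation:avg1m metric is missing and the panels that depend on it in the Sumo Logic Kubernetes apps are empty.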
Example:
helm install sumologic/sumologic --name collection --namespace sumologic \
  --set sumologic.accessId=<SUMO_ACCESS_ID> \
  --set sumologic.accessKey=<SUMO_ACCESS_KEY> \
  --set sumologic.clusterName="<MY_CLUSTER_NAME>" \
  --set prometheus-operator.prometheus.prometheusSpec.scrapeInterval=1m
An alternate method of implementing the same configuration is to use the prometheus-overrides.yaml file:
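The original listing for this file is not reproduced here. As a rough sketch only, the --set path in the Helm command corresponds to the nesting below in a Helm values file. Whether prometheus-overrides.yaml expects the outer prometheus-operator key depends on how the file is passed to the chart, so treat that key as an assumption and check it against the file shipped with your version.

```yaml
# Assumed layout: mirrors the --set path used in the helm command.
# If your prometheus-overrides.yaml is applied directly to the
# prometheus-operator chart, drop the outer "prometheus-operator" key.
prometheus-operator:
  prometheus:
    prometheusSpec:
      scrapeInterval: "1m"
```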
Note: This issue is specific to the "node:node_cpu_utilisation:avg1m" metric; other metrics continue to work.
Cause:
The default setting for scrapeInterval is 30s as shown below.
prometheusSpec:
  ## Prometheus default scrape interval, default from upstream Prometheus Operator Helm chart
  ## NOTE changing the scrape interval to be >1m can result in metrics from recording rules to be missing and empty panels in Sumo Logic Kubernetes apps.
  scrapeInterval: "30s"
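The node:node_cpu_utilisation:avg1m metric is averaged over a 1-minute window. Once the scrape interval is raised to 1m or more, that window no longer holds enough samples to produce a value, so the node:node_cpu_utilisation metric must instead be averaged over a longer window that is a multiple of the scrape interval.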
Resolution:
Option 1:
Undo the customization and set the scrape interval back to the default of 30s.
Option 2:
Keep the longer scrape interval and query the metric whose averaging window matches it.
For a 1m scrape interval, use:
metric=node:node_cpu_utilisation:avg2m
For a 2m scrape interval, use:
metric=node:node_cpu_utilisation:avg4m
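As a quick reference, the pairings given in this article (including the default 30s interval with avg1m) can be captured in a small shell helper. The function name metric_for_scrape_interval is hypothetical, and the lookup covers only the three intervals documented here; it does not assume that a metric exists for any other averaging window.

```shell
#!/bin/sh
# Hypothetical helper: map a scrape interval to the node CPU utilisation
# metric named in this article. Only the documented intervals are covered;
# any other value is rejected rather than guessed.
metric_for_scrape_interval() {
  case "$1" in
    30s) echo "node:node_cpu_utilisation:avg1m" ;;
    1m)  echo "node:node_cpu_utilisation:avg2m" ;;
    2m)  echo "node:node_cpu_utilisation:avg4m" ;;
    *)   echo "no documented metric for scrape interval: $1" >&2
         return 1 ;;
  esac
}

# Example: print the metric to query when scrapeInterval is 1m.
metric_for_scrape_interval 1m
```

The printed name can be used directly in the metrics query as metric=<name>.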