Not all alerts are created equal. An alert that only checks a static threshold can produce false positives: alerts that do not actually require action. Dynamic thresholds instead measure a value relative to its recent context. Rather than counting 404 status codes on their own, the example below compares the number of 404s to the number of 200s in each one-minute time slice and flags the ratio when it is out of the ordinary.
_sourceCategory=Apache/Access (status_code=200 or status_code=404)
| timeslice 1m
| if (status_code matches "2*", 1, 0) as successes
| if (status_code matches "4*", 1, 0) as fails
| sum(successes) as success_cnt, sum(fails) as fail_cnt by _timeslice
| fail_cnt/success_cnt as failure_rate
| sort _timeslice desc
| outlier failure_rate window=5, threshold=3, consecutive=1, direction=+
// Uncomment the next line to return only the time slices flagged as outliers
// | where failure_rate_indicator > 0
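To make the outlier step more concrete, here is a rough Python sketch of the general idea behind a dynamic threshold: compare each value to the mean and standard deviation of the points just before it. It is an illustration only, not Sumo Logic's implementation. The function name flag_outliers, the sample ratios, and the choice to exclude the current point from its own window are all assumptions. It mirrors direction=+ (only upward spikes) and consecutive=1 (a single violation is enough).

```python
from statistics import mean, pstdev


def flag_outliers(values, window=5, threshold=3.0):
    """Return a 0/1 flag per value.

    A value is flagged (1) when it exceeds the mean of the previous
    `window` values by more than `threshold` standard deviations.
    Assumption: the current point is not part of its own window.
    """
    flags = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            # Not enough history yet to form a baseline.
            flags.append(0)
            continue
        upper = mean(history) + threshold * pstdev(history)
        flags.append(1 if value > upper else 0)
    return flags


# Hypothetical per-minute 404/200 ratios with one spike.
rates = [0.02, 0.03, 0.02, 0.03, 0.02, 0.03, 0.25, 0.03]
print(flag_outliers(rates))
```

Only the spike at 0.25 is flagged in this sample. A static threshold would have to be tuned by hand for each source, while the rolling baseline adapts to whatever ratio is normal for the data.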