Data Ingestion: Rolling Standard Deviation, Accumulation, Backshift, etc.
I wanted to share an example query that I put together for a customer recently. The request was for a search that looks at the amount of data ingested per hour by a single collector. He wanted to set up alerts if ingestion seemed to be lagging or spiking. This is what I came up with:
* _index=sumologic_volume | where _sourceCategory="collector_volume"
| parse regex "\"(?<collector>(?:[^\"]+)|(?:\"\"))\":{\"sizeInBytes\":(?<bytes>\d+),\"count\":(?<count>\d+)}" multi
| where collector matches "collector name here"
| timeslice 1h
| bytes/1024/1024 as mbytes
| sum(mbytes) as mbytes by collector, _timeslice
| sort + _timeslice
| backshift mbytes, 1 as prevMB
| smooth prevMB, 24 as movingAvg
| rollingstd mbytes, 24 as rollingStd
| movingAvg + (3 * rollingStd) as upper
| movingAvg - (3 * rollingStd) as lower
| accum mbytes as totalMB
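For anyone who wants to sanity-check the numbers outside of Sumo Logic, here is a rough Python sketch of what the backshift, smooth, rollingstd, and accum steps compute on a list of hourly MB values. It is an approximation under assumptions, not a reproduction of the operators: I assume the rolling window simply uses however many points are available at the start of the series, and that the standard deviation is the population form. The function name `rolling_bands` and its parameters are made up for illustration.

```python
from statistics import mean, pstdev

def rolling_bands(mbytes, window=24, k=3.0):
    """Approximate the query's per-hour columns for a list of hourly MB values.

    Assumptions (not confirmed against Sumo Logic's operators): short windows
    at the start of the series use whatever points exist, and the standard
    deviation is the population form.
    """
    rows = []
    total = 0.0
    for i, value in enumerate(mbytes):
        # backshift mbytes, 1: the previous hour's value (nothing for the first row)
        prev = mbytes[i - 1] if i > 0 else None
        # smooth prevMB, window: average of up to `window` most recent shifted values
        prev_window = mbytes[max(0, i - window):i]
        moving_avg = mean(prev_window) if prev_window else None
        # rollingstd mbytes, window: deviation over the current and preceding values
        cur_window = mbytes[max(0, i - window + 1):i + 1]
        rolling_std = pstdev(cur_window)
        # upper / lower: k standard deviations either side of the moving average
        upper = moving_avg + k * rolling_std if moving_avg is not None else None
        lower = moving_avg - k * rolling_std if moving_avg is not None else None
        # accum mbytes: running total
        total += value
        rows.append({
            "mbytes": value, "prevMB": prev, "movingAvg": moving_avg,
            "rollingStd": rolling_std, "upper": upper, "lower": lower,
            "totalMB": total,
        })
    return rows

if __name__ == "__main__":
    # Hypothetical hourly volumes in MB, with a small window to keep it readable.
    for row in rolling_bands([10.0, 10.0, 10.0, 40.0], window=3):
        print(row)
```

Note that the moving average is built from the shifted values, so the current hour does not influence its own baseline, while the rolling standard deviation does include the current hour.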
A few things to note:
- This is calculating in MB, not GB. You'll need to add another "/1024" and do some renaming if you want to see this in GB.
- You need to enable the data volume index for this query to return anything.
Drop that last accum statement (comment the line out with // ) and it makes a nice visualization on a line or area chart.
Add some where clauses and this could easily be used for alerting purposes.
- For example: where rollingStd > 1
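Another condition worth considering is whether the hourly volume falls outside the upper/lower band, which in the query would presumably be a where clause comparing mbytes against upper and lower. Below is a small Python sketch of that check so the logic is explicit. The function names, the row layout, and the sample numbers are all hypothetical and only there for illustration.

```python
def out_of_band(mbytes, lower, upper):
    """Return True when the hourly volume is below `lower` or above `upper`."""
    return mbytes < lower or mbytes > upper

def hours_to_alert(rows):
    """Pick the timeslices whose volume falls outside the band.

    Rows without a band yet (for example the first hour) are skipped.
    """
    return [
        row["timeslice"]
        for row in rows
        if row["lower"] is not None
        and row["upper"] is not None
        and out_of_band(row["mbytes"], row["lower"], row["upper"])
    ]

if __name__ == "__main__":
    # Made-up sample rows: one normal hour, one spike, one lag, one with no band.
    sample = [
        {"timeslice": "01:00", "mbytes": 100.0, "lower": 80.0, "upper": 120.0},
        {"timeslice": "02:00", "mbytes": 150.0, "lower": 80.0, "upper": 120.0},
        {"timeslice": "03:00", "mbytes": 20.0, "lower": 80.0, "upper": 120.0},
        {"timeslice": "04:00", "mbytes": 90.0, "lower": None, "upper": None},
    ]
    print(hours_to_alert(sample))
```

A check like this catches both spiking and lagging, whereas a threshold on rollingStd alone only says that volume has been volatile, not in which direction.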
Please comment with any questions or suggestions.