The initial step in investigating a data volume spike is to review the data volumes for your Collectors and Sources. This should reveal whether the spike is occurring in an existing Collector or Source or is coming from new data sources being ingested.
If the Data Volume Index was enabled prior to the data volume spike, run the following query to build a timeline of your sources and the volume ingested by each. This and several other searches can be found in the Data Volume app.
_index=sumologic_volume
| where _sourceCategory="sourcename_volume"
| parse regex "(?<sourcename>\"[^\"]+\")\:\{\"sizeInBytes\"\:(?<bytes>\d+),\"count\"\:(?<count>\d+)\}" multi
| timeslice 1h
| bytes/1024/1024/1024 as gbytes
| sum(gbytes) as gbytes by sourcename, _timeslice
| transpose row _timeslice column sourcename
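The parse regex in the query above pulls the per-source byte and message counts out of the JSON-like payload that each volume-index message carries. As a rough illustration of what that extraction does, here is the same regex applied in Python to a hypothetical sample payload (the source names and numbers below are made up):

```python
import re

# Hypothetical payload in the shape the volume index emits:
# each source name maps to an object with sizeInBytes and count.
message = (
    '"prod/web/nginx":{"sizeInBytes":5368709120,"count":1200000},'
    '"prod/db/postgres":{"sizeInBytes":1073741824,"count":300000}'
)

# Same pattern as the Sumo Logic query, in Python named-group syntax.
pattern = re.compile(
    r'(?P<sourcename>"[^"]+"):\{"sizeInBytes":(?P<bytes>\d+),"count":(?P<count>\d+)\}'
)

for m in pattern.finditer(message):
    # Convert bytes to gigabytes, mirroring bytes/1024/1024/1024 in the query.
    gbytes = int(m.group("bytes")) / 1024 / 1024 / 1024
    print(m.group("sourcename"), round(gbytes, 2), "GB,", m.group("count"), "messages")
```

The `multi` keyword in the query plays the same role as `finditer` here: it emits one row per source found in the message, rather than stopping at the first match.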
If the Data Volume Index was not enabled prior to the data volume spike, please run the following query:
*
| timeslice 1h
| count by _source,_timeslice
| transpose row _timeslice column _source
The examples above are based on an investigation of a 24-hour period. Extend the time range and timeslice as needed to understand when and where your spike occurred.
What if the volume does not align with what you expect based on the file sizes you see on the server? In some cases, duplicate messages may be ingested. The following query can be used to detect them:
_source=<nameofsource>
| count by _raw
| where _count > 1
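If you want to cross-check on the server side before the logs ever reach the Sumo Logic service, a quick local equivalent of the query above is to count repeated lines in the file itself. A minimal sketch (the file path in the example is a placeholder):

```python
from collections import Counter

def duplicate_lines(path):
    """Return lines that appear more than once in the file, with their counts."""
    with open(path, encoding="utf-8", errors="replace") as f:
        counts = Counter(line.rstrip("\n") for line in f)
    # Mirrors "| count by _raw | where _count > 1" in the query above.
    return {line: n for line, n in counts.items() if n > 1}

# Example (placeholder path):
# duplicate_lines("/var/log/app/app.log")
```

Note that timestamped log lines are normally unique, so any repeated raw line is a reasonable duplicate signal; logs without timestamps may produce legitimate repeats.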
There are a couple of common causes of duplicate message ingestion.
- A path change (particularly a UNC file path change within the network).
- Log files that were copied to a new directory that was also being ingested to the Sumo Logic service.
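One way to spot the second cause, the same log file sitting in two collected directories, is to hash files across those directories and look for identical content. A minimal sketch, assuming the directory paths passed in are placeholders for your collected paths:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicate_files(*roots):
    """Group files under the given directories by content hash;
    any group with more than one path is a duplicate candidate."""
    by_hash = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                h = hashlib.sha256()
                with open(path, "rb") as f:
                    # Hash in chunks so large log files do not load into memory.
                    for chunk in iter(lambda: f.read(65536), b""):
                        h.update(chunk)
                by_hash[h.hexdigest()].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]

# Example (placeholder paths):
# find_duplicate_files("/var/log/app", "/var/log/archive")
```

Each returned group lists paths whose contents are byte-for-byte identical; if more than one of those paths falls under a collected Source, the file is being ingested twice.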
Once you have determined the root cause of the path change(s), double-check the Sources and look for the field “Collection should begin at.” Setting this to a value like 24 hours instead of “All Time” can limit the impact of path changes and duplicate message ingestion.