Problem:
Based on the messages ingested for the source category used in my scheduled search "A B C", I should have received an alert, but none arrived.
Cause:
This can be caused by a bounced alert email, a timestamp parsing or timezone issue, ingestion latency, or an incorrectly configured count threshold.
Resolution:
1.) Email sent but bounced
Sumo Logic sent the email, but the recipients did not receive it. Possible reasons include an address that accepts internal mail only, or an address on the AWS suppression list, to which mail cannot be delivered.
Run a query against the Audit Index to check whether an alert was triggered for the given scheduled search:
_index=sumologic_audit _sourcecategory=scheduled_search triggered "<My Alert Name>"
If the above query shows that an alert was triggered for the given scheduled search but the email was not received, investigate potential causes of email blockage on the recipient side.
2.) Timestamp parsing or timezone issue:
Run the query for the scheduled search in the search tab with "Use Receipt time" checked.
Compare the message time, the receipt time, and the timestamp inside the message for the messages that should have triggered the alert. A gap of several hours between message time and receipt time suggests that the default timezone configured on the collector and/or source is wrong.
The following is an example: the message is received by Sumo Logic at 11:41 CDT (Central Daylight Time, UTC-05:00), but the message time assigned is 06:41 CDT. This implies that the collector or source default timezone was set to UTC, so the 11:41 in the message was read as 11:41 UTC, which is 06:41 CDT. This can be corrected either by editing the collector and/or source timezone to Central time with DST (America/Chicago, for example) or by specifying a custom timestamp format on the source that reads the timezone from the message.
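The five-hour gap in the example above can be reproduced with a short Python sketch. It is only an illustration of the arithmetic, not of how Sumo Logic parses timestamps internally; the date, the raw timestamp string, and the helper name shown_in_cdt are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the sketch self-contained. A real source would use a
# named zone such as America/Chicago so that DST is handled automatically.
UTC = timezone.utc
CDT = timezone(timedelta(hours=-5), "CDT")

# Hypothetical timestamp as written in the log line: local Central time,
# with no timezone information. The date is an assumption.
raw = "2023-06-01 11:41:00"
naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")

# Default timezone set to UTC: 11:41 is taken to mean 11:41 UTC.
misparsed = naive.replace(tzinfo=UTC)

# Default timezone set to Central: 11:41 is taken to mean 11:41 CDT.
correct = naive.replace(tzinfo=CDT)


def shown_in_cdt(ts):
    """Format a timestamp the way a user viewing results in CDT would see it."""
    return ts.astimezone(CDT).strftime("%H:%M %Z")


# How far the assigned message time is from the true event time.
skew = correct - misparsed

print("misparsed message time:", shown_in_cdt(misparsed))
print("correct message time:  ", shown_in_cdt(correct))
print("skew:", skew)
```

A skew that is a whole number of hours and equal to the local UTC offset, as here, is a strong hint that the problem is the default timezone rather than ingestion delay.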
3.) Latency in ingestion of message:
The gap between the message time and the receipt time can be on the order of 5-15 minutes, as can happen when ingesting S3 objects or CloudTrail data. In the following example, a 15-minute scheduled search executed for the time range 07:30 to 07:45 did not see any data because the data for that time range was ingested after the scheduled search had executed.
One solution is to add an offset to the time range of the query to account for the ingestion delay. If, for example, the ingestion delay is 15 minutes and the time range is -15m, change the time range to -30m -15m so that the scheduled search processes data that is 15 minutes older.
A second solution is to select "Use Receipt time" in the scheduled search configuration so that the query evaluates messages received or ingested in the specified time range. PLEASE NOTE this option works with scheduled searches that have a run frequency of 15 minutes or more and NOT with real-time alerts, since searching by receipt time is not supported in real-time alerts.
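The effect of both solutions can be sketched in Python. This is a simplified model under stated assumptions, not Sumo Logic's scheduler: the message, its times, the date, and the helper names window, visible, and matches are all hypothetical.

```python
from datetime import datetime, timedelta


def window(run_at, start_min, end_min=0):
    """Return (start, end) for a relative range such as -15m or -30m -15m."""
    return (run_at - timedelta(minutes=start_min),
            run_at - timedelta(minutes=end_min))


def visible(msgs, run_at):
    """Only messages already ingested when the search runs can be found."""
    return [m for m in msgs if m["receipt_time"] <= run_at]


def matches(msgs, start, end, field="message_time"):
    """Select messages whose chosen time field falls in [start, end)."""
    return [m for m in msgs if start <= m[field] < end]


day = datetime(2023, 6, 1)  # assumed date
# Hypothetical message: event at 07:40, ingested 10 minutes later at 07:50.
msgs = [{"message_time": day.replace(hour=7, minute=40),
         "receipt_time": day.replace(hour=7, minute=50)}]

run_0745 = day.replace(hour=7, minute=45)
run_0800 = day.replace(hour=8, minute=0)

# Time range -15m at 07:45: the message has not been ingested yet.
missed = matches(visible(msgs, run_0745), *window(run_0745, 15))

# Time range -15m at 08:00: ingested now, but its message time is too old.
still_missed = matches(visible(msgs, run_0800), *window(run_0800, 15))

# Solution 1: time range -30m -15m at 08:00 covers 07:30 to 07:45.
with_offset = matches(visible(msgs, run_0800), *window(run_0800, 30, 15))

# Solution 2: keep -15m but evaluate receipt time instead of message time.
by_receipt = matches(visible(msgs, run_0800), *window(run_0800, 15),
                     field="receipt_time")

print(len(missed), len(still_missed), len(with_offset), len(by_receipt))
```

Without either change, the delayed message falls between two consecutive runs and is never evaluated, which is why the alert does not fire even though the data is later visible in an interactive search.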
4.) Count incorrectly configured
I have an alert configured to trigger when the count of a condition reaches a certain value, but I am not receiving any alerts. My query is:
_sourceCategory=aws/prod
| json "message","logStream","logGroup"
| parse field=message "* * * * * * * * * * * * * *" as version,accountID,interfaceID,src_ip,dest_ip,src_port,dest_port,Protocol,Packets,bytes,StartSample,EndSample,Action,status
| timeslice 1m
| where action="REJECT"
| count as drops by _timeslice
The threshold set for my Scheduled Search alert is:
Greater than > 1000
This query returns results where the "drops" count is more than 1000, so why am I not receiving my alerts?
The answer is: thresholds set within a Scheduled Search are based on the number of result rows returned by the query and do not consider any values present within the columns of those rows. If your query does not perform any aggregation, the Scheduled Search threshold applies to the number of raw messages returned by the query, as seen under the Messages tab of the search. If the query contains an aggregate operation (for example count, sum, min, or max), the threshold applies to the number of aggregate rows returned by the query, as seen under the Aggregate tab of the results. In the query above, each row is one 1-minute timeslice, so the alert would fire only if more than 1000 timeslice rows were returned.
When a query performs an aggregation and you want to alert when a specific aggregate value meets a threshold, the threshold for that field value needs to be included in the query itself. This can typically be done by adding a where condition after the aggregation. For example:
_sourceCategory=aws/prod
| json "message","logStream","logGroup"
| parse field=message "* * * * * * * * * * * * * *" as version,accountID,interfaceID,src_ip,dest_ip,src_port,dest_port,Protocol,Packets,bytes,StartSample,EndSample,Action,status
| timeslice 1m
| where action="REJECT"
| count as drops by _timeslice
| where drops > 1000
With this change, a result row is returned only when the field value meets the threshold given in the query. The threshold in the Scheduled Search is then set to alert on the number of rows that met that condition. For example:
Greater than > 0
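The difference between the two configurations can be modeled with a small Python sketch. The per-timeslice counts below are made-up sample data, and should_alert is a hypothetical stand-in for the "Greater than" threshold check described above, not a Sumo Logic API.

```python
def should_alert(rows, threshold):
    """Model of a 'Greater than' threshold: it compares the NUMBER of rows."""
    return len(rows) > threshold


# Hypothetical aggregate results of "count as drops by _timeslice".
drops_by_timeslice = [
    {"_timeslice": "07:30", "drops": 1500},
    {"_timeslice": "07:31", "drops": 200},
    {"_timeslice": "07:32", "drops": 1200},
]

# Original setup: threshold "Greater than > 1000" on the unfiltered results.
# There are only 3 rows, so the alert never fires, whatever "drops" contains.
original_fires = should_alert(drops_by_timeslice, 1000)

# Corrected setup: "| where drops > 1000" in the query keeps only the
# offending timeslices, and the threshold becomes "Greater than > 0".
filtered = [row for row in drops_by_timeslice if row["drops"] > 1000]
corrected_fires = should_alert(filtered, 0)

print("original fires: ", original_fires)
print("corrected fires:", corrected_fires)
```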