Parsing Heterogeneous JSON Streams
Hi folks,
I'm trying and failing to parse JSON coming from two sources, because the JSON schema is not consistent between the two sources. For a toy example, here are the two schemas:
// First source:
{ "_m": "XYZ completed" }
// Second source:
{ "_m": "received message", "received": { "key": "XYZ" } }
I want to be able to parse these two message formats, and aggregate the number of "XYZ completed" messages and the number of "received message" messages where received.key = "XYZ", and take a ratio of the two. However, I haven't been able to figure out how to construct my query to do this. I've tried something like this:
_sourceName=source1 or _sourceName=source2
| json auto
| where _m = "XYZ completed" or _m = "received message"
| json field=_raw "received.key"
At this point, I've collected the two streams and pulled out the "key" field, but I don't know what sort of operator I'd use to count these two metrics.
-
I think you'll need to add the 'nodrop' option to the end of your last json parsing statement so that logs without this key aren't dropped. Without nodrop, our parsers also act as a filter: any log that doesn't contain the key is removed from the results, which here would discard every message from the first source.
_sourceName=source1 or _sourceName=source2
| json auto
| where _m = "XYZ completed" or _m = "received message"
| json field=_raw "received.key" nodrop
| if(%"received.key"="XYZ",1,0) as count_received
| if(_m="XYZ completed",1,0) as count_completed
| sum(count_received) as count_received, sum(count_completed) as count_completed
// ratio of completed to received; swap the operands if you want the inverse
| count_completed / count_received as ratio
Something like the above might work.
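If you want to sanity-check the counting logic on a few sample logs before running it against real data, here is a rough Python sketch of the same idea. It is only a local model of the steps above, not how the query engine works internally: the function name count_ratio and the sample logs are made up for illustration, and returning None when nothing was received is my own choice for the empty case.

```python
import json

def count_ratio(raw_lines, key="XYZ"):
    """Count 'completed' and 'received' messages and return their ratio.

    Hypothetical helper that mirrors the query steps: filter on _m,
    look up received.key without dropping logs that lack it, then sum.
    """
    wanted = {f"{key} completed", "received message"}
    count_completed = 0
    count_received = 0
    for raw in raw_lines:
        record = json.loads(raw)
        msg = record.get("_m")
        if msg not in wanted:
            continue  # same role as the 'where' clause
        # Like nodrop: a missing "received" object gives None instead of
        # discarding the log, so source-1 messages are still counted.
        received_key = (record.get("received") or {}).get("key")
        count_received += 1 if received_key == key else 0
        count_completed += 1 if msg == f"{key} completed" else 0
    # Assumption: report None rather than dividing by zero.
    ratio = count_completed / count_received if count_received else None
    return count_completed, count_received, ratio

sample = [
    '{ "_m": "XYZ completed" }',
    '{ "_m": "received message", "received": { "key": "XYZ" } }',
    '{ "_m": "received message", "received": { "key": "XYZ" } }',
    '{ "_m": "received message", "received": { "key": "ABC" } }',
]
print(count_ratio(sample))
```

If the lookup skipped logs without received.key instead of keeping them, the "XYZ completed" message would never be counted, which is the same failure you get from leaving off nodrop.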