Extracting multiple values from JSON arrays
Hi, I have a series of test executions compiled into a single JSON message every hour. I'm trying to extract the execution status of each test with the test's associated identifiers. My current query looks like this:
_sourceCategory="xxxx" and _collector="xxxx"
| json "executions[*].cycleId" as cycleId
| json "executions[*].versionId" as versionId
| json "executions[*].status.id" as status
| extract field=cycleId"(?<cycle>\d\d\d\d\d\d)" multi
| extract field=versionId"(?<version>\d\d\d\d\d)" multi
This should only generate 20 messages, as there are only 20 executions. However, it generates 400 messages instead. Is there a reason why it associates 20 versionIds for each cycleId? How can I generate only 20 messages, each populated with the corresponding cycleId and versionId? Can I also add the corresponding status to each of the 20 messages as well? Thanks!
-
Hi Brian,
I think the issue here is you provide two separate multi parsing operations. The first parses 20 values for the cycleID, then the second parses an additional 20 values for each of the 20 cycle IDs generated by the first parsing operation, thus giving you 400 messages in the end.
What you can try to do is combine these two parsing operations into one using a comma-separated list. Note you'll need to use the "parse regex" operation in place of extract and the secondary parses just reference "regex"._sourceCategory="xxxx" and _collector="xxxx"
| json "executions[*].cycleId" as cycleId
| json "executions[*].versionId" as versionId
| json "executions[*].status.id" as status
| parse regex field=cycleId"(?<cycle>\d\d\d\d\d\d)", regex field=versionId"(?<version>\d\d\d\d\d)" multi
Hope this works for you.
-
Hi Kevin Keech,
It seems I celebrated too soon. While your solution did reduce the amount of duplication I was getting, I still dont have my desired results. I am now getting all values of the version Ids extract for a single cycle Id: i.e. I have 20 cycle Ids and 20 version Ids, and now I have all 20 version Ids matched to 1 cycle Id. Do you have another solution?
Thanks.
-
Hi Brian,
Thanks for the example. (PS I saw your related case but going to answer here to try and help the broader audience)So this is actually going to require a slightly different set of parsing operations. The problem with the existing query is that once the objects and versions are parsed into separate arrays using the JSON operation there is no way to link back the individual object/version/status with subsequent parsing. This is a limitation of this operator.
Here is the query I would use in place of what you had originally, which uses just a single parse regex operation. I gave this a try with your sample message and it appears to parse each "status ID", versionID, cycleID into 5 lines._sourceCategory="xxxx" and _collector="xxxx"
| parse regex "\"status\".*?\"id\": (?<status_id>\d+).*?\"versionId\": (?<version>\d+).*?cycleId\": (?<cycleid>\d+?)," multi
Please sign in to leave a comment.
Comments
6 comments