Parsing delimited logs that are not constant in length?
I have a system that scans our workstations and reports back which drives an admin can access and what permission level that admin has to the drives.
Some workstations return only one or two drives. So it would look something like this:
- MACHINE_NAME: pc_monty, DRIVES: C - No Access, print - Read Only
Other workstations have many, many drives and it looks something like this:
- MACHINE_NAME: pc_python, DRIVES: C - Read Only, spam - Read Write, F - No Access, SillyWalks - Read only, ...97 drives later... Z - No Access
Normally I could use something like
- | split _raw delim=',' extract 1 as drive1, 2 as drive2, 3 as drive3
I could also try
- | parse field=results "*/t*/t*/t" as drive1, drive2, drive3
But neither quite gets the desired result, because the number of drives varies from one workstation to the next. How can I extract these items when the number of items I want to extract is not constant?
-
Hi Luis,
I would look at using a parse regex along with the "multi" flag for this use case. Using your example, this would look something like the following.
| parse regex "MACHINE_NAME: (?<machine_name>.*), DRIVES: (?<drive_list>.*)"
| parse regex field=drive_list "(?<drive>.*?) - (?<permissions>.*?)(?:,|$)" multi
| count by machine_name, drive, permissions
The first parsing will get the machine name and list of drives from the message. The second parsing then parses the drive list field and creates additional lines for each drive/permission pair in the list. You can then use an aggregate function (count in the above) to display the different machine/drive/permission combos.
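If it helps to see the same two-pass logic outside the query language, here is a minimal Python sketch of it. This is only an illustration, not what the query engine does internally: the function name `parse_drives` and the constant names are hypothetical, and the leading `\s*` in the second pattern is an addition of mine (not in the query above) to drop the space that follows each comma.

```python
import re
from collections import Counter

# Pass 1: pull the machine name and the raw drive list out of the message.
LINE_RE = re.compile(r"MACHINE_NAME: (?P<machine_name>.*), DRIVES: (?P<drive_list>.*)")

# Pass 2: match one "drive - permission" pair at a time.
# The leading \s* is an assumption added here to skip the space after each comma.
PAIR_RE = re.compile(r"\s*(?P<drive>.*?) - (?P<permissions>.*?)(?:,|$)")


def parse_drives(message):
    """Return one (machine, drive, permissions) tuple per drive in the message."""
    line = LINE_RE.search(message)
    if line is None:
        return []
    machine = line.group("machine_name")
    # finditer repeats the pattern across the list, like the "multi" flag.
    return [
        (machine, pair.group("drive"), pair.group("permissions"))
        for pair in PAIR_RE.finditer(line.group("drive_list"))
    ]


rows = parse_drives("MACHINE_NAME: pc_monty, DRIVES: C - No Access, print - Read Only")
counts = Counter(rows)  # rough analogue of "count by machine_name, drive, permissions"
```

The number of rows follows the number of pairs found, so a workstation with two drives and one with a hundred go through the same code path.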
I hope this might help with your use case.