Calculating Average by Message
I have log files that contain JSON. I want to plot one of the raw values, "cost", on a graph, but only aggregated fields can be graphed. The Sumo Logic support team told me I can graph the raw values with this trick: avg(field_name) by _Time. This does work. However, I also need the averages for several fields in the log, so I need to rename each one using "as new_name". Once I do that, the "by _Time" no longer works. Does anyone know what I can do to fix this?
Here is my search:
_collector= ds_jobs "emr_usage" |json auto "fulla.cost", "pyspark.cost","follow_recommendations.cost","sqoop.cost" as fulla_cost, pyspark_cost,follow_cost,sqoop_cost nodrop |(fulla_cost +pyspark_cost+follow_cost +sqoop_cost) as total_cost|avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg, avg(follow_cost) by _Time as follow_avg,avg(sqoop) as sqoop_avg,avg(total_cost) as avg_total_cost
Here is my error:
Parse error: unexpected token `a' found
-
hi Kelly. let's tease this apart a bit:
_collector=ds_jobs "emr_usage" | json auto "fulla.cost", "pyspark.cost", "follow_recommendations.cost", "sqoop.cost" as fulla_cost, pyspark_cost, follow_cost, sqoop_cost nodrop | (fulla_cost + pyspark_cost + follow_cost + sqoop_cost) as total_cost | avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg, avg(follow_cost) by _Time as follow_avg, avg(sqoop) as sqoop_avg, avg(total_cost) as avg_total_cost
the problem with this query is this part:
avg(follow_cost) by _Time as follow_avg,
the "by" keyword has to come after all the aggregation specifications, and that includes their "as" aliases. plus I am not sure where "_Time" comes from here. so in order to get the query working, drop the "by" clause for now and try this:
_collector=ds_jobs "emr_usage" | json auto "fulla.cost", "pyspark.cost", "follow_recommendations.cost", "sqoop.cost" as fulla_cost, pyspark_cost, follow_cost, sqoop_cost nodrop | (fulla_cost + pyspark_cost + follow_cost + sqoop_cost) as total_cost | avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg, avg(follow_cost) as follow_avg, avg(sqoop_cost) as sqoop_avg, avg(total_cost) as avg_total_cost
if you want to see the averages by say minute, try this:
_collector=ds_jobs "emr_usage" | json auto "fulla.cost", "pyspark.cost", "follow_recommendations.cost", "sqoop.cost" as fulla_cost, pyspark_cost, follow_cost, sqoop_cost nodrop | (fulla_cost + pyspark_cost + follow_cost + sqoop_cost) as total_cost | timeslice 1m | avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg, avg(follow_cost) as follow_avg, avg(sqoop_cost) as sqoop_avg, avg(total_cost) as avg_total_cost by _timeslice
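as a stripped-down sketch of the same pattern, here it is with only two of the fields. this assumes the field names from the original question, and the 1m bucket is just an example value. every aggregate and its alias is listed first, and a single "by" clause closes the whole list:

```
_collector=ds_jobs "emr_usage"
| json auto "fulla.cost", "pyspark.cost" as fulla_cost, pyspark_cost nodrop
| timeslice 1m
| avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg by _timeslice
```

a smaller or larger timeslice changes how many points end up on the graph, so pick whatever matches how often the jobs log.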
let us know if that helps.
chr.