Calculating Average by Message

Comments

  • Christian

    hi Kelly. let's tease this apart a bit:

    _collector=dsjobs "emrusage" 
    | json auto "fulla.cost", "pyspark.cost","followrecommendations.cost","sqoop.cost" as fullacost, pysparkcost,followcost,sqoopcost nodrop
    | (fullacost + pysparkcost + followcost + sqoopcost) as totalcost
    | avg(fullacost) as fullaavg, 
      avg(pysparkcost) as pysparkavg, 
      avg(followcost) by Time as followavg,
      avg(sqoop) as sqoopavg,
      avg(totalcost) as avgtotalcost
    

    the problem with this query is this part:

      avg(followcost) by Time as followavg,
    

    the "by" clause has to come after all the aggregation specifications, and it is not clear where "Time" comes from here, since the query never extracts a field with that name. to get the query working, try this:

    _collector=dsjobs "emrusage" 
    | json auto "fulla.cost", "pyspark.cost","followrecommendations.cost","sqoop.cost" as fullacost, pysparkcost,followcost,sqoopcost nodrop
    | (fullacost + pysparkcost + followcost + sqoopcost) as totalcost
    | avg(fullacost) as fullaavg, 
      avg(pysparkcost) as pysparkavg, 
      avg(followcost) as followavg,
      avg(sqoopcost) as sqoopavg,
      avg(totalcost) as avgtotalcost
    

    if you want to see the averages per time bucket, say per minute, try this:

    _collector=dsjobs "emrusage" 
    | json auto "fulla.cost", "pyspark.cost","followrecommendations.cost","sqoop.cost" as fullacost, pysparkcost,followcost,sqoopcost nodrop
    | (fullacost + pysparkcost + followcost + sqoopcost) as totalcost
    | timeslice 1m
    | avg(fullacost) as fullaavg, 
      avg(pysparkcost) as pysparkavg, 
      avg(followcost) as followavg,
      avg(sqoopcost) as sqoopavg,
      avg(totalcost) as avgtotalcost
      by _timeslice
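
For anyone unsure what grouping by a one-minute timeslice does, here is a rough Python sketch of the same idea: put each message into the minute it falls in, then average within each minute. This is only an illustration, not how Sumo Logic implements it. The function name, the (epoch seconds, cost) record shape, and the choice to skip missing values are all assumptions made for the example.

```python
from collections import defaultdict


def average_by_minute(records):
    """Average a cost value per one-minute bucket.

    records: iterable of (epoch_seconds, cost) pairs (hypothetical shape).
    Returns a dict mapping the bucket start (in seconds) to the mean cost.
    """
    buckets = defaultdict(list)
    for ts, cost in records:
        # Assumption: a message without the field is skipped, not counted as 0.
        if cost is None:
            continue
        bucket_start = ts - (ts % 60)  # floor the timestamp to the minute
        buckets[bucket_start].append(cost)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


sample = [(0, 1.0), (30, 3.0), (61, 10.0), (119, 20.0)]
print(average_by_minute(sample))  # {0: 2.0, 60: 15.0}
```

The first two messages land in the bucket starting at 0 and the last two in the bucket starting at 60, so each bucket gets its own average, which is what one row per _timeslice represents.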
    

    let us know if that helps.

    chr.

  • Kelly Burdine

    Thanks! This seems to have solved the problem.
