Dealing with null values
I'm running a query that includes adding the values from several fields to create a "total_cost". I started adding values from fields and everything was working well until I added one field that sometimes contains no value for a particular time slice. I get the error: cannot process null. Is there something I can do so that it treats the null value as a 0 so that it can still calculate the total?
Here is my query: sqoop_cost & follow_cost are the fields that sometimes contain null values.
_collector= ds_jobs "emr_usage" |json auto "fulla.cost", "pyspark.cost", "follow_recommendations.cost", "sqoop.cost" as fulla_cost, pyspark_cost, follow_cost, sqoop_cost nodrop |(fulla_cost + pyspark_cost + follow_cost +sqoop_cost) as total_cost |timeslice 10m |avg(fulla_cost) as fulla_avg, avg(pyspark_cost) as pyspark_avg, avg(follow_cost) as follow_avg, avg(sqoop_cost) as sqoop_avg, avg(total_cost) as total by timeslice
-
Hi Kelly,
You can use the if operator with the isNull Operator to replace those values. Try running this:
collector= dsjobs "emrusage"
|json auto "fulla.cost", "pyspark.cost", "followrecommendations.cost", "sqoop.cost" as fullacost, pysparkcost, followcost, sqoopcost nodrop
| if(isNull(sqoopcost), "0", sqoopcost) as sqoopcost
| if(isNull(follow_cost), "0", follow_cost) as follow_cost
| (fullacost + pysparkcost + followcost +sqoopcost) as totalcost |timeslice 10m
|avg(fullacost) as fullaavg, avg(pysparkcost) as pysparkavg, avg(followcost) as followavg, avg(sqoopcost) as sqoopavg, avg(totalcost) as total by timesliceDoes this accomplish what you're looking for?
Best,
Derek
Please sign in to leave a comment.
Comments
3 comments