.count() works, but .sum() throws an error. What should I do?
Code:
def meanTemperature(df,spark):
    counttemp=spark.sql("SELECT temperature from washing").count()
    sumtemp=spark.sql("SELECT temperature from washing").sum()
    mean=sumtemp/counttemp
    return mean
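Error: AttributeError: 'DataFrame' object has no attribute 'sum'
Answer 0 (score: 0)
A Spark DataFrame has no sum() method, which is why you get this error. You can use the following snippets to compute the mean or the median directly in SQL.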
meanTemp = spark.sql("select mean(temperature) from washing")
return meanTemp.collect()[0][0]
If you want the median:
medianTemp = spark.sql("select percentile_approx(temperature,0.5) from washing")
return medianTemp.collect()[0][0]
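If you would rather keep the sum-divided-by-count shape of the original function, here is a minimal sketch of one way to do it. It assumes `spark` is an active SparkSession and that `washing` is already registered as a table or temp view; the name `mean_temperature` is just for illustration. Both aggregates are computed in a single query, so the table is scanned once. Note that `count(temperature)` skips NULL values, while `DataFrame.count()` counts every row, so the two can give different results when the column contains NULLs.

```python
def mean_temperature(spark):
    """Return sum(temperature) / count(temperature) for the washing table.

    Assumes `spark` is an active SparkSession and `washing` is a
    registered table or temp view (both are assumptions of this sketch).
    """
    # Aggregate inside SQL; a DataFrame itself has no .sum() method.
    row = spark.sql(
        "SELECT sum(temperature) AS total, count(temperature) AS n FROM washing"
    ).first()
    total, n = row[0], row[1]
    # With no non-NULL temperatures, sum() is NULL and the count is 0.
    if not n:
        return None
    return total / n
```

Called as `mean_temperature(spark)`, this should agree with `select mean(temperature) from washing`, since both ignore NULL temperatures.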