Counting 2 columns in Spark

Time: 2016-11-27 22:41:06

Tags: apache-spark spark-dataframe

How can I count 2 columns in Spark?

I tried the following, but it is not the right approach:

joinDF = logDF.join(logDF2,"day_number")
compareNumberRequestTraffic = joinDF.groupBy("day_number") \
    .agg(functions.count("request","request2")) \
    .show()

I get this error:

() takes exactly 1 argument (2 given)

The output I want:

day_number      count(request)     count(request2)
2015-01-03                5                   7

Thanks a lot.

1 answer:

Answer 0 (score: 1)

Don't pass 2 arguments to count; it accepts a single column. Use 2 separate count calls inside agg instead:

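from pyspark.sql import functions

joinDF = logDF.join(logDF2, "day_number")
# count() accepts one column, so pass one count() per column to agg()
compareNumberRequestTraffic = joinDF.groupBy("day_number") \
    .agg(functions.count("request"), functions.count("request2"))
# show() returns None, so call it separately instead of assigning its result
compareNumberRequestTraffic.show()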