Tags: scala apache-spark
The data is a collection of tuples of the form (group, number).
data.map(a => (a._1, (a._2, 1)))
    .reduceByKey((a, b) => (a._1 * b._1, a._2 + b._2))
    .map(a => (a._1, pow(a._2._1, 1.0 / a._2._2)))
I'm new to Spark - what is this code doing? Could you walk me through it?
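For context, here is a self-contained, runnable version of the snippet I'm asking about (the object name, the sample values, and the local SparkContext setup are my own illustrative additions, not part of the original code):

import org.apache.spark.{SparkConf, SparkContext}
import scala.math.pow

object GroupedPowerExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("GroupedPowerExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // (group, number) pairs, as described above (sample values are made up)
    val data = sc.parallelize(Seq(("a", 2.0), ("a", 8.0), ("b", 3.0), ("b", 27.0)))

    val result = data
      .map(a => (a._1, (a._2, 1)))                       // (group, (number, count = 1))
      .reduceByKey((a, b) => (a._1 * b._1, a._2 + b._2)) // (group, (product of numbers, total count))
      .map(a => (a._1, pow(a._2._1, 1.0 / a._2._2)))     // (group, count-th root of the product)

    result.collect().foreach(println)                    // prints e.g. (a,4.0) and (b,9.0)

    sc.stop()
  }
}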