I am trying to find the most frequently used word in a text file using Spark. To do this, I split the file into key-value (word, count) pairs. However, when I try to use the maxBy function, I get the following error:
scala> val maxKey=mapped.maxBy(_._2)
<console>:21: error: value maxBy is not a member of org.apache.spark.rdd.RDD[(String, Int)]
val maxKey=mapped.maxBy(_._2)
^
The mapped RDD contains the following data:
scala> mapped.collect()
res8: Array[(String, Int)] = Array((University,1), (play,1), (this,1), (is,1), (Sagar,1), (meaningful,1), (26,1), (badminton,1), (years,1), (Arizona,1), (Kalburgi,1), (old,1), (State,1), (I,3), (to,1), (at,1), (like,2), (watching,1), (I'm,1), (tennis,1), (study,1), (Hi,1), (and,1), (movies,1))
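For context, mapped was produced by a standard word-count pipeline along these lines (a minimal sketch; the file path and whitespace tokenization here are placeholders, not my exact code):

val lines = sc.textFile("input.txt")          // placeholder path
val mapped = lines
  .flatMap(_.split(" "))                      // split each line into words
  .map(word => (word, 1))                     // pair each word with a count of 1
  .reduceByKey(_ + _)                         // sum the counts per word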
This question on Stack Overflow addresses the same problem, but the workaround given in its answer is not clear to me.
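If I understand the situation correctly, maxBy exists on ordinary Scala collections but not on RDD, so the workaround would be to use something the RDD API does provide, such as max with an explicit Ordering, or a reduce. Is one of these what the answer intends?

// max with an explicit Ordering on the count field
val maxKey = mapped.max()(Ordering.by[(String, Int), Int](_._2))

// or equivalently, a manual reduce keeping the pair with the larger count
val maxKey2 = mapped.reduce((a, b) => if (a._2 >= b._2) a else b)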