Question

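My code is below, but it produces the error in the title (value reduceByKey is not a member of List[(String, Int)]). Can anyone explain what is happening?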

val firstFileFlatten = scala.io.Source
    .fromFile(firstFile)
    .getLines
    .flatMap(_.split("\\W+"))
    .toList

val filteredWordsFirstFile = firstFileFlatten
    .filter(!stopWords.contains(_))

val mapreduceFirstFile = filteredWordsFirstFile
    .map(word => (word, 1))
    .reduceByKey((v1,v2) => v1 + v2)

Answer 1

reduceByKey does not exist in plain Scala. For more details, see this issue on GitHub.

A workaround looks like this:

listOfPairs
    .groupBy(_._1)
    .map{ case (key, list) => key -> list.map(_._2).reduce(_+_) }
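
To show how the workaround fits the word-count case from the question, here is a minimal sketch that wraps it in a reusable function. The object name `PlainReduceByKey`, the helper `reduceByKey`, and the sample word list are all hypothetical and not part of the original code; only standard collection methods (`groupBy`, `map`, `reduce`) are used.

```scala
object PlainReduceByKey {
  // Hypothetical helper: merges all values that share a key using f,
  // relying only on the standard Scala collections.
  def reduceByKey[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs
      .groupBy(_._1) // Map[K, Seq[(K, V)]], every group is non-empty
      .map { case (key, group) => key -> group.map(_._2).reduce(f) }

  def main(args: Array[String]): Unit = {
    // Sample input standing in for the filtered words read from the file.
    val words = List("spark", "scala", "spark", "list", "scala", "spark")

    // Pair each word with 1, then sum the counts per word.
    val counts = reduceByKey(words.map(word => (word, 1)))(_ + _)

    counts.foreach { case (word, n) => println(s"$word: $n") }
  }
}
```

Calling `reduce` on each group is safe here because `groupBy` never produces an empty group. On Scala 2.13 or later, `pairs.groupMapReduce(_._1)(_._2)(_ + _)` should express the same computation in a single call.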

Answer 2

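You are using a standard Scala collection, not an RDD. A List has no reduceByKey method, so use reduce instead (after grouping by key). Alternatively, use an RDD through Spark, where reduceByKey is available.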
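
For the Spark route, a rough sketch might look like the following. It assumes Spark core is on the classpath and runs in local mode; the object name `SparkWordCount`, the function `countWords`, and the application name are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  // Hypothetical helper: counts words by turning the list into an RDD,
  // where reduceByKey is available on key/value pairs.
  def countWords(words: Seq[String]): Map[String, Int] = {
    // Assumption: local mode is enough for this illustration.
    val conf = new SparkConf().setAppName("word-count-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      sc.parallelize(words)      // RDD[String]
        .map(word => (word, 1))  // RDD[(String, Int)]
        .reduceByKey(_ + _)      // sum the counts per word
        .collect()
        .toMap
    } finally {
      sc.stop()                  // always release the context
    }
  }
}
```

Starting a SparkContext only to count a small in-memory list is heavy, so this is mainly worthwhile when the input is already an RDD or too large for a single machine.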

https://dzone.com/articles/wordcount-with-spark-and-scala

https://www.scala-lang.org/api/2.12.8/scala/collection/immutable/List.html

value reduceByKey is not a member of List[(String, Int)]
