Question

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("Wordcount").master("local[*]").getOrCreate()
import spark.implicits._  // encoders needed by flatMap/map on a Dataset
val textf = spark.read.textFile("in/fruits.txt")  // Dataset[String]
val textf2 = textf.flatMap(x => x.split(" "))
val textf3 = textf2.filter(x => x.length > 0)
val textf4 = textf3.map(x => (x, 1))  // only reduce() is offered here, not reduceByKey()

Why is reduceByKey not available?

Answer 1

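When you read through a SparkSession, you are working with a Dataset[String]. The reduceByKey method is not defined on Dataset; it is only available on an RDD of key-value pairs (through PairRDDFunctions). You can convert to an RDD and try the following: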

textf4.rdd.reduceByKey(...)  // textf4 already holds (word, 1) pairs, so no second map is needed
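As a fuller illustration of the suggestion above, here is a minimal sketch, assuming Spark 2.x or later on Scala. The object name WordCountSketch, both helper names, and the in-memory sample lines (standing in for in/fruits.txt) are hypothetical and not part of the original post. It shows the RDD route next to a Dataset-only alternative that uses groupByKey, which may be preferable if you want to stay in the Dataset API.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical names throughout; a sketch, not the poster's actual program.
object WordCountSketch {

  // Route 1: convert the Dataset of pairs to an RDD, where reduceByKey exists.
  def countWithRdd(spark: SparkSession, lines: Seq[String]): Map[String, Int] = {
    import spark.implicits._
    lines.toDS()                              // Dataset[String], like spark.read.textFile
      .flatMap(line => line.split(" "))
      .filter(word => word.length > 0)
      .map(word => (word, 1))                 // Dataset[(String, Int)]
      .rdd                                    // RDD[(String, Int)]
      .reduceByKey(_ + _)                     // sum the 1s per word
      .collect()
      .toMap
  }

  // Route 2: stay on the Dataset API and group by the word itself.
  def countWithDataset(spark: SparkSession, lines: Seq[String]): Map[String, Long] = {
    import spark.implicits._
    lines.toDS()
      .flatMap(line => line.split(" "))
      .filter(word => word.length > 0)
      .groupByKey(identity)                   // KeyValueGroupedDataset[String, String]
      .count()                                // Dataset[(String, Long)]
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("Wordcount").master("local[*]").getOrCreate()
    val sample = Seq("apple banana", "apple  cherry") // assumed sample input
    println(countWithRdd(spark, sample))
    println(countWithDataset(spark, sample))
    spark.stop()
  }
}
```

Note that the two routes return different count types (Int versus Long), because Dataset's count() always produces a Long.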
