How to do a secondary sort in Spark Streaming

Time: 2016-09-01 12:07:15

Tags: apache-spark spark-streaming

I am trying to do a secondary sort in Spark Streaming using the approach mentioned here, but it gives the following error:

repartitionAndSortWithinPartitions is not a member of org.apache.spark.streaming.dstream.DStream

Code:

def ProcessDStream(lines: DStream[EventData]) {
  val dataSetrawSorted = lines.repartitionAndSortWithinPartitions(new DataSetPartitioner(1000))
}

So how can I achieve the same thing with a DStream?

1 Answer:

Answer 0 (score: 0)

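repartitionAndSortWithinPartitions is an operation on key-value RDDs, not on DStreams, so apply it to the RDD of each batch with transform: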

stream.transform { rdd => rdd.repartitionAndSortWithinPartitions(...) }
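
A fuller sketch of the same idea, in case it helps. The question does not show the fields of EventData or the body of DataSetPartitioner, so both are hypothetical stand-ins here, as are EventKey, dataSetId, timestamp and payload. The assumption is that events should be grouped by a data set id and sorted by timestamp within each group. The stream is first mapped to key-value pairs, because repartitionAndSortWithinPartitions needs a pair RDD and an implicit Ordering on the key.

```scala
import org.apache.spark.Partitioner
import org.apache.spark.streaming.dstream.DStream

// Hypothetical event type; the real EventData fields are not shown in the question.
case class EventData(dataSetId: String, timestamp: Long, payload: String)

// Composite key: the partitioner looks only at dataSetId,
// while the ordering compares dataSetId first and timestamp second.
case class EventKey(dataSetId: String, timestamp: Long)

object EventKey {
  // Found through implicit scope when repartitionAndSortWithinPartitions is called.
  implicit val ordering: Ordering[EventKey] =
    Ordering.by((k: EventKey) => (k.dataSetId, k.timestamp))
}

// Hypothetical stand-in for the question's DataSetPartitioner.
// All keys with the same dataSetId land in the same partition.
class DataSetPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    case EventKey(id, _) => java.lang.Math.floorMod(id.hashCode, numPartitions)
    case other           => java.lang.Math.floorMod(other.hashCode, numPartitions)
  }
}

object SecondarySort {
  def processDStream(lines: DStream[EventData]): DStream[(EventKey, EventData)] = {
    // Turn each event into a (composite key, value) pair.
    val keyed = lines.map(e => (EventKey(e.dataSetId, e.timestamp), e))

    // transform exposes the RDD behind each batch, where the RDD-only operation exists.
    keyed.transform { rdd =>
      rdd.repartitionAndSortWithinPartitions(new DataSetPartitioner(1000))
    }
  }
}
```

The sort applies within each partition of each batch RDD. It does not order records across batches.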