Kafka + Java + Spark Streaming + reduceByKeyAndWindow throws exception: org.apache.spark.SparkException: Task not serializable

Time: 2016-08-25 09:13:21

Tags: java serialization spark-streaming

I am new to Kafka and Spark, and I am trying to do some windowed counting, but it keeps failing. The details of the problem are below. Thanks!

The code is as follows:

JavaPairDStream<String,Integer> counts = wordCounts.reduceByKeyAndWindow(new AddIntegers(), new SubtractIntegers(), Durations.seconds(8000), Durations.seconds(4000));

The exception is as follows:


Exception in thread "Thread-3" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:1623)
    at org.apache.spark.streaming.dstream.PairDStreamFunctions.reduceByKeyAndWindow(PairDStreamFunctions.scala:333)
    at org.apache.spark.streaming.dstream.PairDStreamFunctions.reduceByKeyAndWindow(PairDStreamFunctions.scala:299)
    at org.apache.spark.streaming.api.java.JavaPairDStream.reduceByKeyAndWindow(JavaPairDStream.scala:352)
    at KafkaAndDstreamWithIncrement.KDDConsumer.run(KDDConsumer.java:110)
Caused by: java.io.NotSerializableException: KafkaAndDstreamWithIncrement.KDDConsumer
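The "Caused by: java.io.NotSerializableException: KafkaAndDstreamWithIncrement.KDDConsumer" line suggests that the reduce functions were defined as anonymous inner classes inside KDDConsumer, so the closure Spark tries to serialize drags in the enclosing, non-serializable instance. A minimal sketch of that likely problematic pattern (the surrounding class layout here is an assumption, not the asker's actual file):

public class KDDConsumer implements Runnable {   // not Serializable
    // An anonymous inner class declared in an instance context keeps an
    // implicit reference to the enclosing KDDConsumer, which Spark then
    // fails to serialize when cleaning the reduceByKeyAndWindow closure.
    Function2<Integer, Integer, Integer> addIntegers =
        new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer i1, Integer i2) {
                return i1 + i2;
            }
        };
    // ... run() builds the DStream and calls reduceByKeyAndWindow ...
}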

1 Answer:

Answer 0 (score: 0):

Define the reduce functions as static fields, so they no longer hold a reference to the enclosing instance. The code is as follows:

// Requires: import org.apache.spark.api.java.function.Function2;
// Static fields do not capture the enclosing (non-serializable) KDDConsumer instance.
static Function2<Integer, Integer, Integer> AddIntegers = new Function2<Integer, Integer, Integer>() {
    @Override
    public Integer call(Integer i1, Integer i2) {
        return i1 + i2;
    }
};
static Function2<Integer, Integer, Integer> SubtractIntegers = new Function2<Integer, Integer, Integer>() {
    @Override
    public Integer call(Integer i1, Integer i2) {
        return i1 - i2;
    }
};
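With the functions defined as static fields, the windowed count would then reference them directly instead of constructing new objects. A sketch under the question's names (the streaming context variable jssc and the checkpoint directory are assumptions; the durations are copied from the question):

// reduceByKeyAndWindow with an inverse ("subtract") function requires checkpointing.
jssc.checkpoint("/tmp/spark-checkpoint");   // hypothetical checkpoint directory

JavaPairDStream<String, Integer> counts = wordCounts.reduceByKeyAndWindow(
        AddIntegers,              // applied to new values entering the window
        SubtractIntegers,         // applied to old values leaving the window
        Durations.seconds(8000),  // window length, as in the question
        Durations.seconds(4000)); // slide interval, as in the question

counts.print();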