How to get the SparkContext inside foreachRDD in Spark

Date: 2016-10-20 16:13:44

Tags: apache-spark streaming rdd

How can I use `ssc.sparkContext()` inside foreachRDD in Spark Streaming?

If I use `ssc.sparkContext()` inside foreachRDD (Java), essentially something like `ssc.sparkContext().broadcast(map)`, I get a "Task not serializable" error.

If I use `(new JavaSparkContext(rdd.context())).broadcast(map)` instead, there is no problem.

So, is `ssc.sparkContext()` essentially equivalent to `(new JavaSparkContext(rdd.context()))`?

And if I use `(new JavaSparkContext(rdd.context())).broadcast(map)`, is the broadcast variable (that is, the associated `map`) distributed to all executors in the SparkContext?

The code is below. Here, `bcv.broadcastVar = (new JavaSparkContext(rdd.context())).broadcast(map);` works, but `bcv.broadcastVar = ssc.sparkContext().broadcast(map);` does not.

    words.foreachRDD(new Function<JavaRDD<String>, Void>() {
        @Override
        public Void call(JavaRDD<String> rdd) throws Exception {
            if (rdd != null) {
                System.out.println("Hello World - words - SSC !!!"); // Gets printed on Driver
                if (stat.data_changed == 1) {
                    stat.data_changed = 0;
                    bcv.broadcastVar.unpersist(); // Unpersist BC variable
                    bcv.broadcastVar = (new JavaSparkContext(rdd.context())).broadcast(map); // Re-broadcast same BC variable with NEW data
                }
            }

            rdd.foreachPartition(new VoidFunction<Iterator<String>>() {
                @Override
                public void call(Iterator<String> items) throws Exception {
                    System.out.println("words.foreachRDD.foreachPartition: CALLED ..."); // Gets called on Worker/Executor
                    Integer index = 1;
                    String lastKey = "";
                    Integer lastValue = 0;
                    while (true) {
                        String key = "A" + Long.toString(index);
                        Integer value = bcv.broadcastVar.value().get(key); // Executor Consumes map
                        if (value == null) break;
                        lastKey = key;
                        lastValue = value;
                        index++;
                    }
                    System.out.println("Executor BC: key/value: " + lastKey + " = " + lastValue);
                    return;
                }
            });

            return null;
        }
    });
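One possible reading of the error, which nothing in this thread confirms, is that referencing `ssc` from inside the anonymous function makes the function object hold a reference to the streaming context, and that reference cannot be serialized when Spark tries to serialize the function. The sketch below reproduces that mechanism with plain Java serialization only, so it runs without Spark. `DriverOnlyContext`, `SerializableTask`, and every method name in it are hypothetical stand-ins, not Spark classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureCaptureDemo {

    // Hypothetical stand-in for a driver-only object such as a streaming
    // context. It deliberately does NOT implement Serializable.
    static class DriverOnlyContext {
        String broadcast(String data) {
            return "broadcast(" + data + ")";
        }
    }

    // Hypothetical stand-in for a serializable function interface.
    interface SerializableTask extends Serializable {
        void run();
    }

    // Returns true if plain Java serialization accepts the object.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // The anonymous class captures the local 'ctx', so the context becomes
    // a field of the task object and is serialized along with it.
    static SerializableTask capturingContext() {
        final DriverOnlyContext ctx = new DriverOnlyContext();
        return new SerializableTask() {
            @Override
            public void run() {
                ctx.broadcast("map");
            }
        };
    }

    // The context is obtained inside run(), so the task object has no
    // field that refers to it.
    static SerializableTask withoutCapture() {
        return new SerializableTask() {
            @Override
            public void run() {
                DriverOnlyContext ctx = new DriverOnlyContext();
                ctx.broadcast("map");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("captured context serializable: "
                + isSerializable(capturingContext())); // false
        System.out.println("context obtained inside serializable: "
                + isSerializable(withoutCapture()));   // true
    }
}
```

If this reading applies, obtaining the context from the `rdd` argument inside `call`, as the working variant of the question does, would avoid the captured reference in the same way `withoutCapture` does here.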

0 answers:

No answers