Spark asynchronous job fails with error

Time: 2017-09-20 16:09:52

Tags: apache-spark asynchronous spark-submit

I am writing Spark code in Java. When I use foreachAsync, Spark fails and gives me:

java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

This code works fine:

    JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
    JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");

    wholeTextFiles.foreach(new VoidFunction<Tuple2<String, String>>() {
        public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
            // do something
        }
    });

But when I call foreachAsync instead of foreach, it returns the error above. Where did I go wrong?

1 Answer:

Answer 0 (score: 3)

It is because foreachAsync returns a Future object, and when you leave the function, the Spark context is closed (because it was created locally).

If you call get() on the result of foreachAsync(), the main thread will wait for the Future to complete.
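
A minimal sketch of that fix, reusing the placeholders from the question ("somePath", "MyAppName"; the class name here is made up for illustration): blocking on the JavaFutureAction returned by foreachAsync keeps the main thread, and with it the locally created SparkContext, alive until the job finishes.

    import org.apache.spark.api.java.JavaFutureAction;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.VoidFunction;
    import scala.Tuple2;

    public class ForeachAsyncFix {
        public static void main(String[] args) throws Exception {
            JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
            JavaPairRDD<String, String> wholeTextFiles =
                    sparkContext.wholeTextFiles("somePath");

            // foreachAsync submits the job and returns a future immediately,
            // instead of blocking the way foreach does.
            JavaFutureAction<Void> job = wholeTextFiles.foreachAsync(
                    new VoidFunction<Tuple2<String, String>>() {
                        public void call(Tuple2<String, String> pair) throws Exception {
                            // do something with each (fileName, content) pair
                        }
                    });

            // Wait for the asynchronous job to finish. Without this, main()
            // returns, the context is stopped, and the running job dies with
            // "Cannot call methods on a stopped SparkContext".
            job.get();

            sparkContext.stop();
        }
    }

Any other way of keeping the driver alive until the future completes works just as well; the key point is that the asynchronous action must not outlive the SparkContext that launched it.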