I'm writing Spark code in Java. When I use foreachAsync, Spark fails and gives me:

java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

With foreach it works fine. But in this code:
JavaSparkContext sparkContext = new JavaSparkContext("local", "MyAppName");
JavaPairRDD<String, String> wholeTextFiles = sparkContext.wholeTextFiles("somePath");
wholeTextFiles.foreachAsync(new VoidFunction<Tuple2<String, String>>() {
    public void call(Tuple2<String, String> stringStringTuple2) throws Exception {
        // do something
    }
});
it returns the error. Where am I going wrong?
Answer (score: 3)
Because foreachAsync returns a Future object, the Spark context is shut down as soon as you leave the enclosing function (since it was created locally), while the asynchronous job is still running. If you call get() on the result of foreachAsync(), the main thread will block until the Future completes, so the context stays alive for the duration of the job.