I want to stop the Java streaming context in Spark after 100 records from a file have been processed. The problem is that the code inside the if statement is never executed once the streaming starts. The following code illustrates what I mean:
public static void main(String[] args) throws Exception {
    int ff = testSparkStreaming();
    System.out.println("wqwqwq");
    System.out.println(ff);
}

public static int testSparkStreaming() throws IOException, InterruptedException {
    int numberInst = 0;
    String savePath = "Path to Model";
    // jssc is the JavaStreamingContext, created elsewhere (not shown in this snippet)
    final NaiveBayesModel savedModel = NaiveBayesModel.load(jssc.sparkContext().sc(), savePath);

    BufferedReader br = new BufferedReader(new FileReader("C://testStream//copy.csv"));
    Queue<JavaRDD<String>> rddQueue = new LinkedList<JavaRDD<String>>();
    List<String> list = Lists.newArrayList();
    String line = "";
    while ((line = br.readLine()) != null) {
        list.add(line);
    }
    br.close();

    rddQueue.add(jssc.sparkContext().parallelize(list));
    numberInst += list.size();
    JavaDStream<String> dataStream = jssc.queueStream(rddQueue);
    dataStream.print();

    if (numberInst == 100) {
        System.out.println("should stop");
        jssc.wait();
    }

    jssc.start();
    jssc.awaitTermination();
    return numberInst;
}
My question is how to stop the streaming when numberInst == 100 and return execution to the main method so that the statements after the call can run.
P.S.: In the code above, the if statement is never executed:
if (numberInst == 100) {
    System.out.println("should stop");
    jssc.wait();
}
Answer 0 (score: 2)
You can try this:
jssc.start();
while (numberInst < 100) {
    jssc.awaitTerminationOrTimeout(1000); // 1 second polling time, change it as needed for your use case
}
jssc.stop();
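For this polling check to actually fire, the counter has to be something the streaming job itself updates; a plain local int such as numberInst in the question is only evaluated once on the driver before the batches run. Below is a minimal sketch of that idea, assuming the jssc and dataStream from the question; the counter name recordCount is illustrative and requires java.util.concurrent.atomic.AtomicLong:

// Sketch only, not the original poster's code: count records per micro-batch
// on the driver and poll the counter while the stream runs.
final AtomicLong recordCount = new AtomicLong(0);

dataStream.foreachRDD(rdd -> {
    // foreachRDD runs on the driver, so a plain AtomicLong is enough here;
    // rdd.count() is an action returning the size of this micro-batch
    recordCount.addAndGet(rdd.count());
});

jssc.start();
while (recordCount.get() < 100) {
    // poll once per second; returns early if the context terminates on its own
    jssc.awaitTerminationOrTimeout(1000);
}
jssc.stop();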
Answer 1 (score: 0)
Have you tried stopping it the way you would stop a thread, that is, by interrupting it?
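One way to read this suggestion is to stop the streaming context from a separate watcher thread while the main thread blocks in awaitTermination(). A rough sketch under that interpretation, reusing the hypothetical recordCount counter from the previous sketch (all names are assumptions, not from the original post):

// Sketch only: a watcher thread stops the context once the condition holds
Thread watcher = new Thread(() -> {
    try {
        while (recordCount.get() < 100) {
            Thread.sleep(1000);             // check once per second
        }
        jssc.stop(false);                   // stop streaming, keep the SparkContext
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // exit quietly if interrupted
    }
});
watcher.start();

jssc.start();
jssc.awaitTermination();                    // returns once the watcher stops jssc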