Spark Streaming does not read files from a local directory on Windows

Date: 2015-11-07 16:34:54

Tags: apache-spark real-time spark-streaming

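import java.io.Serializable;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingWordCount implements Serializable {

    public static void main(String[] args) {
        // Local mode with two threads and a 1-second batch interval.
        JavaStreamingContext jssc = new JavaStreamingContext("local[2]", "JavaWordCount",
                new Duration(1000));
        // Watch the directory for text files.
        JavaDStream<String> data = jssc.textFileStream("D:/krishna/").cache();
        // foreachRDD supersedes the deprecated foreach (Spark 1.x Java API).
        data.foreachRDD(new Function<JavaRDD<String>, Void>() {
            @Override
            public Void call(JavaRDD<String> rdd) throws Exception {
                List<String> output = rdd.collect();
                System.out.println("Sentences Collected from files " + output);
                return null;
            }
        });

        data.print();
        jssc.start();
        jssc.awaitTermination();
    }
}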

I am using Spark standalone on Windows 8.

Does JavaStreamingContext work only with HDFS directories?

The console output shows that the list of sentences collected from the files is empty. I tried changing the directory and the files, but the code still does not pick up any files. This is my first Spark Streaming program. Please help.

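1 answer:

Answer 0: (score: 0)

Reduce the batch interval to 5 ms to 10 ms. If any job takes longer than the configured batch interval, the jssc.awaitTermination() method waits for the pending jobs to complete and handles the batch timing automatically.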
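
One more thing that may be worth checking, beyond the batch interval: the Spark Streaming programming guide describes file streams as processing only files that appear in the monitored directory after the stream has started, and it asks that files be placed there by atomically moving or renaming them. Files that were already sitting in D:/krishna/ before jssc.start() would therefore likely be ignored. Below is a minimal sketch of such a file drop using only the Java standard library. The class name AtomicFileDrop, the helpers drop and tempDir, and the file name sentences.txt are hypothetical and chosen for illustration; the temporary directories stand in for a staging folder and the watched folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: stages a file and moves it
// into a watched directory in one step.
class AtomicFileDrop {

    // Writes the lines to a staging file outside the watched directory,
    // then moves the finished file into the watched directory atomically.
    static Path drop(Path stagingDir, Path watchedDir, String fileName, List<String> lines) {
        try {
            Path staged = stagingDir.resolve(fileName + ".tmp");
            Files.write(staged, lines, StandardCharsets.UTF_8);
            Path target = watchedDir.resolve(fileName);
            // ATOMIC_MOVE fails with an exception if the move cannot be atomic,
            // for example when the two directories are on different volumes.
            return Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a scratch directory so the sketch can run anywhere.
    static Path tempDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: in the question's setup, watchedDir would be D:/krishna/
        // and stagingDir another folder on the same drive.
        Path stagingDir = tempDir("staging");
        Path watchedDir = tempDir("watched");
        Path landed = drop(stagingDir, watchedDir, "sentences.txt",
                Arrays.asList("hello spark", "hello streaming"));
        System.out.println("Moved into watched directory: " + landed);
    }
}
```

To try this against the question's program, start the streaming job first and then run the drop while the job is running, so that the file shows up as new in a later batch. Keeping the staging folder on the same drive as the watched folder matters because an atomic move is generally not possible across volumes.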