Reading files with StreamingContext.textFileStream, but no results are shown on the console

Time: 2016-06-15 14:07:08

Tags: apache-spark

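import org.apache.spark.streaming.{Seconds, StreamingContext}

// `conf` is a SparkConf defined elsewhere in the application
def textfile(): Unit = {
   val ssc = new StreamingContext(conf, Seconds(10))

   // Monitor the HDFS directory for files that appear after the stream starts
   val lines = ssc.textFileStream("hdfs://master:9000/streaming/")
   val words = lines.flatMap(_.split("\\s"))

   // Count occurrences of each word within every 10-second batch
   val pairs = words.map(word => (word, 1))
   val wordCounts = pairs.reduceByKey(_ + _)

   wordCounts.print()
   ssc.start()
   ssc.awaitTermination()

}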

No results are displayed.

1 answer:

Answer 0 (score: 0)

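textFileStream only picks up files that appear in the directory after the streaming application has started. If you also want to process files that already exist, you can use the following workaround: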

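import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

ssc.fileStream[LongWritable, Text, TextInputFormat](
  directory,
  filter = (path: Path) => !path.getName().startsWith("."),
  newFilesOnly = false).map(_._2.toString)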
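
Another way to check that the stream works is to drop a fresh file into the monitored directory while the application is running. The Spark Streaming guide asks for files to be moved or renamed into the directory atomically, so a rough sketch is to write the file to a staging location first and then move it. The example below uses a local directory and java.nio only; DropFile, stageThenMove, and the directory names are made up for illustration. For HDFS the equivalent step would be a rename on the cluster (for example `hdfs dfs -mv`), with both directories on the same filesystem.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper, not part of Spark: stages a file and then moves it
// into the directory that the streaming job is watching.
object DropFile {
  def stageThenMove(stagingDir: Path, watchedDir: Path, name: String, content: String): Path = {
    Files.createDirectories(stagingDir)
    Files.createDirectories(watchedDir)

    // Write the full content outside the watched directory first,
    // so the stream never sees a half-written file.
    val staged = stagingDir.resolve(name)
    Files.write(staged, content.getBytes(StandardCharsets.UTF_8))

    // Atomic move: the file shows up in the watched directory in one step.
    // Assumes both directories live on the same filesystem.
    Files.move(staged, watchedDir.resolve(name), StandardCopyOption.ATOMIC_MOVE)
  }
}
```

As far as I know, Spark decides whether a file is new from its modification time, so write and move the file shortly before the batch you expect it in; moving an old file in may not trigger anything.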