Spark Scala file stream

Time: 2018-03-25 18:47:18

Tags: scala apache-spark spark-streaming rdd

I am new to Spark and Scala. I want to keep reading files from a folder and persist the file contents in Cassandra. I have written a simple Scala program that uses file streaming to read the file contents, but it does not read any files from the specified folder.

Can anybody correct my sample code below?

I am using Windows 7.

Code:

    val spark = SparkHelper.getOrCreateSparkSession()
    val ssc = new StreamingContext(spark.sparkContext, Seconds(1))
    val lines = ssc.textFileStream("file:///C:/input/")
    lines.foreachRDD(file => {
      file.foreach(fc => {
        println(fc)
      })
    })
    ssc.start()
    ssc.awaitTermination()
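For reference, a common reason `textFileStream` appears to read nothing is that it only picks up files created in (or atomically moved into) the monitored directory *after* the stream starts; files already present are ignored. A minimal self-contained sketch of the streaming variant, assuming a local run (the `SparkHelper` from the question is replaced by a plain `SparkSession` builder, and the object/app names are made up here):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileStreamDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("FileStreamDemo")
      .master("local[2]") // streaming needs at least two local threads
      .getOrCreate()

    val ssc = new StreamingContext(spark.sparkContext, Seconds(1))

    // Only files created or moved into this folder *after* start() are picked up;
    // files already present when the stream starts are silently ignored.
    val lines = ssc.textFileStream("file:///C:/input/")

    lines.foreachRDD { rdd =>
      // collect to the driver so println output is visible in local mode
      rdd.collect().foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

To see output, copy a new text file into `C:/input/` while the application is running, rather than starting it against a folder of pre-existing files.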

1 Answer:

Answer 0: (Score: 0)

I think this scenario calls for a normal Spark job rather than Spark Streaming. Use Spark Streaming when your source streams data in continuously, such as Kafka or a network socket.
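A minimal batch sketch of what this answer describes, reading the folder once and writing to Cassandra. It assumes the DataStax spark-cassandra-connector is on the classpath and that a keyspace and table already exist; the keyspace/table names and the connection host below are placeholders, not from the question:

```scala
import org.apache.spark.sql.SparkSession

object FileToCassandra {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("FileToCassandra")
      // hypothetical Cassandra host; adjust for your cluster
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()

    // Read every text file in the folder as a DataFrame with one "value" column
    val df = spark.read.text("file:///C:/input/")

    // Append to Cassandra via the spark-cassandra-connector
    // ("ks" and "files" are placeholder keyspace/table names)
    df.write
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "ks")
      .option("table", "files")
      .mode("append")
      .save()

    spark.stop()
  }
}
```

Unlike the streaming version, this processes the files already sitting in the folder, which matches the question's use case; it could be rerun on a schedule if new files keep arriving.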