I use this line to print the RDD count for each batch:
myDStream.count.print
I get output like this:
-------------------------------------------
Time: 1501499254000 ms
-------------------------------------------
2
-------------------------------------------
Time: 1501499256000 ms
-------------------------------------------
0
-------------------------------------------
Time: 1501499258000 ms
-------------------------------------------
0
I just want to reformat this message so it looks like this:
-------------------------------------------
Time: 1501499254000 ms
-------------------------------------------
log.info Got new batch with 2 messages
-------------------------------------------
Time: 1501499256000 ms
-------------------------------------------
log.info Got new batch with 0 messages
-------------------------------------------
Time: 1501499258000 ms
-------------------------------------------
log.info Got new batch with 0 messages
Do you have any ideas?
Answer 0: (score: 2)
The implementation of print is fixed, so its output format cannot be changed. If we want different output, we need to roll our own implementation:
dstream.foreachRDD { (rdd, time) =>
  // Count the records in this batch (runs a Spark job).
  val count = rdd.count()
  println("-------------------------------------------")
  // Time.toString already appends " ms", e.g. "1501499254000 ms".
  println(s"Time: $time")
  println("-------------------------------------------")
  println(s"log.info Got new batch with $count messages")
}
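If the same banner is needed in more than one place, the formatting can be pulled out into a small pure function, which also makes it easy to check without a running stream. The following is only a sketch: `BatchBanner` and `render` are hypothetical names that are not part of Spark, and the message text simply mirrors the output requested in the question.

```scala
// Hypothetical helper, not part of Spark: builds the same banner that
// print() shows, followed by a custom message line.
object BatchBanner {
  // Separator copied verbatim from the output shown in the question.
  private val Rule = "-------------------------------------------"

  // timeMs: batch time in milliseconds; count: number of records in the batch.
  def render(timeMs: Long, count: Long): String =
    Seq(
      Rule,
      s"Time: $timeMs ms",
      Rule,
      s"log.info Got new batch with $count messages"
    ).mkString("\n")
}

// Intended use inside a streaming job (needs Spark on the classpath):
// dstream.foreachRDD { (rdd, time) =>
//   println(BatchBanner.render(time.milliseconds, rdd.count()))
// }
```

Replacing `println` with a call to whatever logger the application already uses should work the same way, since the function body passed to `foreachRDD` runs on the driver.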