Programmatically reduce logging in the Spark shell

Time: 2015-01-09 11:26:18

Tags: scala shell apache-spark

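Can I programmatically reduce the logging in the Spark shell by removing all the "INFO" messages? They are spamming my window and I can't analyze the actual output. For example: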

15/01/09 12:23:02 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 649 bytes result sent to driver
15/01/09 12:23:02 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 500 ms on localhost (1/1)
15/01/09 12:23:02 INFO DAGScheduler: Stage 0 (count at MainApp.scala:31) finished in 0.520 s
15/01/09 12:23:02 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/01/09 12:23:02 INFO DAGScheduler: Job 0 finished: count at MainApp.scala:31, took 0.639191 s

I'm also open to alternatives, if you have any!

3 answers:

Answer 0 (score: 9)

This removes most of the INFO messages (though not all):

import org.apache.log4j.{Level, Logger}
// ...
val level = Level.WARN
Logger.getLogger("org").setLevel(level)
Logger.getLogger("akka").setLevel(level)

Or as a utility method:

def setLogLevel(level: String): Unit = {
  setLogLevel(Level.toLevel(level, Level.INFO))
}

def setLogLevel(level: Level): Unit = {
  Logger.getLogger("org").setLevel(level)
  Logger.getLogger("akka").setLevel(level)
}
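
A minimal sketch of how these two overloads might be packaged and called, for example pasted into spark-shell with :paste. The object name LogUtils is hypothetical (it is not part of the answer), and the sketch assumes log4j 1.x is on the classpath. It also shows the fallback in Level.toLevel: a name log4j does not recognize resolves to the default passed as the second argument.

```scala
import org.apache.log4j.{Level, Logger}

// Hypothetical wrapper object; the name LogUtils is an assumption for illustration.
object LogUtils {
  // Parse a level name; unrecognized names fall back to INFO.
  def setLogLevel(level: String): Unit =
    setLogLevel(Level.toLevel(level, Level.INFO))

  // Apply the level to the "org" and "akka" logger hierarchies.
  def setLogLevel(level: Level): Unit = {
    Logger.getLogger("org").setLevel(level)
    Logger.getLogger("akka").setLevel(level)
  }
}

// Level names are matched case-insensitively by log4j.
LogUtils.setLogLevel("warn")

// A misspelled name does not throw; it silently resolves to the INFO default.
val fallback = Level.toLevel("verbose", Level.INFO)
```

Because of that silent fallback, a typo in the level name would quietly bring the INFO messages back, so passing a Level constant directly may be the safer overload.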

Answer 1 (score: 0)

spark-submit --options artifact.jar 2>stderr.log

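Then run tail -f stderr.log in a new window.

This keeps the main window clean, showing only stdout, while the logs remain within reach when you need them.

Answer 2 (score: 0)

Just add this to your Spark application's source code and it will turn the logs off: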

import org.apache.log4j.Logger
import org.apache.log4j.Level

Logger.getLogger("org").setLevel(Level.OFF)
Logger.getLogger("akka").setLevel(Level.OFF)
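
Note that Level.OFF also hides warnings and errors. If the goal is only to quiet one noisy step, the level can be lowered temporarily and restored afterwards. The following is a sketch under assumptions, not part of the answer: the helper name withLogLevel and its default logger names are hypothetical, and it relies on log4j 1.x, where getLevel returns null for a logger that inherits its level and setLevel(null) restores that inheritance.

```scala
import org.apache.log4j.{Level, Logger}

// Hypothetical helper: run `body` with the named loggers set to `level`,
// then put back whatever levels they had before (null means "inherit").
def withLogLevel[T](level: Level, names: Seq[String] = Seq("org", "akka"))(body: => T): T = {
  val loggers = names.map(n => Logger.getLogger(n))
  val previous = loggers.map(_.getLevel)
  loggers.foreach(_.setLevel(level))
  try body
  finally loggers.zip(previous).foreach { case (logger, old) => logger.setLevel(old) }
}

// Stand-in for a noisy computation; in spark-shell this could be
// something like sc.parallelize(1 to 100).count().
val result = withLogLevel(Level.ERROR) { (1 to 100).sum }
```

Using Level.ERROR here rather than Level.OFF keeps genuine failures visible while the block runs.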