How do I redirect Spark output to the console?

Asked: 2015-10-21 23:00:24

Tags: hadoop apache-spark cloudera yarn

I am running Spark jobs on a CDH cluster, and all the logs end up as text files on the history server. Is there a way to get that output printed on the console? All I see is:

15/10/21 15:47:09 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:10 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:11 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:12 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:13 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:14 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:15 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:16 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:17 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:18 INFO yarn.Client: Application report for application_1445455790310_0014 (state: ACCEPTED)
15/10/21 15:47:19 INFO yarn.Client: Application report for application_1445455790310_0014 (state: RUNNING)

1 answer:

Answer 0: (score: 0)

Spark's logging system is built on log4j.

According to the official documentation, Spark ships with a configuration file named log4j.properties.template in which you can set the various logging properties. The file is located in the conf folder under the Spark home directory.

For Spark to detect and use this configuration file, you need to rename it by removing the .template suffix.
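For example, assuming a standard Spark layout (a sketch; adjust the path to your installation):

cd $SPARK_HOME
cp conf/log4j.properties.template conf/log4j.properties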

The default template looks something like this:

# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR

...

In this template the default logging destination is set to the console. Although I haven't tested it, with this sample configuration you should be able to see the output on the console.
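If you want to be sure this configuration is picked up when submitting to YARN, a common approach (a sketch, not verified on CDH; /path/to/log4j.properties, com.example.MyJob and my-job.jar are placeholders) is to ship the file with spark-submit and point log4j at it:

# Sketch: ship the edited log4j.properties and point both driver and executors at it
spark-submit \
  --master yarn --deploy-mode client \
  --files /path/to/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyJob \
  my-job.jar

In client deploy mode the driver runs inside the spark-submit process, so whatever the console appender writes to System.err appears in your terminal; the executors' output still goes to the YARN container logs, which you can fetch with yarn logs -applicationId <appId> or browse via the history server.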