To meet a requirement, I want to keep the Spark master's logs so that errors are recorded when something goes wrong. I know the worker logs are available on the web UI, but I am not sure whether they show the same errors as the master.
I found that I have to modify conf/log4j.properties, but my attempts have not worked so far.
Default configuration, plus the file appender added:
# Set everything to be logged to the console
log4j.rootCategory=INFO, console, file
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
# SPARK-9183: Settings to avoid annoying messages when looking up
# nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
My attempt at configuring the file appender:
###Custom log file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.fileName=/var/data/log/MasterLogs/master.log
log4j.appender.file.ImmediateFlush=true
## Set Append to false to overwrite the file on restart
log4j.appender.file.Append=false
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
##Define the layout for file appender
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
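One detail worth checking in the attempt above (my observation, not part of the original post): log4j 1.x's RollingFileAppender takes its output path from the File property, not fileName, so the fileName line is never applied and the appender has no file to write to. A corrected sketch of the appender block, assuming /var/data/log/MasterLogs exists and is writable by the user running Spark:

### Custom log file
log4j.appender.file=org.apache.log4j.RollingFileAppender
# log4j 1.x expects "File" here; "fileName" is not a recognized property
log4j.appender.file.File=/var/data/log/MasterLogs/master.log
log4j.appender.file.ImmediateFlush=true
## Append=false overwrites the log file on every restart
log4j.appender.file.Append=false
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
## Layout for the file appender
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n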
Answer 0 (score: 3)
You need to create two log4j.properties files, one for the driver and one for the executors, and pass each of them through the corresponding Java options when you submit your application with spark-submit:
spark-submit --class MAIN_CLASS --driver-java-options "-Dlog4j.configuration=file:PATH_OF_LOG4J" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:PATH_OF_LOG4J" --master MASTER_IP:PORT JAR_PATH
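For example, with the placeholders filled in (the class name, file paths, and master URL below are illustrative, not from the original answer):

spark-submit \
  --class com.example.MyApp \
  --driver-java-options "-Dlog4j.configuration=file:/opt/spark/conf/log4j-driver.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/opt/spark/conf/log4j-executor.properties" \
  --master spark://master-host:7077 \
  /opt/jars/my-app.jar

Note that file: URLs are resolved locally by each JVM, so the executor properties file must exist at that same path on every worker node.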
You can also check this blog post for more details: https://blog.knoldus.com/2016/02/23/logging-spark-application-on-standalone-cluster/
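The answer does not show the two properties files themselves; a minimal executor-side log4j.properties might look like the following sketch (the path, appender name, and levels are illustrative assumptions):

# Route all executor logging to a rolling file on the worker's local disk
log4j.rootCategory=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=/var/log/spark/executor.log
log4j.appender.rolling.MaxFileSize=50MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n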
Answer 1 (score: 2)
Run the command below; the > redirects standard output to tempoutfile.txt, and 2>&1 merges standard error into it, so both the application output and the console log end up in the file:
hadoop@osboxes:~/spark-2.0.1-bin-hadoop2.7/bin$ ./spark-submit test.py > tempoutfile.txt 2>&1