Custom logger to a file in Spark

Time: 2018-01-12 08:33:50

Tags: apache-spark logging

I would like to create a custom logger in Spark. I want to send some messages from the executors to a local file for debugging. I tried to follow this tutorial, so I edited the log4j.properties file like this to create a custom logger that saves its logs in /mypath/sparkU.log:

# My added lines
log4j.logger.myLogger=WARN, RollingAppenderU 
log4j.appender.RollingAppenderU=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppenderU.File=/mypath/sparkU.log
log4j.appender.RollingAppenderU.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppenderU.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppenderU.layout.ConversionPattern=[%p] %d %c %M - %m%n

log4j.rootLogger=${root.logger}
root.logger=WARN,console       
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
shell.log.level=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
log4j.logger.org.apache.spark.repl.Main=${shell.log.level}
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=${shell.log.level}
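
The properties file above usually also has to be shipped to the executors and pointed at by their JVMs; editing only the copy on the driver node would leave the executors on the default configuration. A minimal sketch of that submission-side wiring (the path and config keys here are assumptions, not something taken from the question):

# Sketch: distribute the edited log4j.properties to the YARN containers and
# point the executor JVMs at it. The driver JVM is already running by the
# time this code executes, so its own -Dlog4j.configuration normally has to
# be passed on the spark-submit command line instead.
from pyspark.sql import SparkSession

spark = SparkSession \
        .builder \
        .master("yarn") \
        .appName("test custom logging") \
        .config("spark.yarn.dist.files", "file:/mypath/log4j.properties") \
        .config("spark.executor.extraJavaOptions",
                "-Dlog4j.configuration=log4j.properties") \
        .getOrCreate()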

Then I run this with spark-submit (I usually work in Python, but the language is not the issue):

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext, SparkSession
from pyspark.sql.types import *

spark = SparkSession \
        .builder \
        .master("yarn") \
        .appName("test custom logging") \
        .config("spark.some.config.option", "some-value") \
        .getOrCreate()

# access the JVM's log4j through the py4j gateway
log4jLogger = spark.sparkContext._jvm.org.apache.log4j

log = log4jLogger.LogManager.getLogger(__name__)

log.error("Hello demo")
log.error("I am done")

print('hello from print')

But the file sparkU.log, when created, is empty. The Spark logs on the console and in HDFS are produced correctly. Why is the log file empty, and what is the right way to do something like this? I am using Spark 2.1 under YARN, on a Cloudera distribution. Thanks for any advice.
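
For reference, the properties above bind the file appender to the logger name myLogger, while getLogger(__name__) resolves to "__main__", which only inherits the console root logger. A sketch of addressing the configured name directly (a guess based on the configuration shown, not a verified fix):

# The appender RollingAppenderU is attached to the name "myLogger" in the
# properties file, so that exact name has to be requested; the threshold
# there is WARN, so warn() and above should reach /mypath/sparkU.log.
log4jLogger = spark.sparkContext._jvm.org.apache.log4j
custom_log = log4jLogger.LogManager.getLogger("myLogger")
custom_log.warn("routed to the custom file appender")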

0 answers:

No answers