I am working on a PySpark application deployed in yarn-cluster mode. I have attached stdout as a log stream handler, and I can see the logs in the YARN UI. However, under /var/log/sparkapp/yarn I cannot find the stdout logs; I only see the stderr logs there. What could be the reason?
This is the logging section of my application:
import logging
import sys

# Send application logs to stdout so they end up in the container's stdout file
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
lsh = logging.StreamHandler(sys.stdout)
lsh.setLevel(logging.INFO)
lformat = logging.Formatter(fmt='%(asctime)s.%(msecs)03d %(levelname)s :%(name)s - %(message)s',
                            datefmt='%m/%d/%Y %I:%M:%S')
lsh.setFormatter(lformat)
logger.addHandler(lsh)
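With this handler attached, a call like the one below should be written to sys.stdout and therefore land in the container's stdout file (a minimal usage sketch, not part of the original snippet):

logger.info("application started")  # formatted by lformat, written to sys.stdout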
log4j.properties
log4jspark.root.logger=INFO,console
log4jspark.log.dir=.
log4jspark.log.file=spark.log
log4jspark.log.maxfilesize=1024MB
log4jspark.log.maxbackupindex=10
# Define the root logger via the property "log4jspark.root.logger".
log4j.rootLogger=${log4jspark.root.logger}, EventCounter
# Set everything to be logged to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.appender.console.Threshold=INFO
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
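To narrow down which stream ends up in which container log file, a quick marker test can help (a diagnostic sketch, assuming the job runs in yarn-cluster mode; the marker strings are made up):

import sys
print("marker-stdout", file=sys.stdout)  # should appear in the container's stdout file
print("marker-stderr", file=sys.stderr)  # should appear in the container's stderr file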
Answer 0 (score: 0)
Try using this code to obtain a logger for your Spark job:
# Access log4j through the SparkContext's JVM gateway
log4jLogger = sc._jvm.org.apache.log4j
logger = log4jLogger.LogManager.getLogger(__name__)
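Assuming sc is an active SparkContext, the returned JVM logger can then be used like any log4j logger (a minimal usage sketch):

logger.info("message routed through Spark's log4j configuration")
logger.warn("so it follows the console appender's target stream")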
You can also modify log4j.properties to change the console appender's target stream:
log4j.appender.console.target=System.out
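Note that in yarn-cluster mode the driver picks up log4j.properties on the cluster, so the modified file usually has to be shipped with the job. One common pattern is shown below (a sketch; app.py is a placeholder and the exact flags can vary by Spark version):

spark-submit --master yarn --deploy-mode cluster \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  app.py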