Question

I am trying to use log4j2.xml instead of Spark's default log4j logging.

My log4j2.xml is as follows:

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE log4j:configuration PUBLIC
  "-//APACHE//DTD LOG4J 1.2//EN" "http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/xml/doc-files/log4j.dtd">

<Configuration status="WARN" name="MyApp" monitorInterval="30">

        <Properties>
            <Property name="appName">MyApp</Property>
            <Property name="appenderPatternLayout">%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n</Property>
            <Property name="fileName">/app/vodip/logs/${appName}.log</Property>
        </Properties>

        <Appenders>
            <RollingFile name="RollingFile"
                         fileName="${fileName}"
                         filePattern="a1
                         ${appName}-%d{yyyy-MM-dd-HH}-%i.log">
                <PatternLayout>
                    <Pattern>${appenderPatternLayout}</Pattern>
                </PatternLayout>
                <Policies>
                    <TimeBasedTriggeringPolicy interval="4" modulate="true"/>
                    <SizeBasedTriggeringPolicy size="250 MB"/>
                </Policies>
            </RollingFile>
        </Appenders>


      <Loggers>
          <Logger name="xyz.abcs.MyApp" level="debug" additivity="false">
              <AppenderRef ref="RollingFile"/>
          </Logger>
          <Root level="debug">
              <AppenderRef ref="RollingFile"/>
          </Root>
      </Loggers>

    </Configuration>

I placed my log4j2.xml in the spark/conf folder on all nodes, restarted Spark, and submitted the Spark program as follows.

spark-submit --master spark://xyzzz.net:7077 \
--class abcd.myclass \
--deploy-mode cluster --executor-memory 2G --total-executor-cores 4  \
--conf spark.network.timeout=150 \
--files /app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j2.xml" \
--driver-java-options "-Dlog4j.configuration=file:/app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml" \
/app/spark/my.jar

I see the following in my worker stderr log, which means my logging is not using log4j2.

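log4j:WARN Continuable parsing error 10 and column 78
log4j:WARN Document root element "Configuration", must match DOCTYPE root "null".
log4j:WARN Continuable parsing error 10 and column 78
log4j:WARN Document is invalid: no grammar found.
log4j:ERROR DOM element is - not a <log4j:configuration> element.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties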

Can anyone tell me what is wrong with the configuration?

Answer 1

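Have you also placed the log4j file in your project's resources folder? If it is there, remove it. To log a Spark application with log4j on both the driver and the executors, you should also pass the path to the log4j file for the driver and for the executors when you submit the application.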
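A minimal sketch of what passing that path could look like, assuming a Log4j 1.x properties file that exists at the same location on every node. The file path and the shell variable names are hypothetical; the master URL, class, and jar are taken from the question. The leading echo makes this a dry run that only prints the command.

```shell
#!/bin/sh
# Hypothetical path: a Log4j 1.x properties file present on every node.
LOG4J_PROPS="/app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j.properties"

# Log4j 1.x reads the log4j.configuration system property;
# the file: prefix tells it the value is a URL to a local file.
LOG4J_OPT="-Dlog4j.configuration=file:${LOG4J_PROPS}"

# Dry run: print the command instead of running it. Remove "echo" to submit.
echo spark-submit --master spark://xyzzz.net:7077 \
  --class abcd.myclass \
  --conf "spark.driver.extraJavaOptions=${LOG4J_OPT}" \
  --conf "spark.executor.extraJavaOptions=${LOG4J_OPT}" \
  /app/spark/my.jar
```

The same option string is given to both sides so that the driver and the executors load the same logging configuration.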

You can also refer to this blog for more details: https://blog.knoldus.com/2016/02/23/logging-spark-application-on-standalone-cluster/

Answer 2

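There is at least one mistake on the command line that can cause this error.

-Dlog4j.configuration=. . . must actually be -Dlog4j.configurationFile=. . . when using log4j2.

log4j.configuration is parsed by the old log4j, which obviously does not understand the new configuration format and throws parsing errors.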
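As a rough sketch of that fix, the command from the question could be rewritten as below. It only swaps the property name. It assumes the Log4j 2 jars are actually on the driver and executor classpath (Spark 1.6.1 bundles Log4j 1.2, so this is not a given), and the shell variable names are made up for illustration. The leading echo makes it a dry run.

```shell
#!/bin/sh
# Path taken from the question; adjust it for your cluster.
LOG4J2_XML="/app/spark/spark-1.6.1-bin-hadoop2.6/conf/log4j2.xml"

# Log4j 2 reads log4j.configurationFile; log4j.configuration is the 1.x property.
DRIVER_OPTS="-Dlog4j.configurationFile=file:${LOG4J2_XML}"

# --files places log4j2.xml in each executor's working directory,
# so the executor side can refer to it by its bare file name.
EXECUTOR_OPTS="-Dlog4j.configurationFile=log4j2.xml"

# Dry run: print the command instead of running it. Remove "echo" to submit.
echo spark-submit --master spark://xyzzz.net:7077 \
  --class abcd.myclass \
  --deploy-mode cluster --executor-memory 2G --total-executor-cores 4 \
  --conf spark.network.timeout=150 \
  --files "${LOG4J2_XML}" \
  --conf "spark.executor.extraJavaOptions=${EXECUTOR_OPTS}" \
  --driver-java-options "${DRIVER_OPTS}" \
  /app/spark/my.jar
```

If the log4j:WARN lines still show up in stderr after this change, the old log4j is still the one doing the parsing, which points at the classpath rather than at the property name.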
