Logging from Spark to HDFS with log4j, without Flume

Time: 2015-10-22 11:32:44

Tags: logging apache-spark log4j hdfs yarn

I have Spark 1.2.0 running on a CDH 5.3 cluster.

Thanks to a custom log4j.properties file bundled in the jar, I managed to get my Spark application to log to the local filesystem. This works fine as long as Spark is launched in yarn-client mode, but it is not feasible in yarn-cluster mode, because there is no way of knowing which machine the driver will run on.
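For reference, the bundled log4j.properties looks roughly like this (appender names, file path and pattern are illustrative placeholders, not my exact configuration):

```
# Root logger writes to the console and to a local rolling file
log4j.rootLogger=INFO, console, localfile

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n

# Local filesystem appender -- this is what breaks down in yarn-cluster mode,
# since the file ends up on whichever node happens to host the driver
log4j.appender.localfile=org.apache.log4j.RollingFileAppender
log4j.appender.localfile.File=/var/log/myapp/spark-app.log
log4j.appender.localfile.MaxFileSize=10MB
log4j.appender.localfile.MaxBackupIndex=5
log4j.appender.localfile.layout=org.apache.log4j.PatternLayout
log4j.appender.localfile.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n
```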

I took a look at YARN log aggregation and at the files generated under hdfs://nameservice1/user/spark/applicationHistory/application_1444387971657_0470/*, which do not match the ones on the plain filesystem at all, but instead contain entries like

{"Event":"SparkListenerTaskEnd","Stage ID":1314,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":3120,"Index":1,"Attempt":0,"Launch Time":1445512311024,"Executor ID":"3","Host":"usqrtpl5328.internal.unicreditgroup.eu","Locality":"RACK_LOCAL","Speculative":false,"Getting Result Time":0,"Finish Time":1445512311685,"Failed":false,"Accumulables":[]},"Task Metrics":{"Host Name":"usqrtpl5328.internal.unicreditgroup.eu","Executor Deserialize Time":5,"Executor Run Time":652,"Result Size":1768,"JVM GC Time":243,"Result Serialization Time":0,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Read Metrics":{"Remote Blocks Fetched":26,"Local Blocks Fetched":10,"Fetch Wait Time":0,"Remote Bytes Read":16224},"Output Metrics":{"Data Write Method":"Hadoop","Bytes Written":82983}}}

So, is there a way to log everything I want to HDFS?

Any suggestion is welcome.

EDIT I had already seen this question when I posted mine. It does not solve my problem, because I need to log to HDFS and it does not take that into account.

I don't even know whether it is possible to log directly to HDFS with log4j at all; if you have any idea how to write the log4j.properties accordingly, please share it.
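The only direction I can think of is a custom log4j appender that writes through the Hadoop FileSystem API. A rough, untested sketch of what I mean (class name, package and the Path property are made up for illustration):

```java
package com.example.logging;

import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;

// Hypothetical log4j 1.x appender that writes formatted events to a file on HDFS.
public class HdfsAppender extends AppenderSkeleton {

    // Target file, e.g. hdfs://nameservice1/user/spark/logs/myapp.log,
    // configured from log4j.properties via log4j.appender.hdfs.Path=...
    private String path;
    private FileSystem fs;
    private OutputStream out;

    public void setPath(String path) { this.path = path; }
    public String getPath() { return path; }

    @Override
    public void activateOptions() {
        try {
            fs = FileSystem.get(URI.create(path), new Configuration());
            // Overwrites any existing file; true appending would need
            // HDFS append support enabled on the cluster.
            out = fs.create(new Path(path), true);
        } catch (IOException e) {
            errorHandler.error("Cannot open HDFS file " + path, e, 0);
        }
        super.activateOptions();
    }

    @Override
    protected void append(LoggingEvent event) {
        if (out == null || layout == null) {
            return;
        }
        try {
            out.write(layout.format(event).getBytes("UTF-8"));
            out.flush();
        } catch (IOException e) {
            errorHandler.error("Cannot write to HDFS file " + path, e, 0);
        }
    }

    @Override
    public void close() {
        try {
            if (out != null) out.close();
        } catch (IOException e) {
            errorHandler.error("Cannot close HDFS file " + path, e, 0);
        }
    }

    @Override
    public boolean requiresLayout() {
        return true;
    }
}
```

If something along these lines is sound, the log4j.properties would presumably point at it with log4j.appender.hdfs=com.example.logging.HdfsAppender and log4j.appender.hdfs.Path=hdfs://nameservice1/user/spark/logs/myapp.log. I have no idea how well a single-writer appender like this would behave with the driver and executors logging concurrently, which is part of why I'm asking.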

0 Answers:

There are no answers yet