MapReduce completes but Pig job fails

Asked: 2017-04-24 07:38:40

Tags: hadoop mapreduce apache-pig

I recently ran into a situation where the MapReduce job appears to succeed in the RM, but the Pig script returns with exit code 8, citing "Throwable thrown (an unexpected exception)".

Adding the script as requested:

REGISTER '$LIB_LOCATION/*.jar'; 

-- set the number of reducers via parameter (200 in our runs)
SET default_parallel $REDUCERS;
SET mapreduce.map.memory.mb 3072;
SET mapreduce.reduce.memory.mb 6144;

SET mapreduce.map.java.opts -Xmx2560m;
SET mapreduce.reduce.java.opts -Xmx5120m;
SET mapreduce.job.queuename dt_pat_merchant;

SET yarn.app.mapreduce.am.command-opts -Xmx5120m;
SET yarn.app.mapreduce.am.resource.mb 6144;

-- load data from the EAP data catalog for the given environment ($ENV = PROD)
data = LOAD 'eap-$ENV://event'
-- using a custom function
USING com.XXXXXX.pig.DataDumpLoadFunc
('{"startDate": "$START_DATE", "endDate" : "$END_DATE", "timeType" : "$TIME_TYPE", "fileStreamType":"$FILESTREAM_TYPE", "attributes": { "all": "true" } }', '$MAPPING_XML_FILE_PATH');

-- filter out null context entity records
filtered = FILTER data BY (attributes#'context_id' IS NOT NULL);

-- group data by session id
session_groups = GROUP filtered BY attributes#'context_id';

-- flatten events
flattened_events = FOREACH session_groups GENERATE FLATTEN(filtered);

-- remove the output directory if exists
RMF $OUTPUT_PATH;

-- store results in specified output location
STORE flattened_events INTO '$OUTPUT_PATH' USING com.XXXX.data.catalog.pig.EventStoreFunc();

I can see "ERROR 2998: Unhandled internal error. GC overhead limit exceeded" in the Pig log (pasted below).

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. GC overhead limit exceeded

java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.mapreduce.FileSystemCounter.values(FileSystemCounter.java:23)
        at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:219)
        at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:199)
        at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:210)
        at org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
        at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:241)
        at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:370)
        at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:391)
        at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskReports(ClientServiceDelegate.java:451)
        at org.apache.hadoop.mapred.YARNRunner.getTaskReports(YARNRunner.java:594)
        at org.apache.hadoop.mapreduce.Job$3.run(Job.java:545)
        at org.apache.hadoop.mapreduce.Job$3.run(Job.java:543)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:543)
        at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235)
        at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addMapReduceStatistics(MRJobStats.java:352)
        at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:233)
        at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:165)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:360)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:282)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1431)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1416)
        at org.apache.pig.PigServer.execute(PigServer.java:1405)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:456)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:439)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:624)

The configuration in the Pig script is as follows:

SET default_parallel 200;
SET mapreduce.map.memory.mb 3072;
SET mapreduce.reduce.memory.mb 6144;

SET mapreduce.map.java.opts -Xmx2560m;
SET mapreduce.reduce.java.opts -Xmx5120m;
SET mapreduce.job.queuename dt_pat_merchant;

SET yarn.app.mapreduce.am.command-opts -Xmx5120m;
SET yarn.app.mapreduce.am.resource.mb 6144;

The job status in the cluster's RM says the job succeeded [I can't post an image since my reputation is too low ;)].

This issue occurs frequently, and we have to restart the job for it to complete successfully.

Please suggest a fix for this.

PS: The cluster this job runs on is one of the biggest clusters in the world, so resources or storage space are not a concern, I'd say.

Thanks.

2 answers:

Answer 0 (score: 0):

From the Oracle docs:

After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection and is recovering less than 2% of the heap, and has so far been doing so for the last 5 (compile-time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. The java.lang.OutOfMemoryError: GC overhead limit exceeded exception can be turned off with the command-line flag -XX:-UseGCOverheadLimit.

As described in the docs, you can either turn this exception off or increase the heap size.
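
For example, a minimal sketch of passing that flag to the Pig client JVM (the script name is hypothetical; PIG_OPTS is picked up by the pig launcher and appended to the JVM options):

# disable the GC overhead limit check for the Pig client JVM
export PIG_OPTS="$PIG_OPTS -XX:-UseGCOverheadLimit"
pig -f merge_events.pig

Note that turning the check off does not free any memory; it only lets the JVM keep collecting until it fails with a plain java.lang.OutOfMemoryError, so increasing the heap is usually the better option.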

Answer 1 (score: 0):

Could you add the Pig script here?

I think you are getting this error because Pig itself (not the mappers and reducers) cannot handle the output. If your script performs a DUMP operation, first try limiting the dataset being displayed. Let's assume your data is in an alias X. Try:

-- keep just one record so the client only has to render a single row
temp = LIMIT X 1;
DUMP temp;

This way you will only see a single record and save some resources. You could also use a STORE operation instead (see the Pig manual for how).
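
For instance, a minimal sketch that stores the limited sample instead of dumping it to the console (the output path here is hypothetical):

-- store a one-record sample for later inspection instead of DUMPing it
temp = LIMIT X 1;
STORE temp INTO '/tmp/pig_debug_sample' USING PigStorage();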

Obviously, you can configure a bigger heap size for Pig itself, but note that Pig's heap is separate from the mapreduce.map and mapreduce.reduce settings. Use the PIG_HEAPSIZE environment variable for this.
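
A minimal sketch, assuming the script is launched through the pig CLI (the script name is hypothetical; the launcher reads PIG_HEAPSIZE as a value in MB and turns it into the client's -Xmx):

# give the Pig client JVM a 4 GB heap instead of the default
export PIG_HEAPSIZE=4096
pig -f merge_events.pig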