I recently ran into a situation where the MapReduce jobs appear to succeed in the RM, but the Pig script returns with exit code 8, which refers to "Throwable thrown (an unexpected exception)".
Adding the script as requested:
REGISTER '$LIB_LOCATION/*.jar';
-- set number of reducers to 200
SET default_parallel $REDUCERS;
SET mapreduce.map.memory.mb 3072;
SET mapreduce.reduce.memory.mb 6144;
SET mapreduce.map.java.opts -Xmx2560m;
SET mapreduce.reduce.java.opts -Xmx5120m;
SET mapreduce.job.queuename dt_pat_merchant;
SET yarn.app.mapreduce.am.command-opts -Xmx5120m;
SET yarn.app.mapreduce.am.resource.mb 6144;
-- load data from the EAP data catalog using the given environment ($ENV = PROD)
data = LOAD 'eap-$ENV://event'
-- using a custom function
USING com.XXXXXX.pig.DataDumpLoadFunc
('{"startDate": "$START_DATE", "endDate" : "$END_DATE", "timeType" : "$TIME_TYPE", "fileStreamType":"$FILESTREAM_TYPE", "attributes": { "all": "true" } }', '$MAPPING_XML_FILE_PATH');
-- filter out null context entity records
filtered = FILTER data BY (attributes#'context_id' IS NOT NULL);
-- group data by session id
session_groups = GROUP filtered BY attributes#'context_id';
-- flatten events
flattened_events = FOREACH session_groups GENERATE FLATTEN(filtered);
-- remove the output directory if exists
RMF $OUTPUT_PATH;
-- store results in specified output location
STORE flattened_events INTO '$OUTPUT_PATH' USING com.XXXX.data.catalog.pig.EventStoreFunc();
I can see "ERROR 2998: Unhandled internal error. GC overhead limit exceeded" in the Pig logs (shown below).
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.mapreduce.FileSystemCounter.values(FileSystemCounter.java:23)
at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:219)
at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:199)
at org.apache.hadoop.mapreduce.counters.FileSystemCounterGroup.findCounter(FileSystemCounterGroup.java:210)
at org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:241)
at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:370)
at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:391)
at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskReports(ClientServiceDelegate.java:451)
at org.apache.hadoop.mapred.YARNRunner.getTaskReports(YARNRunner.java:594)
at org.apache.hadoop.mapreduce.Job$3.run(Job.java:545)
at org.apache.hadoop.mapreduce.Job$3.run(Job.java:543)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:543)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235)
at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addMapReduceStatistics(MRJobStats.java:352)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:233)
at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:165)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:360)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:282)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1431)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1416)
at org.apache.pig.PigServer.execute(PigServer.java:1405)
at org.apache.pig.PigServer.executeBatch(PigServer.java:456)
at org.apache.pig.PigServer.executeBatch(PigServer.java:439)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:624)
The configuration in the Pig script is as follows:
SET default_parallel 200;
SET mapreduce.map.memory.mb 3072;
SET mapreduce.reduce.memory.mb 6144;
SET mapreduce.map.java.opts -Xmx2560m;
SET mapreduce.reduce.java.opts -Xmx5120m;
SET mapreduce.job.queuename dt_pat_merchant;
SET yarn.app.mapreduce.am.command-opts -Xmx5120m;
SET yarn.app.mapreduce.am.resource.mb 6144;
The status of the job in the cluster's RM says the job succeeded. [I can't post an image since my reputation is too low ;)]
This issue happens frequently, and we have to rerun the job for it to complete successfully.
Please suggest a fix for this.
PS: The cluster the job is running on is one of the biggest in the world, so resources or storage space are not a concern, I'd say.
Thanks
Answer 0 (score: 0)
From the Oracle docs:

After a garbage collection, if the Java process is spending more than approximately 98% of its time doing garbage collection and is recovering less than 2% of the heap, and has been doing so for the last 5 (compile-time constant) consecutive garbage collections, then a java.lang.OutOfMemoryError is thrown. The "GC overhead limit exceeded" java.lang.OutOfMemoryError can be turned off with the command-line flag -XX:-UseGCOverheadLimit.

As the docs say, you can either turn this exception off or increase the heap size.
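A minimal sketch of how that flag could be passed to the Pig client JVM via the PIG_OPTS environment variable (the script name below is a placeholder; note that disabling the check only postpones the OutOfMemoryError rather than fixing the underlying memory pressure):

```shell
# PIG_OPTS is appended to the JVM options by the pig launcher script.
# -XX:-UseGCOverheadLimit suppresses the early "GC overhead limit exceeded"
# OOM; the JVM can still run out of memory later if the heap is too small.
export PIG_OPTS="-XX:-UseGCOverheadLimit"
pig -f your_script.pig   # placeholder script name
```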
Answer 1 (score: 0)
Could you add the Pig script here?
I think you are getting this error because Pig itself (not the mappers and reducers) cannot handle the output.
If you perform a DUMP operation in your script, then first try limiting the dataset being displayed. Let's say your data has the alias X. Try:
temp = LIMIT X 1;
DUMP temp;
This way you will see only one record and save some resources. You can also perform a STORE operation (see how to do it in the Pig manual).
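A sketch of the STORE variant of the same idea, assuming the alias X from above and a hypothetical debug output path:

```pig
-- keep one record and write it to HDFS instead of dumping to the console
temp = LIMIT X 1;
STORE temp INTO '/tmp/pig_debug_sample' USING PigStorage();  -- hypothetical path
```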
Obviously, you can configure a bigger heap size for Pig, but Pig's heap is not mapreduce.map or mapreduce.reduce memory. Use the PIG_HEAPSIZE environment variable to do this.
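A sketch of setting that variable before launching the script (the 4096 value and the script name are assumptions; tune the heap to your workload):

```shell
# PIG_HEAPSIZE is read by the pig launcher script and is given in MB.
# It sizes the Pig client JVM, which is where this OOM occurs.
export PIG_HEAPSIZE=4096   # assumed value
pig -f your_script.pig     # placeholder script name
```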