I'm using HDP 2.1 for the cluster. MapReduce jobs are failing with the exception below. We periodically create tables from data ingested with Flume 1.4. I checked the data file the mapper was trying to read, but I could not find anything wrong with it.
2014-11-28 00:08:28,696 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate
configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2014-11-28 00:08:28,947 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-11-28 00:08:28,947 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-11-28 00:08:28,995 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-11-28 00:08:29,009 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1417095534232_0051, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@ea23517)
2014-11-28 00:08:29,184 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-11-28 00:08:29,735 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoop1/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop2/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop3/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop4/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop5/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop6/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051
2014-11-28 00:08:31,067 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2014-11-28 00:08:32,806 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2014-11-28 00:08:33,837 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: com.hadoop.mapred.DeprecatedLzoTextInputFormat:hdfs://cluster/apps/hive/external/mapp_log/dt=2014-11-27/mapp_parse_log.1417014001075:402653184+67311787
2014-11-28 00:08:34,196 INFO [main] org.apache.hadoop.hive.ql.log.PerfLogger: <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2014-11-28 00:08:34,196 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: Deserializing MapWork via kryo
2014-11-28 00:08:35,222 INFO [main] org.apache.hadoop.hive.ql.log.PerfLogger: </PERFLOG method=deserializePlan start=1417100914196 end=1417100915222 duration=1026 from=org.apache.hadoop.hive.ql.exec.Utilities>
2014-11-28 00:08:35,254 INFO [main] com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2014-11-28 00:08:35,260 INFO [main] com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev dbd51f0fb61f5347228a7a23fe0765ac1242fcdf]
2014-11-28 00:08:35,498 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1879195946-xx.xx.xx.32-1409281631059:blk_1075462091_1722425; getBlockSize()=202923; corrupt=false; offset=469762048; locs=[xx.xx.xx.36:50010, xx.xx.xx.37:50010]}
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:573)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1879195946-xx.xx.xx.32-1409281631059:blk_1075462091_1722425; getBlockSize()=202923; corrupt=false; offset=469762048; locs=[xx.xx.xx.36:50010, xx.xx.xx.37:50010]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:350)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:294)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:161)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:239)
... 9 more
2014-11-28 00:08:35,503 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
If you have any ideas or a solution for this problem, please share them with me.
Thanks, KWANGWOO
Answer 0 (score: 2)
A detailed description of the problem and its cause is available here:
https://community.hortonworks.com/answers/37414/view.html
For us, running the command hdfs debug recoverLease -path <path-of-the-file> -retries 3 solved the problem.
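As a minimal sketch of how this can be applied to several affected files at once (the file list is a placeholder taken from the split path in the logs above, and note that the hdfs debug subcommand only exists in newer Hadoop client builds, so it may need to be run from an upgraded client):

#!/usr/bin/env bash
# Sketch: force lease recovery on files whose last block length cannot be determined.
# The list below is a placeholder; substitute the paths named in your task logs.
FILES="
/apps/hive/external/mapp_log/dt=2014-11-27/mapp_parse_log.1417014001075
"

for f in $FILES; do
  # Ask the NameNode to recover the writer's lease, retrying up to 3 times.
  hdfs debug recoverLease -path "$f" -retries 3
done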
Answer 1 (score: 0)
It is hard to tell up front which files in an HDFS directory were left unclosed. You may need to run an hdfs cat test against them, or periodically check for missing file blocks (every hour, or after each cluster restart).
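A minimal sketch of such a cat test (the directory is a placeholder based on the logs above; a file whose last block length cannot be resolved should fail the read, as in the stack trace):

#!/usr/bin/env bash
# Sketch: probe every file under a directory by reading it to /dev/null.
# A file with an unrecoverable last block will make the cat fail.
DIR=/apps/hive/external/mapp_log   # placeholder: your Flume sink directory

# Keep only regular files (permission field starts with '-') from the listing.
hdfs dfs -ls -R "$DIR" | awk '$1 ~ /^-/ {print $NF}' |
while read -r f; do
  if ! hdfs dfs -cat "$f" > /dev/null 2>&1; then
    echo "unreadable: $f"   # candidate for lease recovery or deletion
  fi
done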
Answer 2 (score: 0)
I had the same problem as you: some files had been opened by Flume but never closed (I am not sure why). You need to find their names with the command:
hdfs fsck /directory/of/locked/files/ -files -openforwrite
Then delete them.
Or you can try to recover the files with the command Joe23 suggested, hdfs debug recoverLease -path <path-of-the-file> -retries 3.
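A sketch that chains the two steps together (it assumes fsck prints OPENFORWRITE on the lines describing still-open files, with the path as the first field; inspect the fsck output by hand before deleting anything):

#!/usr/bin/env bash
# Sketch: find files still open for write, then try to recover their leases.
DIR=/apps/hive/external/mapp_log   # placeholder: the Flume sink directory

# fsck flags files with an un-finalized last block as OPENFORWRITE.
hdfs fsck "$DIR" -files -openforwrite | grep OPENFORWRITE | awk '{print $1}' |
while read -r f; do
  echo "recovering lease on $f"
  hdfs debug recoverLease -path "$f" -retries 3
  # If recovery keeps failing, fall back to removing the file:
  # hdfs dfs -rm "$f"
done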