Accumulo tablet server throws errors while scanning data

Time: 2015-11-12 22:41:47

Tags: hadoop accumulo

I have Accumulo with one master and two tablet servers holding a bunch of tables that store millions of records. The problem is that whenever I scan a table to fetch a few records, the tablet server log keeps throwing this error:

2015-11-12 04:38:56,107 [hdfs.DFSClient] WARN : Failed to connect to /192.168.250.12:50010 for block, add to deadNodes and continue. java.io.IOException: Got error, status message opReadBlock BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 received exception java.io.IOException:  Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167
java.io.IOException: Got error, status message opReadBlock BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 received exception java.io.IOException:  Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:456)
        at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:424)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:818)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:697)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
        at java.io.DataInputStream.readShort(DataInputStream.java:312)
        at org.apache.accumulo.core.file.rfile.bcfile.Utils$Version.<init>(Utils.java:264)
        at org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader.<init>(BCFile.java:823)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.init(CachableBlockFile.java:246)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:257)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:137)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:209)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:368)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:137)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:843)
        at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
        at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(DispatchingFileFactory.java:69)
        at org.apache.accumulo.tserver.tablet.Compactor.openMapDataFiles(Compactor.java:279)
        at org.apache.accumulo.tserver.tablet.Compactor.compactLocalityGroup(Compactor.java:322)
        at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:214)
        at org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1976)
        at org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:2093)
        at org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
        at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:745)

I think this is more of an HDFS problem than an Accumulo one, so I checked the datanode logs and found the same message,

Offset 16320 and length 20 don't match block BP-1881591466-192.168.1.111-1438767154643:blk_1073773956_33167 ( blockLen 0 ), for OP_READ_BLOCK, self=/192.168.250.202:55915, remote=/192.168.250.12:50010, for file /accumulo/tables/1/default_tablet/F0000gne.rf, for pool BP-1881591466-192.168.1.111-1438767154643 block 1073773956_33167

but logged as INFO. What I don't understand is why I am getting this error at all.

I can see that the block pool name (BP-1881591466-192.168.1.111-1438767154643) of the file I am trying to access contains an IP address (192.168.1.111) that does not match either of the servers involved (self and remote). In fact, 192.168.1.111 is the old IP address of the Hadoop master node, but I changed it some time ago. I use domain names rather than IP addresses, so the only place I made the change was the hosts file of the machines in the cluster; none of the Hadoop/Accumulo configuration files use IP addresses. Does anyone know what the problem is here? I have spent days on this and still cannot figure it out.

1 Answer:

Answer 0 (score: 0):

The error you are getting indicates that Accumulo cannot read part of one of its files from HDFS. The NameNode is reporting that the block is located on a particular DataNode (192.168.250.12 in your case), but when Accumulo tries to read from that DataNode, the read fails.
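To confirm that the failure is at the HDFS layer rather than inside Accumulo, you could try reading the same file directly through the HDFS client. Below is a minimal sketch (not part of the original answer); it assumes the file path from your tablet server log and that your cluster's core-site.xml/hdfs-site.xml are on the classpath:

    import java.io.InputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadCheck {
        public static void main(String[] args) throws Exception {
            // Picks up fs.defaultFS from core-site.xml / hdfs-site.xml on the classpath
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Path taken from the tablet server log above
            Path rfile = new Path("/accumulo/tables/1/default_tablet/F0000gne.rf");

            long total = 0;
            byte[] buf = new byte[64 * 1024];
            try (InputStream in = fs.open(rfile)) {
                int n;
                while ((n = in.read(buf)) > 0) {
                    total += n;
                }
            }
            System.out.println("Read " + total + " bytes without error");
        }
    }

If this throws the same "Offset ... don't match block" exception, Accumulo is only the messenger and the problem lies in HDFS itself.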

This could indicate either a corrupt block in HDFS or a transient network problem. You could try running hadoop fsck / (the exact invocation may vary between versions) to perform a health check on HDFS.
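You can also ask the NameNode where it believes the blocks of that file are located, which is roughly the information fsck prints. A hedged sketch using the standard FileSystem API (again, the path is taken from your log; this is illustration only, not part of the original answer):

    import java.util.Arrays;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocationCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Path rfile = new Path("/accumulo/tables/1/default_tablet/F0000gne.rf");
            FileStatus status = fs.getFileStatus(rfile);

            // List every block of the file and the DataNodes the NameNode thinks hold it
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("offset=" + block.getOffset()
                        + " length=" + block.getLength()
                        + " hosts=" + Arrays.toString(block.getHosts()));
            }
        }
    }

If the only replica of the failing block lives on the DataNode that keeps erroring (192.168.250.12), that replica is suspect; if fsck reports the block as corrupt while healthy replicas exist elsewhere, HDFS should be able to re-replicate from them.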

Also, the IP address mismatch you see seems to suggest that the DataNode is confused about which HDFS block pool it belongs to. You should restart that DataNode after double-checking its configuration, DNS, and /etc/hosts for anything unusual.