I am basically trying to append data to an already existing file in HDFS. This is the exception I get:
03:49:54,456 WARN org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run:628 DataStreamer Exception
java.lang.NullPointerException
at com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336)
at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$UpdatePipelineRequestProto$Builder.addAllStorageIDs(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:842)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1238)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:532)
My replication factor is 1 and I am using the Apache Hadoop 2.5.0 distribution. This is the code snippet I use to create the file if it does not exist, or to open it in append mode if it does:
String url = getHadoopUrl() + fileName;
Path file = new Path(url);
try {
    if (append) {
        // Append if the file already exists, otherwise fall back to creating it
        if (hadoopFileSystem.exists(file))
            fsDataOutputStream = hadoopFileSystem.append(file);
        else
            fsDataOutputStream = hadoopFileSystem.create(file);
    } else {
        fsDataOutputStream = hadoopFileSystem.create(file);
    }
} catch (IOException e) {
    e.printStackTrace();
}
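For completeness, once the stream is open I write to it and close it roughly like this (a minimal sketch; the actual payload and error handling are simplified here):

// "record" is just a placeholder for whatever bytes are being appended
// (requires java.nio.charset.StandardCharsets)
byte[] record = "some line of data\n".getBytes(StandardCharsets.UTF_8);
fsDataOutputStream.write(record);
fsDataOutputStream.hflush();   // make the appended bytes visible to readers
fsDataOutputStream.close();    // always close the stream when done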
I am not clear what is causing this exception. Having read various sources, I am now confused about whether HDFS supports append at all. Let me know what I am missing here.
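For what it's worth, this is how I am checking that append is not disabled on the client side; a minimal sketch, assuming the dfs.support.append flag (which as far as I know defaults to true on 2.x) is the relevant switch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
// dfs.support.append defaults to true on Hadoop 2.x; if it resolved to false,
// that would at least explain an append call refusing to work at all
boolean appendSupported = conf.getBoolean("dfs.support.append", true);
FileSystem fs = FileSystem.get(conf);
System.out.println("append enabled: " + appendSupported + " on " + fs.getUri());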
EDIT: Adding the stack trace I found in the datanode's log:
2015-10-30 16:19:54,435 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1012136337-192.168.123.103-1411103100884:blk_1073742239_1421 src: /127.0.0.1:54160 dest: /127.0.0.1:50010
2015-10-30 16:19:54,435 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Appending to FinalizedReplica, blk_1073742239_1421, FINALIZED
getNumBytes() = 812
getBytesOnDisk() = 812
getVisibleLength()= 812
getVolume() = /Users/niranjan/hadoop/hdfs/datanode/current
getBlockFile() = /Users/niranjan/hadoop/hdfs/datanode/current/BP-1012136337-192.168.123.103-1411103100884/current/finalized/blk_1073742239
unlinked = false
2015-10-30 16:19:54,461 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1012136337-192.168.123.103-1411103100884:blk_1073742239_1422
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:435)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:693)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:569)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
at java.lang.Thread.run(Thread.java:745)
Answer 0 (score: 0)
From searching around, I found that adding the following to your hdfs-site.xml may help:
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
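This property (the successor to the older dfs.datanode.max.xcievers setting) caps the number of concurrent threads each datanode may use for transferring data in and out; the datanodes need to be restarted for the new value to take effect.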
Answer 1 (score: 0)
This was an issue in Hadoop from version 2.2.0 through 2.5.1; upgrading to a later release resolved it without any changes to the configuration files.