ChecksumException when reading from a manually generated (hard-coded) Hadoop sequence file?

Time: 2017-03-17 06:33:28

Tags: java hadoop hadoop2 checksum sequencefile

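I wrote code that saves files into a Hadoop sequence file. The key is the file name and the value is the file's contents as a byte array. The output is the sequence file plus a .crc file.

Afterwards I tried to read from the sequence file, but I got a checksum exception: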

Exception in thread "main" org.apache.hadoop.fs.ChecksumException: Checksum error: file:/home/mosab/Desktop/output/ProcessWS/sequence.seq at 18873344
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:259)
    at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:276)
    at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:228)
    at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2436)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2335)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2381)
    at sequence.extractor.Extractor.main(Extractor.java:36)

I tried deleting the .crc file and reading the sequence file again, but then I got an EOFException.

Is there a solution?

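1 answer:

Answer 0 (score: 0)

Solution: the ChecksumException occurred because I forgot to close the sequence file writer after writing/appending. As a result, the sequence file and its .crc file no longer matched.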
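
Why a missing close() leaves a damaged file can be shown without Hadoop. The sketch below is a plain-JDK analogy, not Hadoop code: it assumes that the sequence file writer, like java.io.BufferedOutputStream, holds recent writes in memory until it is flushed or closed. The class name UnclosedWriterDemo and the 100-byte payload are made up for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class UnclosedWriterDemo {

    /**
     * Writes 100 bytes through a buffered stream and returns the on-disk
     * file size measured before close() and after close().
     */
    static long[] sizesBeforeAndAfterClose() {
        try {
            File file = File.createTempFile("unclosed-writer-demo", ".bin");
            file.deleteOnExit();

            byte[] payload = new byte[100]; // hypothetical record, smaller than the 8192-byte default buffer
            long before;

            // try-with-resources guarantees close() runs, even if write() throws.
            try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file))) {
                out.write(payload);
                // Still inside the block: the bytes sit in the in-memory buffer,
                // so this is what a reader would see if close() were never called.
                before = file.length();
            }

            // close() flushed the buffer, so the file is now complete.
            long after = file.length();
            return new long[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterClose();
        System.out.println("bytes on disk before close: " + sizes[0]); // 0
        System.out.println("bytes on disk after close:  " + sizes[1]); // 100
    }
}
```

In the Hadoop code itself, the equivalent fix is to call close() on the SequenceFile.Writer once all records have been appended. SequenceFile.Writer implements java.io.Closeable, so it can also be opened in a try-with-resources statement. A file that was never closed is probably truncated, which would explain the EOFException seen after deleting the .crc file, so it most likely has to be regenerated rather than repaired.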