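I wrote some code that saves files into a Hadoop sequence file. The key is the file name and the value is the file's byte array. The output is the sequence file plus a .crc file.
Afterwards I tried to read from the sequence file, but I got a checksum exception: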
Exception in thread "main" org.apache.hadoop.fs.ChecksumException: Checksum error: file:/home/mosab/Desktop/output/ProcessWS/sequence.seq at 18873344
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:259)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:276)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:228)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:196)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:70)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:120)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2436)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2335)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2381)
at sequence.extractor.Extractor.main(Extractor.java:36)
I tried deleting the .crc file and reading the sequence file again, but then I got an EOFException.
Is there any solution?
Answer 0 (score: 0)
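Solution: the ChecksumException occurred because I forgot to close the sequence file writer after writing/appending. That leaves the sequence file out of sync with its .crc file.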
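The underlying mechanism can be reproduced without Hadoop. The sketch below is only an analogy using plain java.io, not the original Extractor code: the class name UnclosedWriterDemo, the record layout (name, length, bytes) and the helper roundTrip are all made up for illustration. It writes a few records through a buffered stream, once without closing the writer and once with it closed, and then tries to read them back. When the writer is left open, the buffered bytes never reach the disk, which is probably also why deleting the .crc file only turned the checksum error into an EOFException: the data file itself was incomplete.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; it stands in for a SequenceFile writer/reader pair.
class UnclosedWriterDemo {
    static final int RECORDS = 3;

    // Writes RECORDS (name, length, bytes) records through a buffered stream.
    // When closeWriter is false the stream is deliberately left open,
    // so the buffered bytes are never flushed to the file.
    static void writeRecords(Path file, boolean closeWriter) throws IOException {
        DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)));
        for (int i = 0; i < RECORDS; i++) {
            byte[] payload = new byte[100];
            out.writeUTF("file-" + i);      // key: the file name
            out.writeInt(payload.length);   // length prefix for the value
            out.write(payload);             // value: the file's bytes
        }
        if (closeWriter) {
            out.close();                    // flushes the buffer, then closes
        }
    }

    // Reads RECORDS records back; throws EOFException if the file is truncated.
    static int readRecords(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int count = 0;
            for (int i = 0; i < RECORDS; i++) {
                in.readUTF();
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                count++;
            }
            return count;
        }
    }

    // Returns true if everything that was written could be read back.
    static boolean roundTrip(boolean closeWriter) {
        try {
            Path file = Files.createTempFile("records", ".bin");
            file.toFile().deleteOnExit();
            writeRecords(file, closeWriter);
            return readRecords(file) == RECORDS;
        } catch (EOFException e) {
            return false;                   // data missing: writer was not closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writer left open: " + roundTrip(false));
        System.out.println("writer closed:    " + roundTrip(true));
    }
}
```

In the Hadoop code the equivalent fix is to call close() on the SequenceFile.Writer once the last record has been appended, ideally in a finally block so that it also runs when an append fails.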