I have about 1 GB of data in HDFS running on an EC2 instance. I tried to run a Sqoop script on that instance, and it failed, apparently due to insufficient memory: the error message said memory was low because another process was running in the background. A friend of mine had started that process; I was not aware of it and ran the Sqoop script anyway. Soon afterwards I found that my HDFS data was corrupted. I have only one node. The data is very important to me and I cannot find a way to recover it. I ran hdfs fsck
and it shows:
Total size: 1674241949 B
Total dirs: 128
Total files: 879
Total symlinks: 0
Total blocks (validated): 765 (avg. block size 2188551 B)
********************************
UNDER MIN REPL'D BLOCKS: 562 (73.46405 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 555
MISSING BLOCKS: 562
MISSING SIZE: 1606350873 B
CORRUPT BLOCKS: 562
********************************
Minimally replicated blocks: 203 (26.535948 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 203 (26.535948 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 0.2653595
Corrupt blocks: 562
Missing replicas: 665 (26.037588 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Fri Jul 06 11:57:29 UTC 2018 in 164 milliseconds
But if I run hadoop fsck -list-corruptfileblocks
, I get: The filesystem under path '/' has 0 CORRUPT files
Also, when I try to read the data, it says the files are corrupt and I cannot access them. I also tried ./bin/hadoop namenode -recover
, but to no avail. Is there any other way to get my data back as soon as possible?
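For reference, this is a sketch of the commands I have run so far on the single-node cluster, plus a more verbose fsck invocation that maps damaged blocks to file paths (the `./bin/` prefix reflects my installation layout; adjust for yours). These commands assume a running Hadoop cluster, so they cannot be executed outside one:

```shell
# Diagnosis and recovery attempts so far (single-node cluster):
hdfs fsck /                            # reported 562 missing/corrupt blocks
hadoop fsck -list-corruptfileblocks    # yet reports "0 CORRUPT files"
./bin/hadoop namenode -recover         # NameNode metadata recovery; did not help

# Verbose listing that shows which file paths the missing blocks belong to:
hdfs fsck / -files -blocks -locations
```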