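Since installing the SAP HANA Spark Connector, I have been having major problems with my cloud-based Hadoop cluster (HDP 2.3). Corrupt blocks keep the NameNode permanently in safe mode.
hdfs fsck gives me the following output: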
FSCK started by hdfs (auth:SIMPLE) from /10.97.20.236 for path / at Wed Nov 18 13:49:31 UTC 2015
.
/amshbase/data/default/METRIC_AGGREGATE/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741852

/amshbase/data/default/METRIC_AGGREGATE/.tabledesc/.tableinfo.0000000001: MISSING 1 blocks of total size 911 B....................
/amshbase/data/default/METRIC_AGGREGATE_DAILY/6e87af3b3351ba6f55092465a59053b8/.regioninfo: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741857

/amshbase/data/default/METRIC_AGGREGATE_DAILY/6e87af3b3351ba6f55092465a59053b8/.regioninfo: MISSING 1 blocks of total size 57 B.........................................
/amshbase/data/default/SYSTEM.CATALOG/167feb5a405a77b26fcaea5d560c84b1/.regioninfo: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741837

/amshbase/data/default/SYSTEM.CATALOG/167feb5a405a77b26fcaea5d560c84b1/.regioninfo: MISSING 1 blocks of total size 49 B..
/amshbase/data/default/SYSTEM.CATALOG/167feb5a405a77b26fcaea5d560c84b1/0/b6a59d053baa46b1875e6506d01ebd12: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741922

/amshbase/data/default/SYSTEM.CATALOG/167feb5a405a77b26fcaea5d560c84b1/0/b6a59d053baa46b1875e6506d01ebd12: MISSING 1 blocks of total size 40519 B........
/amshbase/data/default/SYSTEM.STATS/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741842

/amshbase/data/default/SYSTEM.STATS/.tabledesc/.tableinfo.0000000001: MISSING 1 blocks of total size 838 B......................
/app-logs/ambari-qa/logs/application_1446206072803_0002/node-a09295f36.Domain_45454: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741887

/app-logs/ambari-qa/logs/application_1446206072803_0002/node-a09295f36.Domain_45454: MISSING 1 blocks of total size 11733 B......
/app-logs/ambari-qa/logs/application_1446206072803_0005/node-60c160a97.Domain_45454: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741912

/app-logs/ambari-qa/logs/application_1446206072803_0005/node-60c160a97.Domain_45454: MISSING 1 blocks of total size 6691 B.......
.............
/hdp/apps/2.3.2.0-2950/tez/tez.tar.gz: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741827

/hdp/apps/2.3.2.0-2950/tez/tez.tar.gz: MISSING 1 blocks of total size 56926645 B..................................
/user/ambari-qa/DistributedShell/application_1446206072803_0004/AppMaster.jar: CORRUPT blockpool BP-1656641573-10.97.31.53-1446206026510 block blk_1073741897

/user/ambari-qa/DistributedShell/application_1446206072803_0004/AppMaster.jar: MISSING 1 blocks of total size 46057 B...........Status: CORRUPT
Total size: 2217611677 B (Total open files size: 166 B)
Total dirs: 188
Total files: 156
Total symlinks: 0 (Files currently being written: 4)
Total blocks (validated): 133 (avg. block size 16673772 B) (Total open file blocks (not validated): 4)
********************************
UNDER MIN REPL'D BLOCKS: 9 (6.766917 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 9
MISSING BLOCKS: 9
MISSING SIZE: 57033500 B
CORRUPT BLOCKS: 9
********************************
Minimally replicated blocks: 124 (93.233086 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.7969925
Corrupt blocks: 9
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Wed Nov 18 13:49:31 UTC 2015 in 29 milliseconds

The filesystem under path '/' is CORRUPT
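The thing is, there is no actual "data" on the cluster yet. Some of the affected files appear to be log files, but I am not sure whether deleting them would also remove required system files (e.g. AppMaster.jar). How can I recover at least the important files without setting up the whole system again?
Thanks for your help, Sascha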
Answer 0: (score: 0)
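The Chef script used to set up the cluster nodes in the cloud environment configured the VM's own storage as the DataNode's primary storage volume. As a result, HDFS ran out of storage space, but only on that one of the three attached volumes. The problem had nothing to do with the HANA Spark Connector in particular; any other file would have caused the same problem sooner or later.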
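For anyone who wants to check for the same situation, here is a rough Python sketch that reports free space for each directory listed in dfs.datanode.data.dir. It is not part of the original setup: the function names, the 10% threshold, and the second example path are my own assumptions, and the real property value has to be taken from your hdfs-site.xml.

```python
import re
import shutil
from urllib.parse import urlparse


def parse_data_dirs(value):
    """Split a dfs.datanode.data.dir value into local filesystem paths."""
    paths = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Drop an optional storage-type prefix such as [DISK] or [SSD].
        entry = re.sub(r"^\[[A-Za-z_]+\]", "", entry)
        # Entries may be plain paths or file: URIs.
        if entry.startswith("file:"):
            entry = urlparse(entry).path
        paths.append(entry)
    return paths


def volume_report(paths, min_free_fraction=0.10):
    """Return (path, free_fraction, is_low) for each data directory.

    free_fraction is None when the directory cannot be inspected;
    such entries are flagged as low so they get looked at.
    The 10% default threshold is an arbitrary assumption.
    """
    report = []
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            report.append((path, None, True))
            continue
        free = usage.free / usage.total if usage.total else 0.0
        report.append((path, free, free < min_free_fraction))
    return report


if __name__ == "__main__":
    # Hypothetical value; copy the real one from hdfs-site.xml.
    value = "/hadoop/hdfs/data,[DISK]file:///mnt/vol1/hdfs/data"
    for path, free, low in volume_report(parse_data_dirs(value)):
        shown = "unreadable" if free is None else f"{free:.1%} free"
        print(f"{path}: {shown}{'  <-- check this volume' if low else ''}")
```

Run it on each DataNode host. A directory that sits on the VM's own disk will typically show far less free space than the dedicated data volumes.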