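I am running HDFS, version hadoop-2.6.0-cdh5.15.0, with 3 datanodes.
We have often seen that the DFS Remaining value reported by the hdfs dfsadmin -report tool is far lower than the actual free disk space on the partition that stores the HDFS data.
I did some more research and suspect that this may be caused by the following known HDFS issues, or possibly others.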
https://issues.apache.org/jira/browse/HDFS-8072
https://issues.apache.org/jira/browse/HDFS-9038
https://issues.apache.org/jira/browse/HDFS-9530
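The problem is not only the incorrect reporting: once DFS Remaining gets too low, the datanode becomes unavailable for new block writes even though plenty of disk space is still free.
The only way I have found to prevent this is to restart the datanode periodically so that DFS Remaining is reset to its correct value.
However, at times I have seen DFS Remaining drop very low within about 4 hours, which means I would need to restart the datanode every 4 hours.
Is this the only possible workaround, or are there settings I can tune to prevent or reduce the impact of this problem?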
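One way to avoid restarting on a fixed schedule would be to watch the per-node DFS Remaining% in the report and react only when it drops. The following is a minimal sketch, not a tested operational tool: the function names and the 50% threshold are hypothetical, and the sample text is an excerpt of the report shown further down.

```python
# Excerpt of `hdfs dfsadmin -report` output, copied from the report in this post.
SAMPLE_REPORT = """\
DFS Remaining: 3122322567284 (2.84 TB)
Live datanodes (3):
Name: x.x.x.x:50010 (platform2)
Hostname: platform2
DFS Remaining: 1237362796874 (1.13 TB)
DFS Remaining%: 71.58%
Name: x.x.x.x:50010 (platform3)
Hostname: platform3
DFS Remaining: 1217030585122 (1.11 TB)
DFS Remaining%: 70.41%
Name: x.x.x.x:50010 (platform1)
Hostname: platform1
DFS Remaining: 667929185288 (622.06 GB)
DFS Remaining%: 39.22%
"""


def parse_remaining_pct(report_text):
    """Map each datanode hostname to its reported 'DFS Remaining%' value."""
    result = {}
    host = None  # stays None until the first per-node 'Hostname:' line
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Hostname:"):
            host = line.split(":", 1)[1].strip()
        elif line.startswith("DFS Remaining%:") and host is not None:
            value = line.split(":", 1)[1].strip().rstrip("%")
            result[host] = float(value)
    return result


def nodes_below(report_text, threshold_pct):
    """Return hostnames whose DFS Remaining% is under threshold_pct (hypothetical cutoff)."""
    pcts = parse_remaining_pct(report_text)
    return sorted(h for h, pct in pcts.items() if pct < threshold_pct)


print(nodes_below(SAMPLE_REPORT, 50.0))
```

On a live cluster the text would come from running sudo -u hdfs hdfs dfsadmin -report instead of the embedded sample. This only detects the symptom; it does not address the underlying accounting problem.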
Actual disk usage:
--platform1--
Filesystem           Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg-var   1.6T  802G  715G   53%   /var
--platform2--
Filesystem           Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg-var   1.6T  295G  1.3T   20%   /var
--platform3--
Filesystem           Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg-var   1.6T  317G  1.2T   21%   /var
Disk remaining as reported by dfsadmin -report:
$ sudo -u hdfs hdfs dfsadmin -report
Configured Capacity: 5160275083264 (4.69 TB)
Present Capacity: 3430912695157 (3.12 TB)
DFS Remaining: 3122322567284 (2.84 TB)
DFS Used: 308590127873 (287.40 GB)
DFS Used%: 8.99%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (3):
Name: x.x.x.x:50010 (platform2)
Hostname: platform2
Decommission Status : Normal
Configured Capacity: 1728548233216 (1.57 TB)
DFS Used: 106949161335 (99.60 GB)
Non DFS Used: 208876412553 (194.53 GB)
DFS Remaining: 1237362796874 (1.13 TB)
DFS Used%: 6.19%
DFS Remaining%: 71.58%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1488
Last contact: Sun Dec 02 07:42:34 UTC 2018

Name: x.x.x.x:50010 (platform3)
Hostname: platform3
Decommission Status : Normal
Configured Capacity: 1728548233216 (1.57 TB)
DFS Used: 104201662374 (97.05 GB)
Non DFS Used: 237217333338 (220.93 GB)
DFS Remaining: 1217030585122 (1.11 TB)
DFS Used%: 6.03%
DFS Remaining%: 70.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1395
Last contact: Sun Dec 02 07:42:34 UTC 2018

Name: x.x.x.x:50010 (platform1)
Hostname: platform1
Decommission Status : Normal
Configured Capacity: 1703178616832 (1.55 TB)
DFS Used: 97439304164 (90.75 GB)
Non DFS Used: 763141211676 (710.73 GB)
DFS Remaining: 667929185288 (622.06 GB)
DFS Used%: 5.72%
DFS Remaining%: 39.22%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1478
Last contact: Sun Dec 02 07:42:34 UTC 2018
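
To make the discrepancy concrete, the per-node figures in the report can be checked against each other. The sketch below uses only the byte counts from the report above and computes how much of each node's Configured Capacity is not covered by DFS Used, Non DFS Used, or DFS Remaining. The helper names are hypothetical, and reading the leftover as space reserved for in-flight writes (as the linked JIRA issues describe) is an assumption, not something the report itself states.

```python
# Per-node byte counts copied from the `hdfs dfsadmin -report` output above.
NODES = {
    "platform1": {
        "configured": 1703178616832,
        "dfs_used": 97439304164,
        "non_dfs_used": 763141211676,
        "remaining": 667929185288,
    },
    "platform2": {
        "configured": 1728548233216,
        "dfs_used": 106949161335,
        "non_dfs_used": 208876412553,
        "remaining": 1237362796874,
    },
    "platform3": {
        "configured": 1728548233216,
        "dfs_used": 104201662374,
        "non_dfs_used": 237217333338,
        "remaining": 1217030585122,
    },
}

GIB = 1024 ** 3


def unaccounted_bytes(node):
    """Configured capacity minus DFS Used, Non DFS Used, and DFS Remaining."""
    return (
        node["configured"]
        - node["dfs_used"]
        - node["non_dfs_used"]
        - node["remaining"]
    )


def summarize(nodes):
    """Return {hostname: unaccounted bytes} for every node."""
    return {name: unaccounted_bytes(node) for name, node in nodes.items()}


for name, gap in sorted(summarize(NODES).items()):
    print(f"{name}: {gap} bytes ({gap / GIB:.2f} GiB) unaccounted")
```

With these numbers every node shows a leftover on the order of 160 GiB, which is capacity the datanode neither counts as used nor offers as remaining.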