LeoFS returns a nodedown error, but the node appears to be running

Time: 2017-10-09 12:01:35

Tags: s3cmd

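I am running LeoFS 1.2.22 on a single node. Everything was working fine, but when I started today I could not list the contents of any bucket. I get an error saying that the node is down.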

leofs-adm status shows

$ leofs-adm status
[System Confiuration]
-----------------------------------+----------
 Item                              | Value    
-----------------------------------+----------
 Basic/Consistency level
-----------------------------------+----------
                    system version | 1.2.22
                        cluster Id | leofs_1
                             DC Id | dc_1
                    Total replicas | 1
          number of successes of R | 1
          number of successes of W | 1
          number of successes of D | 1
 number of rack-awareness replicas | 0
                         ring size | 2^128
-----------------------------------+----------
 Multi DC replication settings
-----------------------------------+----------
        max number of joinable DCs | 2
           number of replicas a DC | 1
-----------------------------------+----------
 Manager RING hash
-----------------------------------+----------
                 current ring-hash | 433fe365
                previous ring-hash | 433fe365
-----------------------------------+----------

 [State of Node(s)]
-------+--------------------------+--------------+----------------+----------------+----------------------------
 type  |           node           |    state     |  current ring  |   prev ring    |          updated at
-------+--------------------------+--------------+----------------+----------------+----------------------------
  S    | storage_0@127.0.0.1      | running      | 433fe365       | 433fe365       | 2017-06-27 01:00:50 -0400
  G    | gateway_0@127.0.0.1      | running      | 433fe365       | 433fe365       | 2017-10-09 06:49:48 -0400
-------+--------------------------+--------------+----------------+----------------+----------------------------

This indicates that the storage node is running. However, asking for the details of the storage node returns

$ leofs-adm du storage_0@127.0.0.1
[ERROR] nodedown
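
The disagreement between the two commands above can be checked mechanically. The following is only a minimal sketch, not an official LeoFS tool: the function names `running_storage_nodes` and `check_storage_nodes` are hypothetical, and it assumes that a failing `leofs-adm du` prints a line containing `[ERROR]`, as in the output shown here. It relies only on the `status` and `du` subcommands already used in this post.

```shell
#!/bin/sh
# Hypothetical helper: read the `leofs-adm status` table on stdin and print
# the names of storage nodes (type "S") whose state column says "running".
running_storage_nodes() {
    awk -F'|' '
        # Trim leading and trailing blanks from every column.
        { for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i) }
        $1 == "S" && $3 == "running" { print $2 }
    '
}

# Hypothetical helper: for each node the manager lists as running, request
# its disk usage and report any node that answers with an error instead.
# Assumption: failures appear as a line containing "[ERROR]".
check_storage_nodes() {
    nodes=$(leofs-adm status | running_storage_nodes)
    for node in $nodes; do
        if leofs-adm du "$node" 2>&1 | grep -q '\[ERROR\]'; then
            echo "MISMATCH: $node is listed as running but du failed"
        else
            echo "OK: $node"
        fi
    done
}
```

Calling `check_storage_nodes` on the manager host would, under these assumptions, flag `storage_0@127.0.0.1` in the situation described here, since the manager's table still shows the state last recorded on 2017-06-27 while the node itself does not answer.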

I tried to recover the node, but that failed as well

$ leofs-adm recover-node storage_0@127.0.0.1
[ERROR] Could not connect

I can list the buckets

$ leofs-adm get-buckets
cluster id   | bucket                  | owner  | permissions                            | created at
-------------+-------------------------+--------+----------------------------------------+---------------------------
leofs_1      | workflow                | simon  | Me(full_control)                       | 2017-06-28 20:47:08 -0400
leofs_1      | weather                 | simon  | Me(full_control)                       | 2017-06-26 08:27:26 -0400
leofs_1      | workers                 | simon  | Me(full_control), Everyone(read,write) | 2017-06-26 08:30:30 -0400

But I cannot list the contents of any bucket

$ s3cmd ls s3://weather/
WARNING: Retrying failed request: /?delimiter=/
WARNING: 500 (InternalError): We encountered an internal error. Please try again.
WARNING: Waiting 3 sec...
WARNING: Retrying failed request: /?delimiter=/
WARNING: 500 (InternalError): We encountered an internal error. Please try again.

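I don't know how to recover the node, and I haven't found any help online. Upgrading LeoFS is not an option, because I could not get Python boto2 to communicate with later versions of LeoFS.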

Regards,

Simon

0 answers:

No answers yet