Question

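I read in Hadoop Operations that if a datanode fails during a write,

a new replication pipeline containing the remaining datanodes is opened and the write resumes. At this point, things are mostly back to normal and the write operation continues until the file is closed. The namenode will notice that one of the blocks in the file is under-replicated and will arrange for a new replica to be created asynchronously. A client can recover from multiple failed datanodes provided at least a minimum number of replicas are written (by default, this is one).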

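But what if all of the datanodes fail, so that the minimum number of replicas is not written? Will the client ask the namenode for a new list of datanodes, or will the job fail?

Note: My question is not about what happens when every datanode in the cluster fails. It is about what happens when all the datanodes the client was supposed to write to fail during the write operation.

Suppose the namenode tells the client to write block B1 to datanodes D1 in Rack1, D2 in Rack2, and D3 in Rack1. There may be other racks in the cluster (Rack 4, 5, 6, ...). If Rack1 and Rack2 fail during the write, the client knows the data was not written successfully because it did not receive an ACK from the datanodes. At that point, will it ask the namenode for a new set of datanodes, possibly in the racks that are still alive?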

Answer 1

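OK, I understand what you are asking. DFSClient gets from the namenode a list of datanodes to which it should write a block (say A) of a file. DFSClient iterates over that list of datanodes and writes block A to those locations. If the block write fails on the first datanode, it abandons the block and asks the namenode for a new set of datanodes on which it can try the write again.

Here is sample code from DFSClient that shows this: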

private DatanodeInfo[] nextBlockOutputStream(String client) throws IOException {
    //----- other code ------
    do {
        hasError = false;
        lastException = null;
        errorIndex = 0;
        retry = false;
        nodes = null;
        success = false;

        long startTime = System.currentTimeMillis();
        lb = locateFollowingBlock(startTime);
        block = lb.getBlock();
        accessToken = lb.getBlockToken();
        nodes = lb.getLocations();

        //
        // Connect to first DataNode in the list.
        //
        success = createBlockOutputStream(nodes, clientName, false);

        if (!success) {
            LOG.info("Abandoning block " + block);
            namenode.abandonBlock(block, src, clientName);

            // Connection failed.  Let's wait a little bit and retry
            retry = true;
            try {
                if (System.currentTimeMillis() - startTime > 5000) {
                    LOG.info("Waiting to find target node: " + nodes[0].getName());
                }
                Thread.sleep(6000);
            } catch (InterruptedException iex) {
            }
        }
    } while (retry && --count >= 0);

    if (!success) {
        throw new IOException("Unable to create new block.");
    }
    return nodes;
}
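
To make the control flow of that excerpt easier to follow, here is a minimal, self-contained sketch of the same abandon-and-retry loop. It is not Hadoop code: `BlockWriteRetrySketch`, `PipelineSource`, `allocateBlock`, and the `deadNodes` set are hypothetical names invented for illustration, datanodes are plain strings, and the sketch assumes pipeline setup fails whenever any offered target is dead. It covers only the block-allocation retry shown above, and it omits the abandonBlock call and the sleep between attempts.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class BlockWriteRetrySketch {

    /** Hypothetical stand-in for the namenode call that returns target datanodes. */
    interface PipelineSource {
        List<String> nextPipeline() throws IOException;
    }

    /**
     * Asks for a set of targets, checks whether the pipeline can be set up,
     * and on failure asks again, for at most maxRetries additional attempts.
     */
    static List<String> allocateBlock(PipelineSource namenode,
                                      Set<String> deadNodes,
                                      int maxRetries) throws IOException {
        int count = maxRetries;
        List<String> nodes;
        boolean success;
        do {
            nodes = namenode.nextPipeline();
            // Assumption: setup succeeds only if none of the targets is dead.
            success = !nodes.isEmpty() && Collections.disjoint(nodes, deadNodes);
            // On failure the real client abandons the block and waits before retrying.
        } while (!success && --count >= 0);

        if (!success) {
            throw new IOException("Unable to create new block.");
        }
        return nodes;
    }

    public static void main(String[] args) throws IOException {
        // First offer sits entirely on failed nodes; the second offer is healthy.
        Iterator<List<String>> offers = Arrays.asList(
                Arrays.asList("D1", "D2", "D3"),
                Arrays.asList("D4", "D5", "D6")).iterator();
        Set<String> dead = new HashSet<>(Arrays.asList("D1", "D2", "D3"));
        System.out.println(allocateBlock(offers::next, dead, 3)); // prints [D4, D5, D6]
    }
}
```

In this sketch the client moves on to the second set of datanodes once the first set turns out to be dead. If every offered set fails, the loop gives up after the initial attempt plus `maxRetries` retries and throws the same "Unable to create new block." exception as the excerpt, so the write fails at that point.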

What happens when all datanodes fail in Hadoop?
