Hadoop: binding multiple IP addresses to a cluster NameNode

Date: 2014-08-05 09:17:41

Tags: java hadoop network-programming cluster-computing distributed-computing

I have a four-node Hadoop cluster on Softlayer. The master (NameNode) has a public IP address for external access and a private IP address for cluster access. The slave nodes (datanodes) have only private IP addresses, and I am trying to have them connect to the master without assigning a public IP address to each slave node.

I have realized that setting fs.defaultFS to the NameNode's public address allows external access, except that the NameNode then listens for incoming connections only on that address, not on the private one. So I get ConnectionRefused exceptions in the datanode logs as they try to connect to the NameNode's private IP address.

I think the solution may be to have the NameNode bind to both the public and private IP addresses, so that external access is preserved and my slave nodes can connect as well.

So is there a way to bind both addresses to the NameNode so that it listens on both at the same time?

Edit: Hadoop version 2.4.1.

2 answers:

Answer 0 (score: 2)

The asker edited the answer to this question into the question itself:

In hdfs-site.xml, set the value of dfs.namenode.rpc-bind-host to 0.0.0.0 and Hadoop will listen on both the private and public network interfaces, allowing both remote access and datanode access.

Answer 1 (score: 0)

This is covered in HDFS Support for Multihomed Networks in the Apache documentation, in HDFS Support for Multihomed Networks from Cloudera, and in Parameters for Multi-Homing from Hortonworks:

<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the RPC server will bind to. If this optional address is
    set, it overrides only the hostname portion of dfs.namenode.rpc-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node listen on all interfaces by
    setting it to 0.0.0.0.
  </description>
</property>

In addition, it is recommended to change dfs.namenode.rpc-bind-host, dfs.namenode.servicerpc-bind-host, dfs.namenode.http-bind-host, and dfs.namenode.https-bind-host:

By default, HDFS endpoints are specified as either hostnames or IP addresses. In either case, an HDFS daemon will bind to a single IP address, making it unreachable from other networks.

The solution is to have separate settings for the server endpoints that force binding to the wildcard address INADDR_ANY, i.e. 0.0.0.0. Do not supply a port number with any of these settings.

Note: prefer hostnames over IP addresses in the master/slave configuration files.

<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the RPC server will bind to. If this optional address is
    set, it overrides only the hostname portion of dfs.namenode.rpc-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node listen on all interfaces by
    setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.servicerpc-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the service RPC server will bind to. If this optional address is
    set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node listen on all interfaces by
    setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.http-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the HTTP server will bind to. If this optional address
    is set, it overrides only the hostname portion of dfs.namenode.http-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node HTTP server listen on all
    interfaces by setting it to 0.0.0.0.
  </description>
</property>

<property>
  <name>dfs.namenode.https-bind-host</name>
  <value>0.0.0.0</value>
  <description>
    The actual address the HTTPS server will bind to. If this optional address
    is set, it overrides only the hostname portion of dfs.namenode.https-address.
    It can also be specified per name node or name service for HA/Federation.
    This is useful for making the name node HTTPS server listen on all
    interfaces by setting it to 0.0.0.0.
  </description>
</property>
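As a quick sanity check after making these edits, the four bind-host properties can be verified with grep. The sketch below writes the properties to a scratch copy of hdfs-site.xml purely for illustration (the /tmp path and single-line property layout are assumptions, not the real file on your NameNode):

```shell
# Write the four bind-host properties to a scratch hdfs-site.xml, then
# confirm each one is present and set to 0.0.0.0. On a real cluster,
# edit the actual hdfs-site.xml under your Hadoop conf directory instead.
cat > /tmp/hdfs-site-sample.xml <<'EOF'
<configuration>
  <property><name>dfs.namenode.rpc-bind-host</name><value>0.0.0.0</value></property>
  <property><name>dfs.namenode.servicerpc-bind-host</name><value>0.0.0.0</value></property>
  <property><name>dfs.namenode.http-bind-host</name><value>0.0.0.0</value></property>
  <property><name>dfs.namenode.https-bind-host</name><value>0.0.0.0</value></property>
</configuration>
EOF
for key in dfs.namenode.rpc-bind-host dfs.namenode.servicerpc-bind-host \
           dfs.namenode.http-bind-host dfs.namenode.https-bind-host; do
  grep -q "<name>$key</name><value>0.0.0.0</value>" /tmp/hdfs-site-sample.xml \
    && echo "$key OK"
done
```

Each property name prints with an "OK" suffix when it is present and set to the wildcard address.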
  

Note: before making the changes, stop the agent and the server as follows:

1. service cloudera-scm-agent stop
2. service cloudera-scm-server stop

If your cluster is configured with a primary and a secondary NameNode, this change needs to be made on both nodes, again with the server and agent stopped.

Once you have finished and saved the hdfs-site.xml file, start the server and agent on the NameNodes, and on the DataNodes as well (doing so will not harm the cluster):

1. service cloudera-scm-agent start
2. service cloudera-scm-server start

The same solution can be applied to IBM BigInsights:

    To configure HDFS to bind to all the interfaces, add the following configuration variable using Ambari under the section HDFS -> Configs -> Advanced -> Custom hdfs-site:


    dfs.namenode.rpc-bind-host = 0.0.0.0

    Restart HDFS to apply the configuration change.

    Verify that port 8020 is bound and listening to requests from all the interfaces using the following command:

    netstat -anp|grep 8020
    tcp 0 0 0.0.0.0:8020 0.0.0.0:* LISTEN 15826/java
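What that check means can be sketched in a couple of lines: extract the local bind address from a netstat line and confirm it is the wildcard address rather than a single interface. The sample line below is the one shown above, hard-coded so the sketch runs without a live cluster:

```shell
# Fourth netstat column is the local address:port; a NameNode bound to
# the wildcard address shows 0.0.0.0:8020 rather than a specific IP.
line="tcp 0 0 0.0.0.0:8020 0.0.0.0:* LISTEN 15826/java"
bind_addr=$(echo "$line" | awk '{print $4}' | cut -d: -f1)
[ "$bind_addr" = "0.0.0.0" ] && echo "port 8020 is listening on all interfaces"
```

If the address shown were the NameNode's private or public IP instead of 0.0.0.0, the daemon would be reachable only on that one interface, which is exactly the ConnectionRefused situation described in the question.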

IBM BigInsights: How to configure Hadoop client port 8020 to bind to all the network interfaces?

In Cloudera, the HDFS configuration has a property named Bind NameNode to Wildcard Address; simply check the box and it will bind the service to 0.0.0.0.

Then restart the HDFS service:

 On the Home > Status tab, click the icon to the right of the service
 name and select Restart. Click Start on the next screen to confirm.
 When you see a Finished status, the service has restarted.

Starting, Stopping, Refreshing, and Restarting a Cluster
Starting, Stopping, and Restarting Services