NodeManager fails to start in Hadoop 2.6.0 (Connection refused)

Time: 2015-09-08 10:09:08

Tags: linux hadoop amazon-ec2 client-server connection-refused

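I have set up a Hadoop 2.6.0 multi-node cluster on EC2 instances (Ubuntu 14.04, 64-bit). All the daemons on the master (NameNode, SecondaryNameNode, ResourceManager) start, but on the slave machine only the DataNode starts; the NodeManager shuts down with a connection-refused error.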

Please help me with this. Thanks in advance.

My NodeManager log file is as follows:

2015-09-08 07:59:36,606 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (992.5 M). Thrashing might happen.
2015-09-08 07:59:36,613 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager for null: physical-memory=8192 virtual-memory=17204 virtual-cores=8
2015-09-08 07:59:36,646 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,666 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 53949
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
2015-09-08 07:59:36,691 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,692 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 53949: starting
2015-09-08 07:59:36,707 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : ec2-52-88-167-9.us-west-2.compute.amazonaws.com:53949
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
2015-09-08 07:59:36,716 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager started at ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154:53949
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager bound to 0.0.0.0/0.0.0.0:0
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer: Instantiating NMWebApp at 0.0.0.0:8042
2015-09-08 07:59:36,790 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-09-08 07:59:36,793 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.nodemanager is not defined
2015-09-08 07:59:36,805 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context node
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-09-08 07:59:36,807 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /node/*
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2015-09-08 07:59:36,820 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
2015-09-08 07:59:36,820 INFO org.mortbay.log: jetty-6.1.26
2015-09-08 07:59:36,863 INFO org.mortbay.log: Extract jar:file:/home/ubuntu/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
2015-09-08 07:59:37,358 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2015-09-08 07:59:37,359 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /node started at 8042
2015-09-08 07:59:37,879 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2015-09-08 07:59:37,885 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
2015-09-08 07:59:37,913 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2015-09-08 07:59:37,917 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2015-09-08 07:59:38,951 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:39,956 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:40,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:41,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:42,958 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-09-08 08:19:48,256 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
        ... 6 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 18 more
2015-09-08 08:19:48,257 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
        ... 6 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
        at org.apache.hadoop.ipc.Client.call(Client.java:1438)
        ... 18 more
2015-09-08 08:19:48,263 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2015-09-08 08:19:48,264 INFO org.apache.hadoop.ipc.Server: Stopping server on 53949
2015-09-08 08:19:48,266 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 53949
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,267 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,269 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2015-09-08 08:19:48,269 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2015-09-08 08:19:48,270 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager

core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/ubuntu/hdfstmp</value>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8021</value>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

Master machine:

ubuntu@ec2-52-26-161-203:~$ vim /etc/hosts

172.31.23.167 ec2-52-26-161-203.us-west-2.compute.amazonaws.com

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

ubuntu@ec2-52-26-161-203:~$ vim /etc/hadoop/masters

ec2-52-26-161-203.us-west-2.compute.amazonaws.com

ubuntu@ec2-52-26-161-203:~$ vim /etc/hadoop/slaves

ec2-52-88-167-9.us-west-2.compute.amazonaws.com

Slave machine:

ubuntu@ec2-52-88-167-9:~$ vim /etc/hosts

172.31.29.154 ec2-52-88-167-9.us-west-2.compute.amazonaws.com

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

ubuntu@ec2-52-88-167-9:~$ vim /etc/hadoop/slaves

ec2-52-88-167-9.us-west-2.compute.amazonaws.com

ubuntu@ec2-52-26-161-203:~$ sudo netstat -lpten | grep java

tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      1000       569904      19910/java      
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      1000       570916      20136/java      
tcp        0      0 172.31.23.167:8020      0.0.0.0:*               LISTEN      1000       569911      19910/java      
tcp6       0      0 :::8088                 :::*                    LISTEN      1000       571699      20278/java      
tcp6       0      0 :::8030                 :::*                    LISTEN      1000       571690      20278/java      
tcp6       0      0 :::8031                 :::*                    LISTEN      1000       571683      20278/java      
tcp6       0      0 :::8032                 :::*                    LISTEN      1000       571695      20278/java      
tcp6       0      0 :::8033                 :::*                    LISTEN      1000       571702      20278/java 

Telnet command:

ubuntu@ec2-52-26-161-203:~$ telnet localhost 8031

Trying ::1...
Connected to localhost.
Escape character is '^]'.

How is the ResourceManager using port 8031? I have not specified it in the Hadoop configuration files above (core-site.xml, mapred-site.xml, hdfs-site.xml).

2 answers:

Answer 0: (score: 4)

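I made changes to mapred-site.xml and yarn-site.xml, which solved my problem. Because I had not set the ResourceManager hostname property in yarn-site.xml, the NodeManager was trying to connect to the address 0.0.0.0, which caused the connection-refused exception.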

mapred-site.xml:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

yarn-site.xml:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ec2-52-26-161-203.us-west-2.compute.amazonaws.com</value>
</property>
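
To illustrate where 0.0.0.0:8031 comes from even though it appears in none of the site files: in Hadoop 2.x, yarn-default.xml sets yarn.resourcemanager.hostname to 0.0.0.0 and yarn.resourcemanager.resource-tracker.address to ${yarn.resourcemanager.hostname}:8031. The Python sketch below only mimics that lookup so the effect of the fix is visible. The function names and the simplified ${...} substitution are assumptions made for illustration; this is not Hadoop's own code.

```python
import xml.etree.ElementTree as ET

# Defaults as shipped in yarn-default.xml for Hadoop 2.x.
YARN_DEFAULTS = {
    "yarn.resourcemanager.hostname": "0.0.0.0",
    "yarn.resourcemanager.resource-tracker.address":
        "${yarn.resourcemanager.hostname}:8031",
}


def load_properties(xml_text):
    """Return {name: value} from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def resource_tracker_address(site_xml_text):
    """Resolve the address a NodeManager would use to register with the RM.

    Hypothetical helper: site values override the defaults, then the
    hostname placeholder is expanded (a simplified stand-in for Hadoop's
    variable substitution).
    """
    conf = dict(YARN_DEFAULTS)
    conf.update(load_properties(site_xml_text))
    address = conf["yarn.resourcemanager.resource-tracker.address"]
    return address.replace(
        "${yarn.resourcemanager.hostname}",
        conf["yarn.resourcemanager.hostname"],
    )


# yarn-site.xml without the hostname property (the failing setup).
EMPTY_SITE = "<configuration></configuration>"

# yarn-site.xml with the property from this answer.
FIXED_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ec2-52-26-161-203.us-west-2.compute.amazonaws.com</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(resource_tracker_address(EMPTY_SITE))
    print(resource_tracker_address(FIXED_SITE))
```

With the empty site file the sketch resolves to 0.0.0.0:8031, the address seen in the NodeManager log; with the hostname property set it resolves to port 8031 on the master. The yarn-site.xml on the slave needs the property too, since that is where the NodeManager reads it.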

Answer 1: (score: 1)

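The Hadoop documentation at http://wiki.apache.org/hadoop/ConnectionRefused clearly says:

  "Make sure the destination address in the exception isn't 0.0.0.0 - this means that you haven't actually configured the client with the real address"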

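Could you try adding the master's IP entry to the hosts file on the slave machine, and the slave's IP entry to the hosts file on the master? Also comment out any entries in the hosts file that are not needed.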
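
Both checks can be done from the slave with a few lines of Python. This is a minimal sketch using only the standard library; the helper names can_connect and is_wildcard_address are made up for this example, and the host and port you pass in should be whatever appears in your own exception message.

```python
import socket


def is_wildcard_address(host):
    """True if host is a bind-all address, which is not a usable destination."""
    return host in ("0.0.0.0", "::")


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Name resolution failures, refused connections and timeouts all
    surface as OSError subclasses and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, run on the slave, can_connect("ec2-52-26-161-203.us-west-2.compute.amazonaws.com", 8031) should return True once the hosts entries and the EC2 security group allow the connection. If is_wildcard_address returns True for the destination in the exception, the client configuration is the problem, not the network.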