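I have installed a Hadoop 2.6.0 multi-node cluster on EC2 instances (Ubuntu 14.04, 64-bit). All the daemons on the master (NameNode, SecondaryNameNode, ResourceManager) start, but on the slave machine only the DataNode starts; the NodeManager shuts down with a connection-refused error.
Please help me with this. Thanks in advance.
My NodeManager log file is as follows: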
2015-09-08 07:59:36,606 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (992.5 M). Thrashing might happen.
2015-09-08 07:59:36,613 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager for null: physical-memory=8192 virtual-memory=17204 virtual-cores=8
2015-09-08 07:59:36,646 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,666 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 53949
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
2015-09-08 07:59:36,688 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
2015-09-08 07:59:36,691 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,692 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 53949: starting
2015-09-08 07:59:36,707 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : ec2-52-88-167-9.us-west-2.compute.amazonaws.com:53949
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-09-08 07:59:36,713 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
2015-09-08 07:59:36,716 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting
2015-09-08 07:59:36,717 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager started at ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154:53949
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager bound to 0.0.0.0/0.0.0.0:0
2015-09-08 07:59:36,719 INFO org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer: Instantiating NMWebApp at 0.0.0.0:8042
2015-09-08 07:59:36,790 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-09-08 07:59:36,793 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.nodemanager is not defined
2015-09-08 07:59:36,805 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context node
2015-09-08 07:59:36,806 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-09-08 07:59:36,807 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /node/*
2015-09-08 07:59:36,812 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2015-09-08 07:59:36,820 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
2015-09-08 07:59:36,820 INFO org.mortbay.log: jetty-6.1.26
2015-09-08 07:59:36,863 INFO org.mortbay.log: Extract jar:file:/home/ubuntu/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
2015-09-08 07:59:37,358 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2015-09-08 07:59:37,359 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /node started at 8042
2015-09-08 07:59:37,879 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2015-09-08 07:59:37,885 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
2015-09-08 07:59:37,913 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2015-09-08 07:59:37,917 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2015-09-08 07:59:38,951 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:39,956 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:40,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:41,957 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 07:59:42,958 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-09-08 08:19:48,256 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 18 more
2015-09-08 08:19:48,257 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:197)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:264)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:463)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.net.ConnectException: Call From ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor8.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy27.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy28.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:191)
... 6 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 18 more
2015-09-08 08:19:48,263 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2015-09-08 08:19:48,264 INFO org.apache.hadoop.ipc.Server: Stopping server on 53949
2015-09-08 08:19:48,266 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 53949
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,267 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2015-09-08 08:19:48,267 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
2015-09-08 08:19:48,268 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-09-08 08:19:48,269 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2015-09-08 08:19:48,269 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2015-09-08 08:19:48,270 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2015-09-08 08:19:48,270 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hdfstmp</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://ec2-52-26-161-203.us-west-2.compute.amazonaws.com:8021</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
Master machine:
ubuntu@ec2-52-26-161-203:~$ vim /etc/hosts
172.31.23.167 ec2-52-26-161-203.us-west-2.compute.amazonaws.com
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
ubuntu@ec2-52-26-161-203:~$ vim /etc/hadoop/masters
ec2-52-26-161-203.us-west-2.compute.amazonaws.com
ubuntu@ec2-52-26-161-203:~$ vim /etc/hadoop/slaves
ec2-52-88-167-9.us-west-2.compute.amazonaws.com
Slave machine:
ubuntu@ec2-52-88-167-9:~$ vim /etc/hosts
172.31.29.154 ec2-52-88-167-9.us-west-2.compute.amazonaws.com
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
ubuntu@ec2-52-88-167-9:~$ vim /etc/hadoop/slaves
ec2-52-88-167-9.us-west-2.compute.amazonaws.com
ubuntu@ec2-52-26-161-203:~$ sudo netstat -lpten | grep java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1000 569904 19910/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1000 570916 20136/java
tcp 0 0 172.31.23.167:8020 0.0.0.0:* LISTEN 1000 569911 19910/java
tcp6 0 0 :::8088 :::* LISTEN 1000 571699 20278/java
tcp6 0 0 :::8030 :::* LISTEN 1000 571690 20278/java
tcp6 0 0 :::8031 :::* LISTEN 1000 571683 20278/java
tcp6 0 0 :::8032 :::* LISTEN 1000 571695 20278/java
tcp6 0 0 :::8033 :::* LISTEN 1000 571702 20278/java
Telnet command:
ubuntu@ec2-52-26-161-203:~$ telnet localhost 8031
Trying ::1...
Connected to localhost.
Escape character is '^]'.
How is the ResourceManager using port 8031? I have not specified that port in any of the Hadoop configuration files above (core-site.xml, mapred-site.xml, hdfs-site.xml).
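Answer 0 (score: 4)
I modified mapred-site.xml and yarn-site.xml, and that solved my problem. Because I had not set the ResourceManager hostname property in yarn-site.xml, the NodeManager was trying to connect to the address 0.0.0.0, which caused the connection-refused exception.
mapred-site.xml: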
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ec2-52-26-161-203.us-west-2.compute.amazonaws.com</value>
</property>
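To see why the missing property produces 0.0.0.0:8031, here is a minimal Python sketch of how the registration address falls back to defaults. It assumes the defaults documented in yarn-default.xml for Hadoop 2.x (`yarn.resourcemanager.hostname` is `0.0.0.0` and the resource-tracker port is 8031). The helper names `read_properties` and `resource_tracker_address` are hypothetical, and the sketch does not expand `${...}` variables the way Hadoop does.

```python
import xml.etree.ElementTree as ET

# Assumed defaults, taken from yarn-default.xml for Hadoop 2.x.
DEFAULT_RM_HOSTNAME = "0.0.0.0"
DEFAULT_TRACKER_PORT = 8031


def read_properties(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def resource_tracker_address(props):
    """Return the host:port a NodeManager would use to register with the RM."""
    explicit = props.get("yarn.resourcemanager.resource-tracker.address")
    if explicit:
        return explicit
    host = props.get("yarn.resourcemanager.hostname", DEFAULT_RM_HOSTNAME)
    return "{}:{}".format(host, DEFAULT_TRACKER_PORT)


# A yarn-site.xml with no ResourceManager settings, as in the question.
empty_site = "<configuration></configuration>"

# The yarn-site.xml from this answer.
fixed_site = """
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>ec2-52-26-161-203.us-west-2.compute.amazonaws.com</value>
  </property>
</configuration>
"""

print(resource_tracker_address(read_properties(empty_site)))
print(resource_tracker_address(read_properties(fixed_site)))
```

The first line printed is the wildcard address from the log; the second is the master's hostname with port 8031. This also explains the port in the question: 8031 is the default resource-tracker port, so it shows up in netstat without being configured anywhere.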
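Answer 1 (score: 1)
The Hadoop documentation at http://wiki.apache.org/hadoop/ConnectionRefused clearly says:
Make sure the destination address in the exception isn't 0.0.0.0 - this means that you haven't actually configured the client with the real address for that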
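One way to apply that check to a NodeManager log is sketched below in Python. The regular expression is written against the "Call From ... to ... failed" message format seen in the log in the question, so treat it as an assumption about the format rather than a stable contract. The function name `wildcard_destinations` is hypothetical.

```python
import re

# Matches the destination in Hadoop IPC messages such as
# "Call From <host>/<ip> to 0.0.0.0:8031 failed on connection exception".
CALL_TO = re.compile(r"Call From \S+ to (?P<host>[^\s:]+):(?P<port>\d+) failed")


def wildcard_destinations(log_text):
    """Return the sorted ports that the client tried to reach at 0.0.0.0."""
    ports = set()
    for match in CALL_TO.finditer(log_text):
        if match.group("host") == "0.0.0.0":
            ports.add(int(match.group("port")))
    return sorted(ports)


# One message copied from the log in the question.
sample = (
    "java.net.ConnectException: Call From "
    "ec2-52-88-167-9.us-west-2.compute.amazonaws.com/172.31.29.154 "
    "to 0.0.0.0:8031 failed on connection exception: "
    "java.net.ConnectException: Connection refused"
)
print(wildcard_destinations(sample))  # prints [8031]
```

A non-empty result means the client side is still using the wildcard default instead of the real server address.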
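Could you try adding the master's IP entry to the hosts file on the slave machine, and the slave's IP entry to the hosts file on the master? Also, comment out any entries in the hosts file that are not needed.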
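After editing the hosts files, it may help to confirm from the slave that the master's name resolves and that the port accepts connections. The following is a small Python sketch of such a probe, roughly equivalent to `telnet host port`. The function name `can_connect` and the 3-second timeout are illustrative choices, not anything Hadoop provides.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Name-resolution failures, refused connections, and timeouts are all
    reported as False, because each of them is a subclass of OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run on the slave, `can_connect("ec2-52-26-161-203.us-west-2.compute.amazonaws.com", 8031)` should return True once the ResourceManager is reachable. If it returns False while the same call succeeds on the master, the hosts entries or the EC2 security group rules are the likely places to look.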