Trying to get Hadoop working in pseudo-distributed mode: Connection refused and other errors

Time: 2017-03-15 02:05:01

Tags: hadoop ssh hdfs

I've installed Hadoop 2.7.3 on my Linux Mint 17.1 machine and am following the Apache tutorial to get it running. I've been following the instructions on that page closely, and I've gotten to the point where I can ssh into localhost and run start-dfs.sh and start-yarn.sh. I've also formatted the namenode.
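For reference, the sequence of commands is roughly what the tutorial gives (a sketch; it assumes Hadoop's bin and sbin directories are on the PATH):

# Format the filesystem (done once, before the first start)
hdfs namenode -format
# Start the NameNode, SecondaryNameNode and DataNode daemons
start-dfs.sh
# Start the ResourceManager and NodeManager daemons
start-yarn.sh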

My core-site.xml file is edited according to the tutorial:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

As is hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

However, running the command hadoop fs -mkdir /test gives the following error:

mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "MILTON/127.0.1.1"; destination host is: "localhost":9000;

Running jps gives me this output:

15388 Jps
14966 ResourceManager
14615 DataNode
15077 NodeManager
14787 SecondaryNameNode

Looking at my log files, I see more detailed errors and warnings that may be related. In hadoop-user-namenode-MILTON.log, I see this error:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-user/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:327)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:585)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:645)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)

In hadoop-user-secondarynamenode-MILTON.log, I see the full stack trace of the exception I got on the command line:

java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "MILTON/127.0.1.1"; destination host is: "localhost":9000; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy9.getTransactionId(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getTransactionID(Unknown Source)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:641)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:649)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:393)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
    at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
    at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2207)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2165)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2295)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2290)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:3167)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1085)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)

Earlier in that log I also see this exception:

java.net.ConnectException: Call From MILTON/127.0.1.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy9.getTransactionId(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:128)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getTransactionID(Unknown Source)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:641)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:649)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:393)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 18 more

Meanwhile, the datanode log repeatedly records messages like this one:

2017-03-14 20:50:37,785 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: localhost/127.0.0.1:9000

I've also noticed that I can't reach the namenode's web interface as the tutorial describes; trying to go to http://localhost:9870/ gives me an ERR_CONNECTION_REFUSED. I'm guessing the namenode isn't starting at all; when I run stop-dfs.sh, it says "localhost: no namenode to stop". What am I doing wrong that is causing the tutorial's steps to fail?

Note: I know many similar questions have already been posted. They tend to have in common that they are more complicated than my setup (e.g. fully-distributed mode, rather than just running the basic tutorial), or that they found solutions that diverge from the tutorial. I'm interested in learning, specifically, why I can't get the tutorial to work for me, and in solving that before applying other changes I don't yet understand. Unless the Apache tutorial is really wrong.

Update 2017-3-20: I followed franklinsijo's advice. After adding the property he recommended and formatting the namenode, I get this exception in the namenode log file:

java.net.BindException: Problem binding to [localhost:9000] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
    at org.apache.hadoop.ipc.Server.bind(Server.java:425)
    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:574)
    at org.apache.hadoop.ipc.Server.<init>(Server.java:2215)
    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:951)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:534)
    at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509)
    at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:796)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:345)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:674)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:647)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:812)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:796)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1493)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1559)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:463)
    at sun.nio.ch.Net.bind(Net.java:455)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.apache.hadoop.ipc.Server.bind(Server.java:408)
    ... 13 more

Commenting out the "Port 9000" line in my /etc/ssh/sshd_config file does not change this exception.
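In case it helps with diagnosing the BindException, this is how one could check which process actually holds port 9000 (a sketch; it assumes net-tools or lsof is installed):

# Show the PID and name of whatever is listening on port 9000
sudo netstat -tlnp | grep ':9000'
# Alternatively, with lsof
sudo lsof -i :9000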

1 answer:

Answer 0 (score: 0)

By default, hadoop.tmp.dir is /tmp/hadoop-${user.name}.

This hadoop.tmp.dir is used as the base directory for dfs.namenode.name.dir, which defaults to file://${hadoop.tmp.dir}/dfs/name. Files in /tmp are lost on restart, so after a reboot the previously formatted namenode directory is no longer available.

The NameNode daemon cannot start without a consistent storage directory. Hence the exception

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-user/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible
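You can verify this on your machine; a quick sketch (hadoop.tmp.dir resolves to /tmp/hadoop-<username> by default):

# Print the effective value of hadoop.tmp.dir
hdfs getconf -confKey hadoop.tmp.dir
# After a reboot, the formatted name directory under /tmp is typically gone
ls /tmp/hadoop-$USER/dfs/name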

Add this property to core-site.xml:

<property>
   <name>hadoop.tmp.dir</name>
   <value>/home/user/dfs</value> <!-- Any directory, other than /tmp, that exists and the user running hadoop has access -->
</property>
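Whichever directory you pick must exist and be writable by the user running the daemons; for example (a sketch, with "user" standing in for the actual account):

# Create the base directory and hand it to the hadoop user
sudo mkdir -p /home/user/dfs
sudo chown -R user:user /home/user/dfs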

Or explicitly define the directories used to store metadata and data blocks by adding these properties to hdfs-site.xml:

<property>
   <name>dfs.namenode.name.dir</name>
   <value>/home/user/dfs/name</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>/home/user/dfs/data</value>
</property>

Note: You can also do both at the same time.

Then format the NameNode and start the services. The other errors come from services that cannot contact the NameNode daemon (because it is not running). The web interface is accessible only while the NameNode is alive.
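Once the NameNode is up, a quick sanity check could look like this (a sketch; 50070 is the default NameNode web UI port in Hadoop 2.x):

# The NameNode should now show up in the process list
jps | grep NameNode
# The cluster summary should be reachable
hdfs dfsadmin -report
curl http://localhost:50070/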