What is the best practice for configuring Apache Spark standalone on port 7077?

Asked: 2016-09-07 19:19:55

Tags: linux shell apache-spark

I have two machines on the same network. On the master machine I run ./sbin/start-master.sh, and on the other machine (let's call it the slave machine) I run ./bin/spark-shell --master spark://master-machine:7077. However, running the spark-shell command on the slave machine fails with the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/09/07 18:56:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/07 18:56:38 WARN Utils: Your hostname, <HOSTNAME> resolves to a loopback address: 127.0.0.1; using 192.168.0.68 instead (on interface wlan0)
16/09/07 18:56:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/09/07 18:56:39 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master <master-machine's IP>:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:109)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to connect to /192.168.43.27:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    ... 4 more
Caused by: java.net.ConnectException: Connection refused: /<master-machine's IP>:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more

I also tried the master machine's IP instead of the hostname, but that did not work either.

On the master machine, netstat -nltu shows the following:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address            Foreign Address         State      
tcp        0      0 0.0.0.0:22               0.0.0.0:*               LISTEN     
tcp6       0      0 :::8080                  :::*                    LISTEN     
tcp6       0      0 127.0.1.1:6066           :::*                    LISTEN     
tcp6       0      0 :::22                    :::*                    LISTEN     
tcp6       0      0 127.0.1.1:7077           :::*                    LISTEN     
udp        0      0 0.0.0.0:27535            0.0.0.0:*                          
udp        0      0 0.0.0.0:68               0.0.0.0:*                          
udp6       0      0 :::54678                 :::*    

Port 7077 is bound to the loopback address 127.0.1.1, so it only accepts connections from localhost and not from other machines on the network.
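This 127.0.1.1 binding is typically caused by the Debian/Ubuntu convention of mapping the machine's hostname to 127.0.1.1 in /etc/hosts: the Spark master resolves the local hostname and binds to whatever address it gets back. A rough way to check (the probe command is a sketch; substitute the master's real IP):

```shell
# On the master: show what the local hostname resolves to. If this prints
# 127.0.1.1, the standalone master will bind port 7077 to the loopback
# interface and be unreachable from other machines.
getent hosts "$(hostname)"

# From the slave machine you could then probe the port, e.g.:
#   nc -zv <master-machine IP> 7077
```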

I tried most of the solutions I found online, but none of them worked until I set SPARK_MASTER_HOST to the master machine's IP. After that, netstat -nltu shows:
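For reference, this setting is normally made persistent in conf/spark-env.sh rather than exported in the shell by hand. A minimal sketch, assuming the master's LAN address is 192.168.0.68 (the address from the Utils warning in the log above):

```shell
# conf/spark-env.sh on the master machine (create it from
# conf/spark-env.sh.template if it does not exist yet).
# Bind the standalone master to the network-facing address instead of the
# loopback 127.0.1.1 entry; substitute your master's real LAN IP.
export SPARK_MASTER_HOST=192.168.0.68
```

After restarting the master with ./sbin/stop-master.sh followed by ./sbin/start-master.sh, workers and spark-shell can connect with --master spark://192.168.0.68:7077.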

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address            Foreign Address         State      
tcp        0      0 0.0.0.0:22               0.0.0.0:*               LISTEN     
tcp6       0      0 :::8080                  :::*                    LISTEN     
tcp6       0      0 <master-machine IP>:6066 :::*                    LISTEN     
tcp6       0      0 :::22                    :::*                    LISTEN     
tcp6       0      0 <master-machine IP>:7077 :::*                    LISTEN     
udp        0      0 0.0.0.0:27535            0.0.0.0:*                          
udp        0      0 0.0.0.0:68               0.0.0.0:*                          
udp6       0      0 :::54678                 :::* 

The problem is solved now, but what is the best practice for fixing this? What strikes me as odd is that, reasonably, you would expect port 7077 to accept connections not only from localhost but also from other machines by default.

PS: The Spark version is spark-2.0.0-bin-hadoop2.7.

0 Answers