Spark standalone cluster: slave cannot connect to master

Asked: 2016-02-22 09:07:34

Tags: java linux ssh apache-spark cluster-computing

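I am using the spark-1.6.0-bin-hadoop2.6 binary distribution and cannot get a slave to connect to the master. So far I have tried the following (on an Ubuntu 14.04 live USB):

  1. Purging and reinstalling openssh-client and openssh-server with apt-get on both systems.

  2. Explicitly specifying the master's IP address in the worker's Spark URL, spark://< master ip>:7077, and also changing SPARK_MASTER_IP in /conf/spark-env.sh. The worker log still shows the error below. I assume some SSH setup is required, but I have already tried ssh-keygen and ssh-copy-id @ and that did not help either.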

    16/02/22 07:49:16 INFO Worker: Connecting to master 192.168.0.208:7077...
    16/02/22 07:49:16 WARN Worker: Failed to connect to master 192.168.0.208:7077
    java.io.IOException: Failed to connect to /192.168.0.208:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.ConnectException: Connection refused: /192.168.0.208:7077
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:740)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
    

    16/02/22 07:49:27 INFO Worker: Retrying connection to master (attempt # 2)

  3. However, I can open the master web UI by typing :8080 in a browser, and I can also reach the slave's web UI from the master. Any help would be appreciated, thanks.

1 Answer:

Answer 0: (score: 0)

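Make sure every master and worker has firewall exceptions that allow connections from all the other workers and masters.

Here is a simplified example from one of our hosts (master0):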

    $ iptables -L
    ...
    ACCEPT     all  --  worker0.company.com  master0.company.com
    ACCEPT     all  --  worker1.company.com  master0.company.com
    ACCEPT     all  --  master1.company.com  master0.company.com
    ...

Of course, you can also use IP addresses instead of hostnames.
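To check whether a firewall exception actually took effect, it can help to test the master port directly from the worker machine. Below is a minimal sketch in Python using only the standard library; the helper name `probe_port` is hypothetical and not part of Spark. It only reports how a plain TCP connection attempt ends, so treat the result as a hint rather than a diagnosis.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a plain TCP connect and report how the attempt ended.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a rejecting firewall rule) actively refused the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout, e.g. packets silently dropped.
        return "timeout"
    except OSError as exc:
        # Anything else: unreachable network, name resolution failure, etc.
        return f"error: {exc}"
```

Run on the worker, `probe_port("192.168.0.208", 7077)` would exercise the same address and port that appear in the log above, and `probe_port("192.168.0.208", 8080)` the web UI port. A "refused" result usually means that nothing is listening on that address and port, or that a firewall rule is rejecting the connection; a "timeout" more often points to packets being dropped along the way.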