Unable to connect to master with Spark on Google Compute Engine

Asked: 2015-04-09 09:23:32

Tags: apache-spark google-compute-engine

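I am trying out a Hadoop/Spark cluster on Google Compute Engine, set up through the "Launch click-to-deploy software" feature.

I created 1 master and 2 slave nodes. I can launch spark-shell on the cluster itself, but launching spark-shell from my own computer fails.

I run: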

./bin/spark-shell --master spark://IP or Hostname:7077

and I get this stack trace:

15/04/09 10:58:06 INFO AppClient$ClientActor: Connecting to master
akka.tcp://sparkMaster@IP or Hostname:7077/user/Master...
15/04/09 10:58:06 WARN AppClient$ClientActor: Could not connect to
akka.tcp://sparkMaster@IP or Hostname:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@IP or Hostname:7077
15/04/09 10:58:06 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@IP or Hostname:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: IP or Hostname: unknown error

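Please let me know how to overcome this problem.

1 Answer:

Answer 0 (score: 2)

See Daniel Darabos's comment. By default, all incoming connections are blocked except SSH, RDP and ICMP. To be able to connect to the Hadoop master instance from the Internet, you first have to open port 7077 for the 'hadoop-master' tag in your project: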

gcloud compute --project PROJECT firewall-rules create allow-spark \
    --allow TCP:7077 \
    --target-tags hadoop-master
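
Once the rule exists, it may help to confirm that port 7077 is actually reachable from your own machine before starting spark-shell again. The following is only a minimal sketch of such a check: the helper name `port_is_open` is made up for illustration, and it tests plain TCP reachability, nothing Spark-specific.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Spark or gcloud.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_open("MASTER_EXTERNAL_IP", 7077)`, where `MASTER_EXTERNAL_IP` is a placeholder for the master's external address. A `True` result only shows that the firewall lets the connection through; it does not prove that the Spark master will accept the application.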

For more details and all the available options, see "Firewalls", "Adding a firewall", and "gcloud compute firewall-rules create" in the GCE public documentation.