Spark cluster startup problem

Asked: 2017-01-11 13:24:56

Tags: hadoop apache-spark cluster-computing iptables

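I am a newbie trying to set up a Spark cluster. I did the following to set up the cluster and check its status, but I am not sure whether it is actually running.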

I tried opening master-ip:8081 (and also 8080, 4040 and 4041) in the browser, but nothing showed up. First, I set up and started the Hadoop cluster.

 JPS gives:

 2436 SecondaryNameNode
 2708 NodeManager
 2151 NameNode
 5495 Master
 2252 DataNode
 2606 ResourceManager
 5710 Jps

Question: is it necessary to start Hadoop at all?

On the master, /usr/local/spark/conf/slaves contains:

 localhost
 slave-node-1
 slave-node-2

Now start Spark. Start the master:

  $SPARK_HOME/sbin/start-master.sh 

Test:
  ps -ef|grep spark
  hduser    5495     1  0 18:12 pts/0    00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/*:/usr/local/hadoop/etc/hadoop/ -Xmx1g org.apache.spark.deploy.master.Master --host master-hostname --port 7077 --webui-port 8080

On slave node 1:

 $SPARK_HOME/sbin/start-slave.sh spark://205.147.102.19:7077

Test with:
 ps -ef|grep spark
 hduser    1847     1 20 18:24 pts/0    00:00:04 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077

The same on slave node 2:

  $SPARK_HOME/sbin/start-slave.sh spark://master-ip:7077
  ps -ef|grep spark
  hduser    1948     1  3 18:18 pts/0    00:00:03 /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master-ip:7077

I cannot see anything for Spark in the web console, so I think the problem may be firewall related. Here are my iptables rules:

  iptables -L -nv
  Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
  6136  587K fail2ban-ssh  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 22
  151K   25M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
  6   280 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           
  579 34740 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
  34860 2856K ACCEPT     all  --  eth1   *       0.0.0.0/0            0.0.0.0/0           
  145  7608 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
  56156 5994K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
  0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080
  0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8081

  Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
  0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited

 Chain OUTPUT (policy ACCEPT 3531 packets, 464K bytes)
 pkts bytes target     prot opt in     out     source               destination         

 Chain fail2ban-ssh (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 2   120 REJECT     all  --  *      *       218.87.109.153       0.0.0.0/0            reject-with icmp-port-unreachable
 5794  554K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

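I am trying my best to find out whether the Spark cluster is set up and how to check it properly. If the cluster is up, why can't I reach it in the web console? What could be wrong? Any pointers would be helpful.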

EDIT: adding the logs after running spark-shell --master local (on the master)

 17/01/11 18:12:46 INFO util.Utils: Successfully started service 'sparkMaster' on port 7077.
 17/01/11 18:12:47 INFO master.Master: Starting Spark master at spark://master:7077
 17/01/11 18:12:47 INFO master.Master: Running Spark version 2.1.0
 17/01/11 18:12:47 INFO util.log: Logging initialized @3326ms
 17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
 17/01/11 18:12:47 INFO handler.ContextHandler: Started   o.s.j.s.ServletContextHandler@20f0b5ff{/app,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@734e74b2{/app/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bc45d76{/,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6a274a23{/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4f5d45d5{/static,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4fb65368{/app/kill,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@76208805{/driver/kill,null,AVAILABLE}
 17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@258dbadd{HTTP/1.1}{0.0.0.0:8080}
 17/01/11 18:12:47 INFO server.Server: Started @3580ms
 17/01/11 18:12:47 INFO util.Utils: Successfully started service 'MasterUI' on port 8080.
 17/01/11 18:12:47 INFO ui.MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://master:8080
 17/01/11 18:12:47 INFO server.Server: jetty-9.2.z-SNAPSHOT
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cfbb7e9{/,null,AVAILABLE}
 17/01/11 18:12:47 INFO server.ServerConnector: Started ServerConnector@2f7af4e{HTTP/1.1}{master:6066}
 17/01/11 18:12:47 INFO server.Server: Started @3628ms
 17/01/11 18:12:47 INFO util.Utils: Successfully started service on port 6066.
 17/01/11 18:12:47 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@799d5f4f{/metrics/master/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@647c46e3{/metrics/applications/json,null,AVAILABLE}
 17/01/11 18:12:47 INFO master.Master: I have been elected leader! New state: ALIVE

On the slave node:

 17/01/11 18:22:46 INFO Worker: Connecting to master master:7077...
 17/01/11 18:22:46 WARN Worker: Failed to connect to master master:7077

Tons of Java errors follow, ending with:

 17/01/11 18:31:18 ERROR Worker: All masters are unresponsive! Giving up.

2 answers:

Answer 0 (score: 1)

The Spark application web UI (port 4040 by default) starts only when a SparkContext is created.

Try running spark-shell --master spark://yourmaster:7077 and then open the Spark UI. You can also submit an application with spark-submit, which will create a SparkContext.

Example spark-submit invocation, from the Spark documentation:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000
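To confirm that workers actually registered, the master's JSON endpoint may be easier to check than the HTML page; the master log in the question shows a /json handler being started on port 8080. The sketch below assumes that the response contains a top-level "status" field and a "workers" list whose entries carry a "state" field. Those field names are an assumption and should be verified against the output of your Spark version; the function names are hypothetical as well.

```python
import json
from urllib.request import urlopen


def fetch_master_state(base_url, timeout=5.0):
    """Fetch and parse <base_url>/json from the standalone master web UI."""
    with urlopen(base_url.rstrip("/") + "/json", timeout=timeout) as resp:
        return json.load(resp)


def summarize(state):
    """Count registered and alive workers in a parsed master state.

    The keys "status", "workers" and "state" are assumed field names;
    missing keys are tolerated and reported as None or zero.
    """
    workers = state.get("workers", [])
    alive = [w for w in workers if w.get("state") == "ALIVE"]
    return {
        "status": state.get("status"),
        "workers": len(workers),
        "alive_workers": len(alive),
    }


# Example (placeholder host from the question):
#   print(summarize(fetch_master_state("http://master-ip:8080")))
```

If the master reports zero workers while the Worker processes are running, the workers most likely failed to reach port 7077, which matches the "Failed to connect to master" warning in the question.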

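To answer the first question: you must start the Hadoop components only if you want to use HDFS or YARN. If you do not use them, they do not need to be started.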

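You can also edit /etc/hosts and remove the 127.0.0.1 line that maps the master's hostname, or set the master host variable in the Spark configuration (SPARK_MASTER_HOST in conf/spark-env.sh; SPARK_MASTER_IP in versions before 2.0) to the correct hostname.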
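One way to see whether the /etc/hosts entry is the issue is to check what the master's hostname resolves to on the master itself. This is a minimal sketch under the assumption that the hostname is "master", as in the log lines of the question; the function name is hypothetical. If the name resolves to a loopback address there, the master may be listening only on loopback, and the workers would not be able to reach port 7077.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return (address, is_loopback) for the IPv4 address hostname resolves to."""
    addr = socket.gethostbyname(hostname)
    return addr, ipaddress.ip_address(addr).is_loopback


# Example, run on the master ("master" is the hostname seen in the logs):
#   addr, loopback = resolves_to_loopback("master")
#   print(addr, "loopback: fix /etc/hosts" if loopback else "looks routable")
```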

Answer 1 (score: 0)

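The problem was iptables. Most other things were fine. I simply followed the instructions at https://wiki.debian.org/iptables to fix the iptables rules, and it worked for me. You just need to know which ports Spark, Hadoop and so on will use. I opened 8080, 54310, 50070 and 7077 (the defaults for many Hadoop and Spark installations).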
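Why the original rules did not help can be illustrated with a small simulation. iptables evaluates a chain top to bottom and stops at the first terminating rule that matches; in the listing from the question, the ACCEPT rules for 8080 and 8081 sit after a catch-all REJECT, and their packet counters are 0. The sketch below is a simplified model, not real iptables: it ignores the fail2ban-ssh jump, and it assumes the browser traffic arrives on an interface named eth0 (the listing only shows that eth1 and lo are fully accepted).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    proto: str      # "tcp", "icmp", ...
    dport: int      # destination port
    in_iface: str   # incoming interface
    state: str      # "NEW", "ESTABLISHED" or "RELATED"


@dataclass
class Rule:
    target: str
    matches: Callable[[Packet], bool]
    label: str


def tcp_port(port):
    """Build a matcher for TCP packets to the given destination port."""
    return lambda p: p.proto == "tcp" and p.dport == port


def evaluate(chain, packet, policy="ACCEPT"):
    """Return (target, label) of the first matching rule, else the policy."""
    for rule in chain:
        if rule.matches(packet):
            return rule.target, rule.label
    return policy, "chain policy"


# INPUT chain in the order posted in the question (fail2ban jump omitted).
INPUT_AS_POSTED = [
    Rule("ACCEPT", lambda p: p.state in ("RELATED", "ESTABLISHED"), "state RELATED,ESTABLISHED"),
    Rule("ACCEPT", lambda p: p.proto == "icmp", "icmp"),
    Rule("ACCEPT", lambda p: p.in_iface == "lo", "in lo"),
    Rule("ACCEPT", lambda p: p.in_iface == "eth1", "in eth1"),
    Rule("ACCEPT", lambda p: p.proto == "tcp" and p.dport == 22 and p.state == "NEW", "tcp dpt:22"),
    Rule("REJECT", lambda p: True, "reject-with icmp-host-prohibited"),
    Rule("ACCEPT", tcp_port(8080), "tcp dpt:8080"),
    Rule("ACCEPT", tcp_port(8081), "tcp dpt:8081"),
]

# Same rules with the catch-all REJECT moved to the end of the chain.
INPUT_REORDERED = INPUT_AS_POSTED[:5] + INPUT_AS_POSTED[6:] + [INPUT_AS_POSTED[5]]

# A new browser connection to the master UI, arriving on the assumed eth0.
browser = Packet(proto="tcp", dport=8080, in_iface="eth0", state="NEW")

print(evaluate(INPUT_AS_POSTED, browser))   # ('REJECT', 'reject-with icmp-host-prohibited')
print(evaluate(INPUT_REORDERED, browser))   # ('ACCEPT', 'tcp dpt:8080')
```

In practice this means the ACCEPT rules for the Spark and Hadoop ports have to be inserted above the REJECT rule instead of being appended to the end of the chain.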