Apache Spark workers timing out

Date: 2015-12-29 19:30:00

Tags: java scala apache-spark

I have been running into problem after problem with Spark, and I believe it has something to do with networking, permissions, or both. Nothing in the master or worker logs, and no thrown errors, points to what the problem actually is.

15/12/29 19:19:58 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/29 19:20:13 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/29 19:20:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/29 19:20:43 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/29 19:20:58 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/29 19:21:11 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/8 is now EXITED (Command exited with code 1)
15/12/29 19:21:11 INFO SparkDeploySchedulerBackend: Executor app-20151229141057-0000/8 removed: Command exited with code 1
15/12/29 19:21:11 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 8
15/12/29 19:21:11 INFO AppClient$ClientEndpoint: Executor added: app-20151229141057-0000/10 on worker-20151229141026-127.0.0.1-48818 (127.0.0.1:48818) with 2 cores
15/12/29 19:21:11 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151229141057-0000/10 on hostPort 127.0.0.1:48818 with 2 cores, 1024.0 MB RAM
15/12/29 19:21:11 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/10 is now LOADING
15/12/29 19:21:11 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/10 is now RUNNING
15/12/29 19:21:12 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/9 is now EXITED (Command exited with code 1)
15/12/29 19:21:12 INFO SparkDeploySchedulerBackend: Executor app-20151229141057-0000/9 removed: Command exited with code 1
15/12/29 19:21:12 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 9
15/12/29 19:21:12 INFO AppClient$ClientEndpoint: Executor added: app-20151229141057-0000/11 on worker-20151229141023-127.0.0.1-35452 (127.0.0.1:35452) with 2 cores
15/12/29 19:21:12 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151229141057-0000/11 on hostPort 127.0.0.1:35452 with 2 cores, 1024.0 MB RAM
15/12/29 19:21:12 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/11 is now LOADING
15/12/29 19:21:12 INFO AppClient$ClientEndpoint: Executor updated: app-20151229141057-0000/11 is now RUNNING
15/12/29 19:21:13 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

I am trying to run a standalone setup with Spark 1.5.2 on Ubuntu 14.04. Everything appears to be configured correctly, but the job never completes and every worker times out.
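Since the warning is about resources, one thing worth noting: the application can explicitly cap what it asks for so the request stays within what the workers advertise (2 cores and 1024.0 MB each, per the log above). A minimal sketch, with illustrative values rather than my actual config:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("Simple App")
  .setMaster("spark://46.101.xxx.xxx:7077")
  .set("spark.executor.memory", "512m") // below the 1024.0 MB each worker offers
  .set("spark.cores.max", "2")          // no more than one worker's 2 cores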

[screenshot]

This is the remote machine I am running the job from...

[screenshot]

The code is just one of their examples. I have also tried the Pi estimation example and hit the same problem.

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/Users/user/spark.txt" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple App").setMaster("spark://46.101.xxx.xxx:7077")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    // Each count() triggers a job on the cluster
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}
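For reference, this is roughly how I submit it; the class name and jar path below are illustrative (assuming an sbt build) rather than my exact invocation:

./bin/spark-submit --class "SimpleApp" --master spark://46.101.xxx.xxx:7077 target/scala-2.10/simple-app_2.10-1.0.jar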

Has anyone run into this before? I would really appreciate any help getting it sorted out.

- EDIT - Additional information.

#spark-env.sh
export SPARK_LOCAL_IP="46.101.xxx.xxx"
export SPARK_MASTER_IP="46.101.xxx.xxx"
export SPARK_PUBLIC_DNS="46.101.xxx.xxx"
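
(For reference, I have not set any worker sizing variables; my understanding is that spark-env.sh also accepts the following, shown here with illustrative values rather than anything I have configured:)

export SPARK_WORKER_CORES=2    # cores each worker offers to applications
export SPARK_WORKER_MEMORY=1g  # memory each worker can allocate to executors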

Tried Java 7 & Java 8 with Scala 2.10.6 and 2.11.latest.

The master is started with ./start-master.sh and the worker is started with ./start-slave.sh spark://46.101.xxx.xxx:7077

Running on Ubuntu 14.04.3 LTS (DigitalOcean), with no firewall. I can telnet from the remote machine to both the master and the worker. The master and worker are on the same machine.

Tested with Spark 1.5.2 and 1.5.0. The Java, Scala, and Spark versions are kept identical between the client machine (submitting the job) and the remote server (master and worker).

1 Answer:

Answer 0 (score: 0):

It looks like your application cannot find any workers. When you started the cluster, did you start any slaves and connect them to the master?

To start your workers and connect them to the master, run the following command:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://ip:port

where spark://ip:port is the master's URL.
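
For example, with the master URL from the question (assuming the default port 7077):

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://46.101.xxx.xxx:7077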