I'm running into the problem mentioned in the title and I really don't know how to fix it. I've tried the solutions offered in many related answers, forums and so on, but I can't make it go away.
I have an EC2 Ubuntu 16 machine (~32 GB RAM, ~70 GB disk, 8 cores) running a standalone Spark master. My overall configuration is shown below.
spark-env.sh:
. . .
SPARK_PUBLIC_DNS=xx.xxx.xxx.xxx
SPARK_MASTER_PORT=7077
. . .
/etc/hosts:
127.0.0.1 locahost localhost.domain ubuntu
::1 locahost localhost.domain ubuntu
localhost master # master and slave have same ip
localhost slave # master and slave have same ip
I try to connect to it from IntelliJ IDEA with the following Scala code:
new SparkConf()
.setAppName("my-app")
.setMaster("spark://xx.xxx.xxx.xxx:7077")
.set("spark.executor.host", "xx.xxx.xxx.xxx")
.set("spark.executor.cores", "8")
.set("spark.executor.memory","20g")
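For context, this is roughly how that conf gets wired into a context on my side (a minimal sketch; the tiny count job is only an illustrative smoke test, not my real application):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-app")
  .setMaster("spark://xx.xxx.xxx.xxx:7077")
  .set("spark.executor.host", "xx.xxx.xxx.xxx")
  .set("spark.executor.cores", "8")
  .set("spark.executor.memory", "20g")

val sc = new SparkContext(conf)
// smoke test only: force a trivial job so executors actually get launched
println(sc.parallelize(1 to 100).count())
sc.stop()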
This configuration produces the following logs. master.log contains many lines like:
. . .
xx/xx/xx xx:xx:xx INFO Master: Removing executor app-xxxxxxxxxxxxxx-xxxx/xx because it is EXITED
xx/xx/xx xx:xx:xx INFO Master: Launching executor app-xxxxxxxxxxxxxx-xxxx/xx on worker worker-xxxxxxxxxxxxxx-127.0.0.1-42524
worker.log contains many lines like:
. . .
xx/xx/xx xx:xx:xx INFO Worker: Executor app-xxxxxxxxxxxxxx-xxxx/xxx finished with state EXITED message Command exited with code 1 exitStatus 1
xx/xx/xx xx:xx:xx INFO Worker: Asked to launch executor app-xxxxxxxxxxxxxx-xxxx/xxx for my-app
xx/xx/xx xx:xx:xx INFO SecurityManager: Changing view acls to: ubuntu
xx/xx/xx xx:xx:xx INFO SecurityManager: Changing modify acls to: ubuntu
xx/xx/xx xx:xx:xx INFO SecurityManager: Changing view acls groups to:
xx/xx/xx xx:xx:xx INFO SecurityManager: Changing modify acls groups to:
xx/xx/xx xx:xx:xx INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set()
xx/xx/xx xx:xx:xx INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-8-openjdk-amd64/jre//bin/java" "-cp" "/usr/local/share/spark/spark-2.1.1-bin-hadoop2.7/conf/:/usr/local/share/spark/spark-2.1.1-bin-hadoop2.7/jars/*" "-Xmx4096M" "-Dspark.driver.port=34889" "-Dspark.cassandra.connection.port=9042" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@127.0.0.1:34889" "--executor-id" "476" "--hostname" "127.0.0.1" "--cores" "1" "--app-id" "app-xxxxxxxxxxxxxx-xxxx" "--worker-url" "spark://Worker@127.0.0.1:42524"
If you like, here's a Gist with the log lines above.
If I try the following basic configuration instead, I get zero errors, but my application just hangs and the server does nothing at all, with no CPU/RAM usage.
new SparkConf()
.setAppName("my-app")
.setMaster("spark://xx.xxx.xxx.xxx:7077")
In /etc/hosts I set the master and the slave to the same IP. The Scala version on the server and in build.sbt is 2.11.6, and the Spark version on the server and in build.sbt is 2.1.1 (see the build.sbt sketch below).
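For reference, here is a minimal build.sbt sketch matching those versions (only the core Spark dependency is shown; anything else my real build pulls in is omitted):

// build.sbt (minimal sketch; only the core Spark dependency is shown)
name := "my-app"

scalaVersion := "2.11.6"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1"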
Here are some Spark UI screenshots:
So, I'm wondering: could this be a bad resource configuration? If not, what might be causing it? How should I adjust the configuration to avoid this kind of problem?
If you need more details, please ask.
Answer 0 (score: 0)
Since I want my personal computer to do the orchestration, I changed the configuration so that my machine acts as the master and the server acts as the executor.
So my conf/spark-env.sh becomes:
# Options read by executors and drivers running inside the cluster
SPARK_LOCAL_IP=localhost # to set the IP address Spark binds to on this node
SPARK_PUBLIC_DNS=xx.xxx.xxx.xxx #PUBLIC SERVER IP
conf/slaves:
# A Spark Worker will be started on each of the machines listed below.
xx.xxx.xxx.xxx #PUBLIC SERVER IP
/etc/hosts:
xx.xxx.xxx.xxx master #PUBLIC SERVER IP
xx.xxx.xxx.xxx slave #PUBLIC SERVER IP
Finally, the Scala configuration becomes:
.setMaster("local[*]")
.set("spark.executor.host", "xx.xxx.xxx.xxx") //Public Server IP
.set("spark.executor.memory","16g")
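Putting it together on the driver side, a minimal sketch of how I use this conf (same smoke-test pattern as above; the app name and the trivial job are only illustrative):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-app")
  .setMaster("local[*]")                        // orchestration stays on my machine
  .set("spark.executor.host", "xx.xxx.xxx.xxx") // public server IP
  .set("spark.executor.memory", "16g")

val sc = new SparkContext(conf)
println(sc.parallelize(1 to 100).count()) // quick sanity check
sc.stop()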