I occasionally hit the following error when submitting a job. If I delete the rootdir directories of filedao, datadao, and sqldao the error goes away, but that means I have to restart the job server and re-upload my jar.
{
  "status": "ERROR",
  "result": {
    "message": "Ask timed out on [Actor[akka://JobServer/user/context-supervisor/1995aeba-com.spmsoftware.distributed.job.TestJob#-1370794810]] after [10000 ms]. Sender[null] sent message of type \"spark.jobserver.JobManagerActor$StartJob\".",
    "errorClass": "akka.pattern.AskTimeoutException",
    "stack": ["akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)", "akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)", "scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)", "scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)", "scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)", "akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:331)", "akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:282)", "akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:286)", "akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:238)", "java.lang.Thread.run(Thread.java:745)"]
  }
}
My configuration file is as follows:
# Template for a Spark Job Server configuration file
# When deployed these settings are loaded when job server starts
#
# Spark Cluster / Job Server configuration
spark {
  # spark.master will be passed to each job's JobContext
  master = <spark_master>

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 4

  jobserver {
    port = 8090
    context-per-jvm = false
    context-creation-timeout = 100 s
    # Note: JobFileDAO is deprecated from v0.7.0 because of issues in
    # production and will be removed in future, now defaults to H2 file.
    jobdao = spark.jobserver.io.JobSqlDAO
    filedao {
      rootdir = /tmp/spark-jobserver/filedao/data
    }
    datadao {
      rootdir = /tmp/spark-jobserver/upload
    }
    sqldao {
      slick-driver = slick.driver.H2Driver
      jdbc-driver = org.h2.Driver
      rootdir = /tmp/spark-jobserver/sqldao/data
      jdbc {
        url = "jdbc:h2:file:/tmp/spark-jobserver/sqldao/data/h2-db"
        user = ""
        password = ""
      }
      dbcp {
        enabled = false
        maxactive = 20
        maxidle = 10
        initialsize = 10
      }
    }
    result-chunk-size = 1m
    short-timeout = 60 s
  }

  context-settings {
    num-cpu-cores = 2        # Number of cores to allocate. Required.
    memory-per-node = 512m   # Executor memory per node, -Xmx style eg 512m, 1G, etc.
  }
}

akka {
  remote.netty.tcp {
    # This controls the maximum message size, including job results, that can be sent
    # maximum-frame-size = 200 MiB
  }
}
# check the reference.conf in spray-can/src/main/resources for all defined settings
spray.can.server.parsing.max-content-length = 250m
I am using the spark-2.0-preview version.
Answer 0 (score: 0)
I ran into the same error before, and it was indeed related to the timeout: a synchronous request (sync=true) requires you to provide a timeout (in seconds) that is large enough relative to how long your request takes to process.
An example of such a request:
curl -k --basic -d '' 'http://localhost:5050/jobs?appName=app&classPath=Main&context=test-context&sync=true&timeout=40'
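If the job regularly needs longer than any sensible synchronous timeout, an alternative is to submit it asynchronously and poll for the result instead. A sketch assuming the standard job-server REST endpoints and the same hypothetical host, app, and class names as above:

# submit without sync=true; the call returns immediately with a jobId and status STARTED
curl -k --basic -d '' 'http://localhost:5050/jobs?appName=app&classPath=Main&context=test-context'
# poll with the returned jobId until the status is FINISHED; the result field then holds the output
curl -k --basic 'http://localhost:5050/jobs/<jobId>'

This sidesteps the HTTP-level timeout entirely, since no request ever has to wait for the job to complete.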
If your request needs more than 40 seconds, you may also need to modify the application.conf located at spark-jobserver-master/job-server/src/main/resources/application.conf. In the spray.can.server section, change:
idle-timeout = 210 s
request-timeout = 200 s
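For reference, a minimal sketch of how that section of application.conf might look after the change (the values are illustrative; pick them so they comfortably exceed the timeout you pass on the request):

spray.can.server {
  # how long an open connection may sit idle before being closed
  idle-timeout = 210 s
  # how long the server waits for a response to a single request
  request-timeout = 200 s
}

Note that spray requires idle-timeout to be strictly greater than request-timeout, which is why the two values above differ.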