Is the job server started correctly, and what explains the out-of-memory response?

Date: 2017-04-08 05:22:49

Tags: apache-spark spark-jobserver

I am using spark-jobserver-0.6.2-spark-1.6.1

(1) export OBSERVER_CONFIG=/custom-spark-jobserver-config.yml

(2) ./server_start.sh

Running the startup shell script above returned no errors. However, it created a pid file: spark-jobserver.pid

When I cat spark-jobserver.pid, the pid file shows pid = 126633

However, when I run

lsof -i:9999 | grep LISTEN

it shows

java 126634 spark 17u IPv4 189013362 0t0 TCP *:distinct (LISTEN)
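
To see how the two pids relate, you can compare the pid file against the process table; a small check using the numbers from this question:

cat spark-jobserver.pid    # prints 126633, the pid recorded by server_start.sh
ps -ef | grep 12663        # should show 126633 and any child process it forked

On many deployments the pid file records the launching process, and the JVM it forks (here 126634, one higher) is what actually binds the port, so a mismatch of this kind is normal rather than a sign of a failed start.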

I deployed my Scala application to the job server as shown below, and it returned OK

curl --data-binary @analytics_2.10-1.0.jar myhost:8090/jars/myservice
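
To confirm the upload actually registered, the same endpoint also supports a GET that lists the jars the server knows about; a quick sanity check:

curl myhost:8090/jars    # should list myservice with its upload timestamp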

Then I ran the following curl command to test the REST service deployed on the job server

curl -d" {data.label.index:15,data.label.field:ROOT_CAUSE,input.stri ng:[\" tt:获取操作。 \"]}" '为myhost:8090 /职位? APPNAME =为MyService&安培; CLASSPATH = com.test.Test与同步=真安培;超时= 400'

and I got the following out-of-memory response:

{   " status":" ERROR",   "结果":{     " errorClass":" java.lang.RuntimeException",     "原因":"无法创建新的原生线程",     " stack":[" java.lang.Thread.start0(Native Method)"," java.lang.Thread.start(Thread.java:714)&# 34;," org.spark-project.jetty.util.thread.QueuedThreadP ool.startThread(QueuedThreadPool.java:441)"," org.spark-project.jetty.util.thread .QueuedThreadPool.doStart(QueuedThreadPool.java:108)"," org.spark-pr oject.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)",&# 34; org.spark-project.jetty.util.component.AggregateLifeCycle.doStart(Ag gregateLifeCycle.java:81)"," org.spark-project.jetty.server.handler.AbstractHandler.doStart( AbstractHandler.java:58)" ;," org.spark-project.jetty.serve r.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)"," org.spark -project.jetty.server.Server.doStart(Server.java:282)"," org.spark-project.jetty .util.component.Abst ractLifeCycle.start(AbstractLifeCycle.java:64)"," org.apache.spark.ui.JettyUtils $ .org $ apache $ spark $ ui $ JettyUtils $$ connect $ 1(Jetty Utils.scala:252 )"," org.apache.spark.ui.JettyUtils $$ anonfun $ 5.apply(JettyUtils.scala:262)"," org.apache.spark.ui.JettyUtils $$ anonfun $ 5.apply(JettyUti ls.scala:262)"," org.apache.spark.util.Utils $$ anonfun $ startServiceOnPort $ 1.apply $ mcVI $ sp(Utils.scala:1988 )"," scala.collection.immutable.Range.foreac h $ mVc $ sp(Range.scala:141)"," org.apache.spark.util.Utils $ .startServiceOnPort(Utils.scala:1979)"," org.apache.spark.ui.JettyUtils $ .startJettyServer(Je ttyUtils.scala:262)"," org.apache .spark.ui.WebUI.bind(WebUI.scala:137)"," org.apache.spark.SparkContext $$ anonfun $ 13.apply(SparkContext.scala:481)",& #34; org.apache.spark.SparkContext $$ anonfun $ 13.apply(SparkContext.scala:481)"," scala.Option.foreach(Option.scala:236)"," org.apache.spark.SparkContext。(SparkContext.scala:481)"," spark.jobserver.context.DefaultSparkContextFactory $$ anon $ 1.(SparkContextFactory.scala:53)",&# 34; spark.jobserver.co ntext.DefaultSparkContextFactory.makeContext(SparkContextFactory.scala:53)"," spark.jobserver.context.DefaultSparkContextFactory.makeContext(SparkCon textFactory.scala:48)", " spark.jobserver.context.SparkContextFactory $ class.makeContext(SparkContextFactory.scala:37)"," spark.jobserver.context.Defau ltSparkContextFactory.makeContext(SparkContextFactory.scala:48)&# 34;," spark.jobserver.JobManagerActor.createContextFromConfig(JobManagerActor.scala:378)"," spark.jobserver.JobManagerActor $$ anonfun $ wrappedReceive $ 1.applyOrEl se(JobManagerActor.scala:122)"," scala.runtime.AbstractPartialFunction $ mcVL $ sp .apply $ mcVL $ sp(AbstractPartialFunction.scala:33)"," scala。 runtime.AbstractPartialFunction $ mcVL $ sp.apply(AbstractPartialFunction.scala:33)"," scala.ru ntime.AbstractPartialFunction $ mcVL $ sp.apply(AbstractPartialFunction.scala:25)",& #34; ooyala.common.akka.ActorStack $$ anonfun $ receive $ 1.applyOrElse(ActorSt ack.scala:33)"," scala.runtime.AbstractPartialFunction $ mcVL $ sp.apply $ mcVL $ sp (AbstractPartialFunction.scala:33)"," scala.runtime.AbstractPartialFuncti on $ mcVL $ sp.apply(AbstractPartialFunction.scala:33)"," scala.runtime.AbstractPartialFunction $ mcVL $ sp.apply(AbstractPartialFunction.scala:25)"," ooyala .common.akka.Slf4jLogging $$ anonfun $ receive $ 1 $$ anonfun $ applyOrElse $ 1.apply $ mcV $ sp(Slf4jLogging.scala :26)&#34 ;," ooyala.common.akka.Slf4jLogging $ class.ooya la $ common $ akka $ Slf4jLogging $$ withAkkaSourceLogging(Slf4jLogging.scala:35)"," ooyala.common.akka.Slf4jLogging $$ anonfun $获得$ 
1.applyOrElse(Slf4jLogg ing.scala:25)"," scala.runtime.AbstractPartialFunction $ mcVL $ sp.apply $ mcVL $ sp(AbstractPartialFunction.scala:33)&#34 ;," scala.runtime.AbstractPartialFuncti on $ mcVL $ sp.apply(AbstractPartialFunction.scala:33)"," scala.runtime.AbstractPartialFunction $ mcVL $ sp.apply(AbstractPartialFunction.scala: 25)"," ooyala .common.akka.ActorMetrics $$ anonfun $ receive $ 1.applyOrElse(ActorMetrics.scala:24)"," akka.actor.Actor $ class。 aroundReceive(Actor.scala:467)"," ooyala.co mmon.akka.InstrumentedActor.aroundReceive(InstrumentedActor.scala:8)"," akka.actor.ActorCell.receiveMessage (ActorCell.scala:516)&#34 ;, " akka.actor.ActorC ell.invoke(ActorCell.scala:487)"," akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)",&#34 ; akka.dispatch.Mailbox.run(Mailbox.scala:220)"," akka.di spatch.ForkJoinExecutorConfigurator $ AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)"," scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask .java:260)"," scala.concurrent.forkjoin.ForkJoinPool $ WorkQueue.runTask(ForkJoinPool.java:1339)",&# 34; scala.concurrent.forkjoin.ForkJoinPool.runWorker(Fo rkJoinPool.java:1979)" ;," scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)"],     " causeClass":" java.lang.OutOfMemoryError",     " message":" java.lang.OutOfMemoryError:无法创建新的本机线程"
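
Reading the trace from the top: the OutOfMemoryError is thrown in java.lang.Thread.start0 while the new SparkContext is binding its Jetty web UI, so the context dies during creation, before any job logic runs. This flavor of OOM usually means the OS refused to create another native thread, not that the heap is full. A quick diagnostic (my addition, not from the original post) is to compare the host's live thread count against the per-user limit of the account running the job server:

ps -eLf | wc -l    # rough count of threads currently alive on the host
ulimit -u          # max user processes/threads for the current user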

My questions

(1) Why is the process ID shown by lsof different from the one in the pid file, 126633 vs 126634?

(2) Why was spark-jobserver.pid created? Does that mean the Spark job server did not start correctly?

(3) How do I start the job server correctly?

(4) What is causing the out-of-memory error, and how do I fix it? Is it because I have not set the heap size or memory correctly?

1 Answer:

Answer 0 (score: 0)

  1. The jobserver binds to 8090, not 9999; that is probably the port whose process ID you should be looking up (a quick check follows this list).

  2. The spark-jobserver pid file is created for tracking purposes. It does not mean the job server failed to start correctly.

  3. You are starting spark-jobserver correctly.

  4. You can try increasing the value of JOBSERVER_MEMORY, which defaults to 1G (a sketch follows this list). Have you checked in the Spark UI whether the application started correctly?
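
For point 1, a quick check that mirrors the lsof command from the question:

lsof -i :8090 | grep LISTEN    # the jobserver's HTTP port; compare the pid shown here with spark-jobserver.pid

For point 4, a minimal sketch, assuming the stock spark-jobserver deploy scripts in which server_start.sh picks up JOBSERVER_MEMORY from the environment (the 4G value is just an example):

export JOBSERVER_MEMORY=4G    # default is 1G
./server_stop.sh              # stop the running instance first
./server_start.sh

If the error persists, revisit the thread-limit check shown after the stack trace above: "unable to create new native thread" usually points to a native-thread limit rather than an exhausted heap, and raising the memory alone will not cure that.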