We have a multi-node installation of Paxata with three nodes:
Paxata Core Server -> Paxata Core, Mongo, Web Server
Paxata Pipeline Server -> Pipeline, Spark Master
Paxata Spark Server -> Paxata Spark Worker nodes
With this setup, none of the processing steps in the UI ever seem to complete; everything just waits indefinitely.
The Pipeline server keeps emitting the following error:
2017-08-24 12:02:12.488 GMT+0800 WARN [task-result-getter-2] TaskSetManager - Lost task 0.0 in stage 1.0 (TID 6, SPARK01): java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:313)
at scala.None$.get(Option.scala:311)
at com.paxata.spark.cache.CacheManagerRemoteClientImpl.register(CacheManagerRemoteClient.scala:148)
at com.paxata.spark.cache.CacheManagerOnNodeImpl$$anonfun$getIndexCache$1.apply(CacheManagerOnNode.scala:220)
at com.paxata.spark.cache.CacheManagerOnNodeImpl$$anonfun$getIndexCache$1.apply(CacheManagerOnNode.scala:219)
at scala.Option.getOrElse(Option.scala:120)
at com.paxata.spark.cache.CacheManagerOnNodeImpl.getIndexCache(CacheManagerOnNode.scala:219)
at com.paxata.spark.PaxBootstrap$.init(PaxBootstrap.scala:13)
at org.apache.spark.ExecutorActor$$anonfun$4.apply(SimpleSparkContext.scala:286)
at org.apache.spark.ExecutorActor$$anonfun$4.apply(SimpleSparkContext.scala:285)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Answer (score: 0)
It turns out that the Paxata Pipeline server assumes exactly one Spark worker instance per Paxata Spark Server node. In our setup we were running multiple Spark workers on the Paxata Spark Server. Once we changed the configuration to reduce the number of worker instances to one, the error went away.
The following needs to be changed:
$ cd /usr/local/paxata/spark/conf/
$ vi spark-env.sh
Then allocate all of the node's available memory and cores to the single worker instance:
SPARK_WORKER_CORES=2        # all cores available on the Spark worker node
SPARK_WORKER_MEMORY=10g     # all memory available on the Spark worker node
SPARK_WORKER_INSTANCES=1    # Paxata Pipeline expects exactly one worker instance per node
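For the new settings to take effect, the Spark worker has to be restarted. As a rough sketch only, assuming the bundled Spark standalone scripts live under /usr/local/paxata/spark/sbin and the Spark master runs on the Pipeline server (called PIPELINE01 here purely for illustration), something like this should do it:
$ cd /usr/local/paxata/spark/sbin
$ ./stop-slave.sh                          # stop all worker instances on this node
$ ./start-slave.sh spark://PIPELINE01:7077 # start a single worker pointing at the Spark master
Afterwards the Spark master UI should show exactly one worker registered for the node, with all of its cores and memory.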