Paxata installation problem

Posted: 2017-08-24 07:20:38

Tags: apache-spark

We have a multi-node installation of Paxata, with three nodes:

Paxata Core Server -> Paxata Core, Mongo, Web Server
Paxata Pipeline Server -> Pipeline, Spark Master
Paxata Spark Server -> Paxata Spark worker node

With this setup, none of the processing steps in the UI ever seems to complete; everything waits indefinitely.

The Pipeline server keeps emitting the following error:

2017-08-24 12:02:12.488 GMT+0800 WARN  [task-result-getter-2] TaskSetManager - Lost task 0.0 in stage 1.0 (TID 6, SPARK01): java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at com.paxata.spark.cache.CacheManagerRemoteClientImpl.register(CacheManagerRemoteClient.scala:148)
    at com.paxata.spark.cache.CacheManagerOnNodeImpl$$anonfun$getIndexCache$1.apply(CacheManagerOnNode.scala:220)
    at com.paxata.spark.cache.CacheManagerOnNodeImpl$$anonfun$getIndexCache$1.apply(CacheManagerOnNode.scala:219)
    at scala.Option.getOrElse(Option.scala:120)
    at com.paxata.spark.cache.CacheManagerOnNodeImpl.getIndexCache(CacheManagerOnNode.scala:219)
    at com.paxata.spark.PaxBootstrap$.init(PaxBootstrap.scala:13)
    at org.apache.spark.ExecutorActor$$anonfun$4.apply(SimpleSparkContext.scala:286)
    at org.apache.spark.ExecutorActor$$anonfun$4.apply(SimpleSparkContext.scala:285)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

1 Answer:

Answer 0 (score: 0)

It turns out that the Paxata Pipeline server assumes exactly one Spark worker instance per Paxata Spark Server node. In our setup, we were running multiple Spark workers on the Paxata Spark Server. Once we changed the configuration to reduce the number of worker instances per node to one, the error went away.
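
To confirm that this is the situation before touching any configuration, you can count the worker JVMs actually running on the Paxata Spark Server. This is a minimal check, assuming a standard Spark standalone deployment where each worker runs as its own org.apache.spark.deploy.worker.Worker process:

# On the Paxata Spark Server: count running standalone worker JVMs.
# A result greater than 1 means multiple worker instances on this node.
$ ps -ef | grep -c '[o]rg.apache.spark.deploy.worker.Worker'

(The bracketed first character keeps grep from counting its own command line.)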

The following changes are needed:

$ cd /usr/local/paxata/spark/conf/
$ vi spark-env.sh

Then allocate all of the node's available memory and cores to the single worker:

# Give the single worker all of the node's spare cores and memory...
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=10g
# ...and run exactly one worker per node, as the Pipeline server expects.
SPARK_WORKER_INSTANCES=1
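
For the new settings to take effect, the worker has to be restarted. The following is a sketch that assumes Paxata ships a standard Spark standalone layout under /usr/local/paxata/spark and that the Spark master runs on the Pipeline server (PIPELINE_HOST is a placeholder for that hostname; 7077 is the default standalone master port):

$ cd /usr/local/paxata/spark
# Stop the worker(s) started from this installation, then start a
# single worker registered against the Spark master (placeholder host).
$ ./sbin/stop-slave.sh
$ ./sbin/start-slave.sh spark://PIPELINE_HOST:7077

Once the worker is back up, the Spark master web UI (port 8080 by default) should list exactly one worker for the node, advertising all of the configured cores and memory.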