Hive on Spark: insufficient resources

Time: 2019-06-25 23:56:02

Tags: apache-spark hive

I followed this tutorial and configured Spark as the execution engine for Hive. However, even the simplest query hangs.
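For context, the tutorial boils down to pointing Hive at the Spark cluster with settings like these (spark://localhost:7077 is just the master URL of my local standalone cluster):

set hive.execution.engine=spark;
set spark.master=spark://localhost:7077;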

I tried:

set spark.executor.memory=1g;
set spark.driver.memory=1g;
set spark.yarn.executor.memoryOverhead=2048;
set spark.yarn.driver.memoryOverhead=1024;

But this did not solve my problem.
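As far as I can tell, the "Initial job has not accepted any resources" warning below means no worker can satisfy the executor's core/memory request, so a smaller request that fits my single 1-core worker might help, e.g. (these are standard Spark properties):

set spark.executor.cores=1;
set spark.cores.max=1;
set spark.executor.memory=512m;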

In the Spark master console I can see: Alive Workers: 1; Cores in use: 1 Total, 1 Used; Memory: 6.8 GB (2 GB used).

Strangely, the running application shows 0 cores.
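To rule out a typo, the effective values can be checked from the same Hive session; `set` with just a key prints its current value:

set spark.executor.memory;
set spark.executor.cores;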

Any idea what could solve my problem?

    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Spark Plan !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
19/06/25 23:49:13 INFO exec.Utilities: PLAN PATH = hdfs://localhost:9000/tmp/hive/alex/d3bc099b-19db-461c-aea6-f29d1c6b05a5/hive_2019-06-25_23-48-37_916_8355031693669842913-1/-mr-10002/23946a9b-f8e0-463b-b92b-61bbb4807b66/map.xml
19/06/25 23:49:13 INFO exec.SerializationUtilities: Deserializing MapWork using kryo
19/06/25 23:49:13 INFO exec.Utilities: Deserialized plan (via FILE) - name: Map 1 size: 4.14KB
19/06/25 23:49:13 INFO io.CombineHiveInputFormat: Total number of paths: 1, launching 1 threads to check non-combinable ones.
19/06/25 23:49:13 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://localhost:9000/tmp/hive/alex/d3bc099b-19db-461c-aea6-f29d1c6b05a5/_tmp_space.db/Values__Tmp__Table__1; using filter path hdfs://localhost:9000/tmp/hive/alex/d3bc099b-19db-461c-aea6-f29d1c6b05a5/_tmp_space.db/Values__Tmp__Table__1
19/06/25 23:49:13 INFO input.FileInputFormat: Total input paths to process : 1
19/06/25 23:49:13 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
19/06/25 23:49:13 INFO io.CombineHiveInputFormat: number of splits 1
19/06/25 23:49:13 INFO io.CombineHiveInputFormat: Number of all splits 1
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Got job 0 (foreachAsync at RemoteHiveSparkClient.java:351) with 1 output partitions
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (foreachAsync at RemoteHiveSparkClient.java:351)
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Parents of final stage: List()
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Missing parents: List()
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at mapPartitionsToPair at MapTran.java:40), which has no missing parents
19/06/25 23:49:13 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 255.5 KB, free 412.5 MB)
19/06/25 23:49:13 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 76.6 KB, free 412.4 MB)
19/06/25 23:49:13 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.0.95:38407 (size: 76.6 KB, free: 413.8 MB)
19/06/25 23:49:13 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
19/06/25 23:49:13 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at mapPartitionsToPair at MapTran.java:40) (first 15 tasks are for partitions Vector(0))
19/06/25 23:49:13 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
19/06/25 23:49:28 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
19/06/25 23:49:43 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

I am using hive-2.3.0 and spark 2.2.0, which should be compatible according to the documentation. I also tried the latest Hive version, but that did not work either.
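In case the wrong Spark build is being picked up, Hive locates the Spark installation via the spark.home property (or the SPARK_HOME environment variable), so it may be worth pinning it explicitly; the path below is just an example from my machine:

set spark.home=/usr/local/spark-2.2.0-bin-hadoop2.7;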

0 Answers:

No answers yet.