spark-sql hangs at BlockManagerInfo: Added broadcast_0_piece0 while running a SQL statement

Date: 2017-09-07 06:28:36

Tags: apache-spark apache-spark-sql

The output is as follows, and it hangs on the last line:

17/09/07 06:01:35 INFO ClientCnxn: Socket connection established to 10.0.0.193/10.0.0.193:2181, initiating session  
17/09/07 06:01:35 INFO ClientCnxn: Session establishment complete on server 10.0.0.193/10.0.0.193:2181, sessionid = 0x15e4bc9518103cc, negotiated timeout = 40000  
17/09/07 06:01:35 INFO RegionSizeCalculator: Calculating region sizes for table "event_data".  
17/09/07 06:01:35 INFO SparkContext: Starting job: processCmd at CliDriver.java:376  
17/09/07 06:01:36 INFO DAGScheduler: Got job 0 (processCmd at CliDriver.java:376) with 1 output partitions  
17/09/07 06:01:36 INFO DAGScheduler: Final stage: ResultStage 0 (processCmd at CliDriver.java:376)  
17/09/07 06:01:36 INFO DAGScheduler: Parents of final stage: List()  
17/09/07 06:01:36 INFO DAGScheduler: Missing parents: List()  
17/09/07 06:01:36 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at processCmd at CliDriver.java:376), which has no missing parents  
17/09/07 06:01:36 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 16.6 KB, free 414.1 MB)  
17/09/07 06:01:36 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 8.8 KB, free 414.1 MB)  
17/09/07 06:01:36 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.0.0.199:43329 (size: 8.8 KB, free: 414.4 MB)  
17/09/07 06:01:36 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1007  
17/09/07 06:01:36 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at processCmd at CliDriver.java:376) (first 15 tasks are for partitions Vector(0))  
17/09/07 06:01:36 INFO YarnScheduler: Adding task set 0.0 with 1 tasks  
17/09/07 06:01:37 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)  
17/09/07 06:01:42 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.0.0.248:55616) with ID 1  
17/09/07 06:01:42 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 1)  
17/09/07 06:01:42 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-0-0-248.cn-north-1.compute.internal, executor 1, partition 0, RACK_LOCAL, 5053 bytes)  
17/09/07 06:01:42 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-0-0-248.cn-north-1.compute.internal:34192 with 2.8 GB RAM, BlockManagerId(1, ip-10-0-0-248.cn-north-1.compute.internal, 34192, None)  
17/09/07 06:01:42 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-0-0-248.cn-north-1.compute.internal:34192 (size: 8.8 KB, free: 2.8 GB)  
17/09/07 06:01:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-0-0-248.cn-north-1.compute.internal:34192 (size: 28.8 KB, free: 2.8 GB)  

Spark SQL connects to Hive, and the table named event_data is an external table stored in HBase.
Also, if I query a plain Hive table (one not backed by HBase), for example select count(*) from mytest01, it succeeds.
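Since only the HBase-backed table hangs (and the log stalls right after "Calculating region sizes"), one way to narrow this down is to verify that the driver and executor hosts can actually reach ZooKeeper and the HBase region servers over TCP. Below is a minimal, hypothetical connectivity probe; the `can_connect` helper is my own, and the host/port pairs are simply the endpoints that appear in the logs above:

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Endpoints taken from the log output above:
    # ZooKeeper (10.0.0.193:2181) and the driver's block manager (10.0.0.199:43329).
    for host, port in [("10.0.0.193", 2181), ("10.0.0.199", 43329)]:
        status = "reachable" if can_connect(host, port) else "UNREACHABLE"
        print(f"{host}:{port} {status}")
```

If ZooKeeper is reachable from the driver but the region servers are not reachable from the executor nodes (security groups on the 10.0.0.248 host, for instance), the Spark task will sit in RUNNING while the HBase client silently retries, which matches the symptom here.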

Sometimes it gets stuck at BlockManagerInfo: Removed instead:

17/09/07 06:31:18 INFO ContextCleaner: Cleaned accumulator 1  
17/09/07 06:31:18 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.0.0.199:43329 in memory (size: 28.8 KB, free: 414.4 MB)  
17/09/07 06:31:18 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ip-10-0-0-248.cn-north-1.compute.internal:34192 in memory (size: 28.8 KB, free: 2.8 GB)  

How can I solve this problem? Thanks.

0 Answers:

There are no answers.