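Let's say we have 2 Hive tables, tableA and tableB. I am exploding tableA, joining it with a few other tables, and then inserting the result into tableB.
The insert works fine when tableB has no partitions, or when the insert uses static partitions.
However, with dynamic partitioning, the MapReduce job does not even start. It appears to hang.
To debug further, I set the following parameter when starting Hive: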
-hiveconf hive.root.logger=DEBUG,console
Now I can see that the job is not actually hung. It keeps printing logs like:
........
16/02/11 09:25:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:25:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2139 and EX_2140 as parent of FS_68 and child of EX_2138
16/02/11 09:25:55 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:25:55 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2141 and EX_2142 as parent of FS_68 and child of EX_2140
16/02/11 09:25:59 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:25:59 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2143 and EX_2144 as parent of FS_68 and child of EX_2142
16/02/11 09:26:03 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:03 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2145 and EX_2146 as parent of FS_68 and child of EX_2144
16/02/11 09:26:08 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:08 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2147 and EX_2148 as parent of FS_68 and child of EX_2146
16/02/11 09:26:12 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:12 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2149 and EX_2150 as parent of FS_68 and child of EX_2148
16/02/11 09:26:17 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:17 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2151 and EX_2152 as parent of FS_68 and child of EX_2150
16/02/11 09:26:19 [Thread-5]: INFO metrics.MetricsSaver: Saved 8:22 records to /mnt/var/em/raw/i-63eec5e6_20160211_RunJar_14276_raw.bin
16/02/11 09:26:21 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:21 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2153 and EX_2154 as parent of FS_68 and child of EX_2152
16/02/11 09:26:26 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:26 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2155 and EX_2156 as parent of FS_68 and child of EX_2154
16/02/11 09:26:30 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:30 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2157 and EX_2158 as parent of FS_68 and child of EX_2156
16/02/11 09:26:35 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:35 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2159 and EX_2160 as parent of FS_68 and child of EX_2158
16/02/11 09:26:40 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:40 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2161 and EX_2162 as parent of FS_68 and child of EX_2160
16/02/11 09:26:45 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:45 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2163 and EX_2164 as parent of FS_68 and child of EX_2162
16/02/11 09:26:49 [Thread-5]: INFO metrics.MetricsSaver: Saved 8:22 records to /mnt/var/em/raw/i-63eec5e6_20160211_RunJar_14276_raw.bin
16/02/11 09:26:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2165 and EX_2166 as parent of FS_68 and child of EX_2164
16/02/11 09:26:56 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
16/02/11 09:26:56 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2167 and EX_2168 as parent of FS_68 and child of EX_2166
..............
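These logs keep printing forever! Without dynamic partitioning, however, the complete insert query finishes successfully in about 10 minutes.
Also, the dynamic partition column has only 3 distinct values across the whole table, so this is not a case of choosing an unsuitable column for dynamic partitioning.
So:
What do the logs being printed mean?
What kind of optimization or remedy does this situation need?
Thanks a lot in advance for any help!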
Answer 0 (score: 1)
Set the following parameter:
SET hive.optimize.sort.dynamic.partition=false
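My Hive version is 0.13.1. Quoting the Apache wiki on this parameter:
hive.optimize.sort.dynamic.partition
Default Value: true in Hive 0.13.0 and 0.13.1; false in Hive 0.14.0 and later (HIVE-8151)
Added In: Hive 0.13.0 with HIVE-6455
When enabled, the dynamic partitioning column will be globally sorted. This way we can keep only one record writer open for each partition value in the reducer, thereby reducing the memory pressure on reducers.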
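For illustration, here is a rough sketch of how the setting might sit in a session ahead of a dynamic partition insert. The column names (id, items, part_col) are hypothetical, since the question does not show the actual schema or query, and the joins with the other tables are left out for brevity.

```sql
-- Turn off the sorted dynamic partition optimization
-- (it defaults to true in Hive 0.13.0 and 0.13.1).
SET hive.optimize.sort.dynamic.partition=false;

-- Usual session settings for a fully dynamic partition insert.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical shape of the insert: the column names are assumptions,
-- and the joins mentioned in the question are omitted.
-- The dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE tableB PARTITION (part_col)
SELECT a.id,
       item,
       a.part_col
FROM tableA a
LATERAL VIEW explode(a.items) exploded AS item;
```

With the optimization off, each reducer may keep one record writer open per partition value it sees. With only 3 distinct partition values, as in the question, that extra memory use should be small.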
Thanks.