We are using Hive 1.3.1 and running an INSERT
statement to load data from HDFS into Hive:
explain insert OVERWRITE table managedtable PARTITION(col1) select [columns] from externaltable
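I noticed that the execution plan for the same table changed between yesterday and today. Yesterday the plan resulted in an M/R job with 341 mappers and 359 reducers, while today the plan results in an M/R job with mappers only and no reducers.
I am really confused.
These are the plans (column lists omitted, since the table has more than 300 columns).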
OK
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
Stage-2 depends on stages: Stage-0
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: externalevents
Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
Select Operator
[columns]
outputColumnNames: [columns]
Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col394 (type: bigint)
sort order: +
Map-reduce partition columns: _col394 (type: bigint)
Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
value expressions: [columns]
Reduce Operator Tree:
Select Operator
expressions: [columns]
outputColumnNames: [columns]
Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: ceazip.events_test_hive
Stage: Stage-0
Move Operator
tables:
partition:
evtf_first_date_id
replace: true
table:
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: ceazip.events_test_hive
Stage: Stage-2
Stats-Aggr Operator
And the 2nd plan:
Stage-1 is a root stage
Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
Stage-4
Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
Stage-2 depends on stages: Stage-0
Stage-3
Stage-5
Stage-6 depends on stages: Stage-5
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: events_2017_02_21_11_13_18
Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
Select Operator
Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: default.events_test1
Stage: Stage-7
Conditional Operator
Stage: Stage-4
Move Operator
files:
hdfs directory: true
destination: hdfs://isr-r0-aps-nam-1.lab.il.nice.com:8020/apps/hive/warehouse/events_test1/.hive-staging_hive_2017-03-01_07-35-18_776_4958999242494325333-1/-ext-10000
Stage: Stage-0
Move Operator
tables:
partition:
evtf_first_date_id
replace: true
table:
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: default.events_test1
Stage: Stage-2
Stats-Aggr Operator
Stage: Stage-3
Merge File Operator
Map Operator Tree:
ORC File Merge Operator
merge level: stripe
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
Stage: Stage-5
Merge File Operator
Map Operator Tree:
ORC File Merge Operator
merge level: stripe
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
Stage: Stage-6
Move Operator
files:
hdfs directory: true
destination: hdfs://isr-r0-aps-nam-1.lab.il.nice.com:8020/apps/hive/warehouse/events_test1/.hive-staging_hive_2017-03-01_07-35-18_776_4958999242494325333-1/-ext-10000
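To narrow this down, it may help to print the session settings in effect right before each EXPLAIN and compare them between the two runs. The snippet below is only a sketch: the properties listed are standard Hive settings that can influence whether a dynamic-partition INSERT gets a reduce stage or file-merge stages, but whether any of them actually differs here is an assumption to check, not a confirmed cause. Note also that the two plans as pasted name different source aliases and target tables (ceazip.events_test_hive vs. default.events_test1), so the table definitions are worth comparing as well.

```sql
-- Running SET with a property name and no value prints its current value.
-- Which of these (if any) changed between the two runs is an assumption
-- to verify, not something the plans above confirm.

-- Sorting by the dynamic partition column adds a reduce stage keyed on it.
SET hive.optimize.sort.dynamic.partition;

-- Small-file merging adds the conditional move/merge stages after the job.
SET hive.merge.mapfiles;
SET hive.merge.mapredfiles;
SET hive.merge.orcfile.stripe.level;

-- General dynamic-partition and reducer-sizing settings.
SET hive.exec.dynamic.partition.mode;
SET hive.exec.reducers.bytes.per.reducer;
```

Running the same statements in both environments (or in the same session on both days) and diffing the output should show whether a configuration change lines up with the change in plan shape.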
Thanks,
Lior