Impala query execution order

Time: 2016-11-07 22:04:02

Tags: impala

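I ran EXPLAIN on an Impala query and got the result below. I'm trying to understand it: is the execution order bottom-up? And if the numbers aren't the execution order, what do they mean? Thanks!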

    Estimated Per-Host Requirements: Memory=2.08GB VCores=2
    WARNING: The following tables are missing relevant table and/or column statistics.
    my_db.v1, my_db.v2

    10:EXCHANGE [UNPARTITIONED]
    |
    06:ANALYTIC
    |  functions: last_value(my_v_id), last_value(my__arrival_ts), last_value(version)
    |  partition by: id, trunc(my__arrival_ts, 'D')
    |  order by: my__arrival_ts ASC
    |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    |
    05:SORT
    |  order by: id ASC NULLS FIRST, trunc(my__arrival_ts, 'D') ASC NULLS FIRST, my__arrival_ts ASC
    |
    09:EXCHANGE [HASH(id,trunc(my__arrival_ts, 'D'))]
    |
    04:ANALYTIC
    |  functions: last_value(build)
    |  partition by: version
    |  order by: my__arrival_day ASC
    |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    |
    03:SORT
    |  order by: version ASC NULLS FIRST, my__arrival_day ASC
    |
    08:EXCHANGE [HASH(version)]
    |
    02:HASH JOIN [INNER JOIN, BROADCAST]
    |  hash predicates: v1__fk = v1.id
    |  runtime filters: RF000 <- v1.id
    |
    |--07:EXCHANGE [BROADCAST]
    |  |
    |  00:SCAN HDFS [my_db.v1]
    |     partitions=1791/2994 files=1956 size=125.30MB
    |     predicates: my__is_external
    |
    01:SCAN HDFS [my_db.vm]
       partitions=2058/2058 files=2094 size=9.98GB
       runtime filters: RF000 -> v1__fk

1 answer:

Answer 0 (score: 0)

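The numbers are just the PlanNodeId values the planner assigns to each node; they are labels, not an execution order. Logically, execution proceeds bottom-up: the scans at the bottom feed the join, and rows flow upward to the final exchange. At runtime, though, the whole plan tree is split into multiple plan fragments, which a coordinator distributes across the cluster and which run concurrently. To watch execution as it happens, look at the query profile in Impala's debug web UI, which listens on port 25000 by default.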

Running set explain_level=3; before EXPLAIN gives you the full plan broken down by fragment.