Hive LATERAL VIEW EXPLODE with a WHERE clause: which runs first?

Time: 2017-05-09 04:03:15

Tags: hadoop hive hiveql hadoop2

I am trying to understand whether the WHERE clause runs before or after LATERAL VIEW EXPLODE in Hive.

For example, if I have

    SELECT *
    FROM
      (
        SELECT
            a1,
            a2,
            b.ds,
            conv_list.threshold_conv[0] AS t
        FROM
            t1 b
        LATERAL VIEW EXPLODE({list}) conv_list AS threshold_conv
        WHERE
            b.ds BETWEEN '{DATE-29}' AND '{DATE}'
      )

will the ds filter run before or after the lateral view explode?

1 answer:

Answer 0: (score: 2)

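  • If the column you filter on is a partition column of the table, the filter is applied when the table is scanned, which is the main purpose of partitioning. This holds even when the WHERE clause sits outside the subquery (predicate pushdown).
  • A lateral view can be an expensive operation, so Hive applies the filter before applying the lateral view. See the execution plan below, which comes from a query similar to yours but not identical.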

    STAGE PLANS:
      Stage: Stage-1
        Map Reduce
          Map Operator Tree:
              TableScan
                alias: a
                filterExpr: ((mycolumndpartitioned > 0) and (mycolumn= 112623934)) (type: boolean)
                Statistics: Num rows: 23953585 Data size: 52793067242 Basic stats: COMPLETE Column stats: NONE
                Filter Operator
                  predicate: (mycolumn= 112623934) (type: boolean)
                  Statistics: Num rows: 11976792 Data size: 26396532519 Basic stats: COMPLETE Column stats: NONE
                  Lateral View Forward
                    Statistics: Num rows: 11976792 Data size: 26396532519 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 11976792 Data size: 26396532519 Basic stats: COMPLETE Column stats: NONE
                      Lateral View Join Operator
                        outputColumnNames: _col13
                        Statistics: Num rows: 23953584 Data size: 52793065038 Basic stats: COMPLETE Column stats: NONE
                        Select Operator
                          expressions: _col13.myArray (type: string)
                          outputColumnNames: _col0
                          Statistics: Num rows: 23953584 Data size: 52793065038 Basic stats: COMPLETE Column stats: NONE
                          File Output Operator
                            compressed: false
                            Statistics: Num rows: 23953584 Data size: 52793065038 Basic stats: COMPLETE Column stats: NONE
    
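  • Now, if your filter uses a field from the exploded array, I think Hive will first try to apply every filter that does not use any column from the exploded data, then apply the lateral view, and only then apply the filters on the exploded data.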
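
One way to check this on your own data is to run EXPLAIN on a query that mixes both kinds of predicate. The sketch below is only illustrative: the table `events`, its columns `id` and `tags`, the partition column `ds`, and the literal values are all hypothetical and not taken from the question.

```sql
-- Hypothetical table, assumed for illustration only:
--   events(id INT, tags ARRAY<STRING>) PARTITIONED BY (ds STRING)
EXPLAIN
SELECT e.id, t.tag
FROM events e
LATERAL VIEW EXPLODE(e.tags) t AS tag
WHERE e.ds BETWEEN '2017-04-10' AND '2017-05-09'  -- uses only base-table (partition) columns
  AND t.tag = 'promo';                            -- uses the exploded column
```

If the behavior described above holds, the `ds` predicate should show up at or near the TableScan (or be resolved by partition pruning and not appear as a separate Filter Operator at all), while the `tag` predicate should appear in a Filter Operator below the Lateral View Join Operator. The exact plan depends on the Hive version and on settings such as `hive.optimize.ppd`, so treat the plan from your own cluster as the authority.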