Duplicate column name in SQL (Impala)

Time: 2018-02-14 23:47:14

Tags: sql where-clause impala partition

My SQL below returns the error AnalysisException: Duplicate column name: all_periods_int

DROP TABLE IF EXISTS temp_nielsen_other_upc_all_markets_test;

CREATE TABLE IF NOT EXISTS temp_nielsen_other_upc_all_markets_test
PARTITIONED BY (all_periods_int)
STORED AS PARQUET
AS
SELECT          
                    COALESCE(gap.all_markets,fct.all_markets) AS all_markets,
                    COALESCE(gap.all_periods, fct.all_periods) AS all_periods,
                    NVL(gap.dollar,0) - NVL(fct.dollar,0) AS dollar,
                    (CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),7,2) AS int)+2000)*10000+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),1,2) AS int)*100+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),4,2) AS int) AS all_periods_int
FROM                temp_nielsen_gap_total_all_markets_raw gap
FULL OUTER JOIN(

                    SELECT
                                        all_markets,
                                        all_periods,
                                        SUM(dollar) AS dollar
                    FROM                temp_nielsen_sku_facts_all_markets_raw
                    GROUP BY            all_markets,
                                        all_periods
                ) fct
ON(                 gap.all_markets = fct.all_markets AND
                    gap.all_periods = fct.all_periods
    )
WHERE           ABS(NVL(gap.dollar,0) - NVL(fct.dollar,0)) > 1 AND
                fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw);

all_periods_int is created by this SQL and does not exist in either of the two underlying tables.
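To make the derived column concrete, here is a minimal worked sketch of the same expression applied to a single literal. It assumes all_periods is a string shaped like 'MM/DD/YY' (inferred from the SUBSTR positions in the query); the sample value '02/14/18' is made up for illustration and is not data from the tables.

```sql
-- Assumption: all_periods looks like 'MM/DD/YY'; '02/14/18' is a hypothetical sample.
SELECT (CAST(SUBSTR('02/14/18', 7, 2) AS INT) + 2000) * 10000  -- year:  2018 * 10000 = 20180000
     + CAST(SUBSTR('02/14/18', 1, 2) AS INT) * 100             -- month: 2 * 100 = 200
     + CAST(SUBSTR('02/14/18', 4, 2) AS INT)                   -- day:   14
       AS all_periods_int;                                     -- 20180214
```

Under that assumption the result is an integer of the form YYYYMMDD, which is what the table is then partitioned on.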

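Another odd thing is that both of the following run fine:

  1. Running the full statement without the second condition in the WHERE clause, fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw);

  2. Running only the CREATE TABLE statement, without the SELECT statement.

I don't see anywhere that all_periods_int would be duplicated.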

1 Answer:

Answer 0: (score: -1)

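So the problem is that the partition variable can only be referenced in the PARTITIONED BY clause and not in the SELECT statement. However, since you generate the variable in the SELECT, you need to nest that step: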

DROP TABLE IF EXISTS temp_nielsen_other_upc_all_markets_test;

CREATE TABLE IF NOT EXISTS temp_nielsen_other_upc_all_markets_test
PARTITIONED BY (all_periods_int)
STORED AS PARQUET
AS
SELECT
                    all_markets,
                    all_periods,
                    dollar              -- the inner query aliases this column as dollar
FROM
(
SELECT          
                    COALESCE(gap.all_markets,fct.all_markets) AS all_markets,
                    COALESCE(gap.all_periods, fct.all_periods) AS all_periods,
                    NVL(gap.dollar,0) - NVL(fct.dollar,0) AS dollar,
                    (CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),7,2) AS int)+2000)*10000+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),1,2) AS int)*100+CAST(SUBSTR(COALESCE(gap.all_periods,fct.all_periods),4,2) AS int) AS all_periods_int
FROM                temp_nielsen_gap_total_all_markets_raw gap
FULL OUTER JOIN(

                    SELECT
                                        all_markets,
                                        all_periods,
                                        SUM(dollar) AS dollar
                    FROM                temp_nielsen_sku_facts_all_markets_raw
                    GROUP BY            all_markets,
                                        all_periods
                ) fct
ON(                 gap.all_markets = fct.all_markets AND
                    gap.all_periods = fct.all_periods
    )
WHERE           ABS(NVL(gap.dollar,0) - NVL(fct.dollar,0)) > 1 AND
                fct.all_markets in (SELECT DISTINCT all_markets FROM temp_nielsen_gap_total_all_markets_raw)) t_final;
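For comparison, the Impala documentation describes partitioned CREATE TABLE AS SELECT as naming the partition key columns in PARTITIONED BY without types, and listing those same columns last in the select list. A minimal sketch of that shape follows; the names sales_raw, sales_by_day, market, dollar and day_int are hypothetical and not from the question.

```sql
-- Hypothetical names throughout: sales_raw, sales_by_day, market, dollar, day_int.
CREATE TABLE sales_by_day
PARTITIONED BY (day_int)          -- no type here; it is taken from the select list
STORED AS PARQUET
AS
SELECT market,
       dollar,
       day_int                    -- partition key column listed last
FROM   sales_raw;
```

If that rule applies to the nested query above, the outer SELECT would likely also need to end with all_periods_int; this is worth checking against the Impala version in use.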