Hive - INSERT OVERWRITE with a WITH clause

Time: 2016-07-07 12:54:21

Tags: hadoop hive

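I have a generated query that starts with a WITH clause. It works fine when I run it in the console, but it fails when I try to run it with INSERT OVERWRITE to load the output into a separate Hive table: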

INSERT OVERWRITE TABLE $proc_db.$master_table PARTITION(created_dt, country) $master_query

It throws the following error:

cannot recognize input near 'WITH' 't' 'as' in statement

The query is as follows:

master_query="
WITH t
AS (
SELECT subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt
FROM crm_arrow.birthday
WHERE created_dt = '2016-07-07'
    AND (COUNTRY = 'SG')
GROUP BY subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt

UNION ALL

SELECT subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt
FROM crm_arrow.wishlist
WHERE created_dt = '2016-07-07'
    AND (COUNTRY = 'SG')
GROUP BY subscription_id
    ,country
    ,email_type
    ,email_priority
    ,created_dt

UNION ALL
.....
)
SELECT q.subscription_id
,q.country
,q.email_type
FROM (
SELECT t1.subscription_id
    ,t1.country
    ,DENSE_RANK() OVER (
        PARTITION BY t1.subscription_id
        ,t1.country ORDER BY t1.email_priority
        ) global_rank
    ,CASE 
        WHEN t1.email_type = t2.email_type
            THEN t1.email_type
        END email_type
FROM t t1
LEFT JOIN t t2 ON t1.country = t2.country
    AND t1.subscription_id = t2.subscription_id
) q
WHERE q.email_type IS NOT NULL
AND (
    q.global_rank <= 2
    AND country = 'SG'
    )
"

How can I do an efficient self join with a huge inner query? I also tried enclosing the select statement in master_query, but it still does not work.

3 answers:

Answer 0 (score: 6)

It is just a matter of where you put the INSERT statement. Here is an example of how to combine INSERT with a WITH clause:

CREATE TABLE ramesh_test
(key          BIGINT,
 text_value   STRING,
 roman_value  STRING)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t' 
LINES TERMINATED BY '\n' 
STORED AS TEXTFILE;

WITH v_text
AS
(SELECT 1 AS key, 'One' AS value),
v_roman
AS
(SELECT 1 AS key, 'I' AS value)
INSERT OVERWRITE TABLE ramesh_test
SELECT v_text.key, v_text.value, v_roman.value
  FROM v_text JOIN v_roman
                ON (v_text.key = v_roman.key);

Put the INSERT directly above the main SELECT, after the WITH clause.

Hope this helps!
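
Applied to the shell-driven setup in the question, this could look roughly like the sketch below. It assumes the generated query can be emitted as two pieces, the WITH clause and the main SELECT. The variable names with_clause and main_select, the placeholder database and table values, the shortened one-branch CTE, and the select list are all made up for illustration. With dynamic partitions, Hive expects the partition columns last in the SELECT, in the same order as in the PARTITION clause, so the sketch assumes the target table has the non-partition columns subscription_id and email_type.

```shell
#!/bin/sh
# Placeholder values; the question does not show the real ones.
proc_db="my_proc_db"
master_table="my_master_table"

# Piece 1: the WITH clause only (shortened to a single branch for illustration).
with_clause="WITH t AS (
    SELECT subscription_id, country, email_type, email_priority, created_dt
    FROM crm_arrow.birthday
    WHERE created_dt = '2016-07-07'
      AND country = 'SG'
)"

# Piece 2: the main SELECT that reads from t.
# Assumed column order: regular columns first, then the dynamic
# partition columns (created_dt, country) last.
main_select="SELECT t.subscription_id, t.email_type, t.created_dt, t.country
FROM t"

# The INSERT goes between the two pieces, not in front of WITH.
statement="${with_clause}
INSERT OVERWRITE TABLE ${proc_db}.${master_table} PARTITION (created_dt, country)
${main_select}"

# Print the assembled statement for inspection.
printf '%s\n' "$statement"

# To execute it, pass the string to the Hive CLI:
# hive -e "$statement"
```

The only structural change compared with the question is that the statement no longer begins with INSERT; it begins with WITH, and the INSERT sits immediately before the final SELECT.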

Answer 1 (score: 2)

You need to change your query so that INSERT OVERWRITE appears immediately before the SELECT q.subscription_id clause, after the WITH clause.

See this example. Put one or more WITH definitions at the top, then write the INSERT immediately after them, followed by the SELECT query:

WITH TABLE1 
AS
(
    SELECT 
    cod_index,
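    -- Alias each CAST so the outer SELECT can refer to the columns by name
    CAST(test_1 AS VARCHAR(200)) AS test_1, 
    CAST(test_2 AS VARCHAR(200)) AS test_2, 
    CAST(test_3 AS VARCHAR(200)) AS test_3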
    FROM db_h_gss.tb_h_test_orig
)
INSERT INTO TABLE db_h_gss.tb_h_test_insert PARTITION (cod_index = 1)
SELECT
    test_1,
    test_2,
    test_3
FROM TABLE1 WHERE cod_index = 1;
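
The example above writes to a single static partition (cod_index = 1), whereas the question uses dynamic partitions, PARTITION(created_dt, country). A rough sketch of a fully dynamic variant of the same statement follows, wrapped in a shell string as in the question. It assumes the target table has the columns test_1, test_2, test_3 and is partitioned by cod_index; the variable names settings, query, and script are hypothetical. The two SET lines are standard Hive properties; nonstrict mode is needed when no partition column is given a static value.

```shell
#!/bin/sh
# Session settings for an insert where every partition column is dynamic.
settings="SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;"

# WITH first, then INSERT, then the SELECT. The dynamic partition
# column (cod_index) is listed last in the SELECT.
query="WITH TABLE1 AS (
    SELECT
        cod_index,
        CAST(test_1 AS VARCHAR(200)) AS test_1,
        CAST(test_2 AS VARCHAR(200)) AS test_2,
        CAST(test_3 AS VARCHAR(200)) AS test_3
    FROM db_h_gss.tb_h_test_orig
)
INSERT INTO TABLE db_h_gss.tb_h_test_insert PARTITION (cod_index)
SELECT test_1, test_2, test_3, cod_index
FROM TABLE1;"

script="${settings}
${query}"

# Print the assembled script for inspection.
printf '%s\n' "$script"

# To execute it, pass the string to the Hive CLI:
# hive -e "$script"
```

Each distinct cod_index value returned by the SELECT then ends up in its own partition, instead of everything being written to cod_index = 1.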

Answer 2 (score: 0)

Assuming your large query actually works, you just need to remove the WITH t AS. It is not valid Hive syntax, which is what the error is telling you.

So your query should be:

INSERT OVERWRITE TABLE $proc_db.$master_table PARTITION(created_dt, country)
SELECT subscription_id ...