我有一个用例,我有一个表a。我想从中选择数据,按字段分组,进行一些聚合并将结果插入到另一个hive表b中,其中一列作为结构。我面临一些困难。有人可以帮忙告诉我我的疑问有什么不对。
CREATE EXTERNAL TABLE IF NOT EXISTS a (
date string,
acct string,
media string,
id1 string,
val INT
) PARTITIONED BY (day STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'folder1/folder2/';
ALTER TABLE a ADD IF NOT EXISTS PARTITION (day='{DATE}') LOCATION 'folder1/folder2/Date={DATE}';
CREATE EXTERNAL TABLE IF NOT EXISTS b (
date string,
acct string,
media string,
st1 STRUCT<id1:STRING, val:INT>
) PARTITIONED BY (day STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'path/';
FROM a
INSERT OVERWRITE TABLE b PARTITION (day='{DATE}')
SELECT date,acct,media,named_struct('id1',id1,'val',sum(val))
WHERE day='{DATE}' and media is not null and acct is not null and NOT (id1 = "0" )
GROUP BY date,acct,media,id1;
我得到的错误:
SemanticException [Error 10044]: Line 3:31 Cannot insert into target table because column number/types are different ''2015-07-16'': Cannot convert column 4 from struct<id1:string,val:bigint> to struct<id1:string,val:int>.
答案 0 :(得分:1)
Sum返回BIGINT,而不是INT。所以声明
st1 STRUCT<id1:STRING, val:BIGINT>
而不是
st1 STRUCT<id1:STRING, val:INT>