Hive Version 2.1.1
Problem description: values separated by the collection items terminator are inserted as map keys.
Hive table:
CREATE TABLE profiles(
id int,
name struct<first_name: string, middle_name: string, last_name: string>,
phone struct<home: string, office: string>,
address map<string,struct<streat:string, appartment:int, zip:string>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY '='
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
Data:
1000,Suresh--S,1234567890-1234567890,home=Venkatapuram1-2020-500001
1001,Mahesh-X-M,1234567890-1234567890,home=Venkatapuram2-2021-500001
Data load:
load data inpath '/handson/profiles_data.txt' overwrite into table profiles;
Actual data returned by the select statement:
SELECT * FROM profiles;
1000
{"first_name":"Suresh","middle_name":"","last_name":"S"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram1","appartment":null,"zip":null},"2020":null,"500001":null}
1001
{"first_name":"Mahesh","middle_name":"X","last_name":"M"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram2","appartment":null,"zip":null},"2021":null,"500001":null}
Expected:
1000
{"first_name":"Suresh","middle_name":"","last_name":"S"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram1","appartment":2020,"zip":"500001"}}
1001
{"first_name":"Mahesh","middle_name":"X","last_name":"M"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram2","appartment":2021,"zip":"500001"}}
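Answer 0: (score: 0)
As answered in "HIVE nested ARRAY in MAP data type", you can only override the first three delimiters in Hive, although Hive actually supports 8. In a nested data structure, each nesting level uses the next delimiter in the sequence.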
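In your Hive table, the entries of the address map are separated by '-' and each key is separated from its value by '=', so the fields of the struct inside the map fall on the next delimiter, \u0004 (Unicode code point 4), which cannot be overridden. That is why 2020 and 500001 are read as extra map keys with null values.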
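The delimiter-per-level rule can be illustrated with a small Python sketch. This is not Hive's parser: the `SEPARATORS` list and the `parse_struct` and `parse_address` helpers are hypothetical names for this illustration. It assumes the level assignment described above, keeps all values as strings, and models NULL as `None`.

```python
# Separators by nesting level: the first three come from the table DDL,
# the remaining five are assumed to be the fixed control characters 4..8.
SEPARATORS = [",", "-", "="] + [chr(c) for c in range(4, 9)]

STRUCT_FIELDS = ["streat", "appartment", "zip"]


def parse_struct(text, field_names, level):
    """Split a struct's text on the separator for its nesting level."""
    parts = text.split(SEPARATORS[level])
    # Fields that are missing from the text become None (NULL in Hive).
    parts += [None] * (len(field_names) - len(parts))
    return dict(zip(field_names, parts))


def parse_address(text, level=1):
    """Parse map<string, struct<...>> text for a top-level column.

    Map entries use SEPARATORS[level], key/value pairs use
    SEPARATORS[level + 1], and the struct value uses SEPARATORS[level + 2].
    """
    result = {}
    for entry in text.split(SEPARATORS[level]):
        key, sep, value = entry.partition(SEPARATORS[level + 1])
        if sep:
            result[key] = parse_struct(value, STRUCT_FIELDS, level + 2)
        else:
            # No '=' in this entry: a key with a NULL value.
            result[key] = None
    return result


original = parse_address("home=Venkatapuram1-2020-500001")
fixed = parse_address("home=Venkatapuram1\x042020\x04500001")
```

With the original row, `original` contains the keys `home`, `2020` and `500001`, matching the unexpected output in the question. With the `\x04` separator, `fixed` contains only `home`, with all three struct fields populated.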
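You should change your input to the following, where \u0004 stands for the single control character (code point 4), not the literal six-character text: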
1000,Suresh--S,1234567890-1234567890,home=Venkatapuram1\u00042020\u0004500001
1001,Mahesh-X-M,1234567890-1234567890,home=Venkatapuram2\u00042021\u0004500001
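One possible way to produce such a file is to preprocess the original data before loading it. The Python sketch below makes two assumptions: each row has exactly four comma-separated columns, and the address column holds a single map entry, as in the sample data. With several entries per row, '-' would be ambiguous. The `fix_line` and `convert` names and the file paths passed to `convert` are hypothetical.

```python
STRUCT_FIELD_SEP = "\x04"  # the control character written as \u0004 above


def fix_line(line):
    """Rewrite the struct-field separators in the address column of one row."""
    id_, name, phone, address = line.rstrip("\n").split(",", 3)
    key, sep, value = address.partition("=")
    if sep:
        # Only the struct value after '=' is touched; name and phone keep '-'.
        address = key + sep + value.replace("-", STRUCT_FIELD_SEP)
    return ",".join([id_, name, phone, address])


def convert(src_path, dst_path):
    """Copy a data file, fixing every non-empty line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in src:
            if line.strip():
                dst.write(fix_line(line) + "\n")
```

For example, `convert("profiles_data.txt", "profiles_data_fixed.txt")` would write a corrected copy that can then be uploaded and loaded with the same `load data` statement.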