Nested JSON in Hive

Date: 2017-02-10 15:06:49

Tags: sql hive

I have a column, day_data, that holds JSON-formatted data. How can I produce the expected output below using Hive?

Input:

{"_id":"1","name":"abc","attribs":[{"minutes":0,"name":"sedentary"},{"minutes":0,"name":"lightly"},{"minutes":0,"name":"fairly"},{"minutes":28,"name":"very"}],"validated":true}

Output:

id  name  attrib_minutes  attrib_name  validated
1   abc   0               sedentary    true
1   abc   0               lightly      true
1   abc   0               fairly       true
1   abc   28              very         true

I can extract the id, name, and validated fields with get_json_object:

select get_json_object(day_data,'$._id')       as id,
       get_json_object(day_data,'$.name')      as name,
       get_json_object(day_data,'$.validated') as validated
from temp_table;

How do I extract the nested JSON attributes (attrib_minutes and attrib_name)?

1 answer:

Answer 0 (score: 3)

select  j.id
       ,j.name
       -- look up each array element by its position e.i
       ,get_json_object(day_data, concat('$.attribs[', e.i, '].minutes')) as attrib_minutes
       ,get_json_object(day_data, concat('$.attribs[', e.i, '].name'))    as attrib_name
       ,j.validated
from    temp_table t
        -- pull the top-level scalar fields out in one call
        lateral view json_tuple(day_data, '_id', 'name', 'validated') j as id, name, validated
        -- emit one row per attribs element; only the position i is used, x is ignored
        lateral view posexplode(split(get_json_object(day_data, '$.attribs[*].name'), '","')) e as i, x
;
j.id  j.name  attrib_minutes  attrib_name  j.validated
1     abc     0               sedentary    true
1     abc     0               lightly      true
1     abc     0               fairly       true
1     abc     28              very         true
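
To see why the posexplode step yields one row per array element, it can help to look at the intermediate values on their own. The query below is only a sketch: the CTE name demo is hypothetical and stands in for temp_table, with the sample row from the question inlined so it can be run by itself (this assumes a Hive version that supports CTEs and a select without a from clause).

```sql
-- Sketch only: "demo" is a hypothetical one-row stand-in for temp_table.
with demo as (
    select '{"_id":"1","name":"abc","attribs":[{"minutes":0,"name":"sedentary"},{"minutes":0,"name":"lightly"},{"minutes":0,"name":"fairly"},{"minutes":28,"name":"very"}],"validated":true}' as day_data
)
select  get_json_object(d.day_data, '$.attribs[*].name') as names_json  -- all names as a single JSON array string
       ,e.i                                                             -- 0-based position produced by posexplode
       ,e.x                                                             -- raw fragment produced by split
from    demo d
        lateral view posexplode(split(get_json_object(d.day_data, '$.attribs[*].name'), '","')) e as i, x
;
```

For the sample row, names_json should be the string ["sedentary","lightly","fairly","very"], and splitting it on "," gives four fragments at positions 0 to 3. The first and last fragments still carry the leftover bracket and quote characters, which is why the answer discards x and re-reads both minutes and name by index with get_json_object. Note that the split is only being used to count elements, so a name that itself contains the sequence "," would throw the count off.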