Hive: extracting data from a nested array

Time: 2017-03-19 10:50:41

Tags: arrays json amazon-web-services hive amazon-athena

I need to extract data from a nested array. I am using Athena.

create external table test
(
customer string
)
Location 'something-something'

A single row of this table looks like this:

select * from test limit 1

{ "ID": "XXXX", "USerDate": { "items": [{ "Name": "Nir", "CLG": "NPT", "Place": "CBE", "Any Group": {}, "Interest": { "items": [{ "Games": "Cricket", "Music": "AR" }] }, "Others": {} }] } }

I need to extract rows like this:

| ID | Name | Place | Games | Music |
| ----- | --------- | ---------- | ---------- | ----------- |

1 answer:

Answer 0 (score: 0)

select  json_extract_scalar(customer,'$.ID')    as ID
       ,json_extract_scalar(i1.item,'$.Name')   as Name
       ,json_extract_scalar(i1.item,'$.Place')  as Place
       ,json_extract_scalar(i2.item,'$.Games')  as Games
       ,json_extract_scalar(i2.item,'$.Music')  as Music

from    test

        cross join unnest (cast(json_extract(customer,'$.USerDate.items') 
            as array(json))) as i1 (item)

        cross join unnest (cast(json_extract(i1.item,'$.Interest.items')
            as array(json))) as i2 (item)
;
  ID  | Name | Place |  Games  | Music
------+------+-------+---------+-------
 XXXX | Nir  | CBE   | Cricket | AR
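
To try the query above without the external table, the sample row from the question can be inlined as a common table expression. This is a minimal sketch, assuming Athena's Presto-based engine. The CTE is deliberately named `test` with a single `customer` column so the rest of the statement matches the answer unchanged; that inline CTE is an assumption added for illustration and is not part of the original setup.

```sql
-- Inline copy of the sample row from the question (illustrative stand-in for the S3-backed table).
with test as (
    select '{ "ID": "XXXX", "USerDate": { "items": [{ "Name": "Nir", "CLG": "NPT", "Place": "CBE", "Any Group": {}, "Interest": { "items": [{ "Games": "Cricket", "Music": "AR" }] }, "Others": {} }] } }' as customer
)
select  json_extract_scalar(customer,'$.ID')    as ID
       ,json_extract_scalar(i1.item,'$.Name')   as Name
       ,json_extract_scalar(i1.item,'$.Place')  as Place
       ,json_extract_scalar(i2.item,'$.Games')  as Games
       ,json_extract_scalar(i2.item,'$.Music')  as Music

from    test

        -- First unnest: one output row per element of USerDate.items
        cross join unnest (cast(json_extract(customer,'$.USerDate.items')
            as array(json))) as i1 (item)

        -- Second unnest: one output row per element of Interest.items within each outer element
        cross join unnest (cast(json_extract(i1.item,'$.Interest.items')
            as array(json))) as i2 (item)
;
```

With the sample data this should return the same single row as shown above. Because each `cross join unnest` emits one row per array element, a customer with several entries in `USerDate.items`, each with several entries in `Interest.items`, produces one row per combination. An outer element whose `Interest.items` array is empty or missing produces no rows at all.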