I have the following JSON document format with a nested structure:
{
  "id": "p-1234-2132321-213213213-12312",
  "name": "athena to the rescue",
  "groups": [
    {
      "strategy_group": "anyOf",
      "conditions": [
        {
          "strategy_conditions": "anyOf",
          "entries": [
            {
              "c_key": "service",
              "C_operation": "isOneOf",
              "C_value": "mambo,bambo,jumbo"
            },
            {
              "c_key": "hostname",
              "C_operation": "is",
              "C_value": "lols"
            }
          ]
        }
      ]
    }
  ],
  "tags": [
    "aaa",
    "bbb",
    "ccc"
  ]
}
I have created a table in Athena to support it, using the following:
CREATE EXTERNAL TABLE IF NOT EXISTS filters (
  id string,
  name string,
  tags array<string>,
  groups array<struct<
    strategy_group:string,
    conditions:array<struct<
      strategy_conditions:string,
      entries:array<struct<
        c_key:string,
        c_operation:string,
        c_value:string
      >>
    >>
  >>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://filterios/policies/';
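My current goal is to also query on the entries column inside the conditions. I have tried a few queries, but SQL is not my strongest suit ;)
At the moment I have this query, which gives me the entries: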
select cnds.entries from
filters,
UNNEST(filters.groups) AS t(grps),
UNNEST(grps.conditions) AS t(cnds)
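However, since this is a complex array, figuring out the proper way to query it is giving me a bit of a headache.
Any hints are appreciated!
Thanks, R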
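Answer 0: (score: 0)
I'm not sure I fully understand your question. Take a look at the example below; it may be useful to you. It unnests each level of the nested arrays so that every entry becomes its own row, which you can then filter on.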
select id, name, tags,
       grps.strategy_group,
       cnds.strategy_conditions,
       enes.c_key, enes.c_operation, enes.c_value
from filters,
     UNNEST(filters.groups) AS t(grps),
     UNNEST(grps.conditions) AS t(cnds),
     UNNEST(cnds.entries) AS t(enes)
where enes.c_key = 'service'
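One thing to watch for: in the sample document the `isOneOf` entry stores its values as a single comma-separated string ("mambo,bambo,jumbo"), so matching one service name probably needs a split first. A rough sketch, assuming Athena's Presto-based engine with its `split` and `contains` functions, and the same one-alias-per-UNNEST pattern as the query above. The aliases `f`, `g`, `c`, `t`, `grp`, `cnd`, `e` and the search value 'mambo' are only illustrative, and I have not run this against your table:

```sql
-- Sketch: find policies whose 'service' entry lists 'mambo'.
-- Assumes c_value is a comma-separated string without spaces around the commas.
SELECT f.id,
       f.name,
       e.c_key,
       e.c_operation,
       e.c_value
FROM filters f
CROSS JOIN UNNEST(f.groups) AS g(grp)        -- one row per group
CROSS JOIN UNNEST(grp.conditions) AS c(cnd)  -- one row per condition
CROSS JOIN UNNEST(cnd.entries) AS t(e)       -- one row per entry
WHERE e.c_key = 'service'
  AND contains(split(e.c_value, ','), 'mambo');
```

A plain `e.c_value = 'mambo'` would not match here, because the stored value is the whole list rather than a single name.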
Answer 1: (score: 0)
Here is an example I worked on recently that may help:
My JSON:
{
  "type": "FeatureCollection",
  "features": [{
    "first": "raj",
    "geometry": {
      "type": "Point",
      "coordinates": [-117.06861096, 32.57889962]
    },
    "properties": "someprop"
  }]
}
The external table I created:
CREATE EXTERNAL TABLE `jsondata`(
`type` string COMMENT 'from deserializer',
`features` array<struct<type:string,geometry:struct<type:string,coordinates:array<string>>>> COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'paths'='features,type')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://vicinitycheck/rawData/jsondata/'
TBLPROPERTIES (
'classification'='json')
Querying the data:
SELECT type AS TypeEvent,
features[1].geometry.coordinates AS FeatherType
FROM test_vicinitycheck.jsondata
WHERE type = 'FeatureCollection'
test_vicinitycheck - my database name in Athena
jsondata - the table name in Athena
In case it helps, I have documented some examples on my blog: http://weavetoconnect.com/aws-athena-and-nested-json/
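Note that array subscripts in Athena start at 1, so `features[1]` in the query above is the first feature. If a file contains more than one feature, something like the following might be more useful than a fixed index. This is only a sketch: the aliases `j`, `t`, `f` and the output column names are made up for illustration, and it assumes the same table definition as above:

```sql
-- Sketch: return one row per feature instead of hard-coding features[1].
SELECT j.type AS TypeEvent,
       f.geometry.type AS GeometryType,
       f.geometry.coordinates AS Coordinates
FROM test_vicinitycheck.jsondata j
CROSS JOIN UNNEST(j.features) AS t(f)  -- expands the features array into rows
WHERE j.type = 'FeatureCollection';
```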