I am using Presto and trying to extract every 'id' where 'source' = 'dd' from a nested JSON structure, as shown below.
SELECT name from (
SELECT name, row_number() over (partition by name) as RN from TEST group by test having count(Name) >= 1 ) a where RN <= 2
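I expect to extract the IDs [apple1, apple3] into a single column in Presto. What is the right way to achieve this in a Presto query?
Answer 0: (score: 2)
If the data has a regular structure, as in the example you posted, you can combine parsing the value as JSON, casting it to a structured SQL type (array/map/row), and using the array processing functions to transform, filter and extract the elements you want: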
WITH data(value) AS (VALUES '{
"results": [
{
"docs": [
{
"id": "apple1",
"source": "dd"
},
{
"id": "apple2",
"source": "aa"
},
{
"id": "apple3",
"source": "dd"
}
],
"group": 99806
}
]
}'),
parsed(value) AS (
    SELECT cast(json_parse(value) AS row(results array(row(docs array(row(id varchar, source varchar)), "group" bigint))))
    FROM data
)
SELECT
    transform(                  -- extract the id from the resulting docs
        filter(                 -- filter docs with source = 'dd'
            flatten(            -- flatten all docs arrays into a single doc array
                transform(value.results, r -> r.docs)  -- extract the docs arrays from the result array
            ),
            doc -> doc.source = 'dd'),
        doc -> doc.id)
FROM parsed
The query above produces:
_col0
------------------
[apple1, apple3]
(1 row)
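To see what each array function in the query does on its own, here is a minimal sketch that applies flatten, filter and transform to small literal arrays. The literal values are made up for illustration and are not part of the original data:

```sql
-- flatten: merge an array of arrays into a single array
SELECT flatten(ARRAY[ARRAY[1, 2], ARRAY[3]]);    -- [1, 2, 3]

-- filter: keep only the elements for which the lambda returns true
SELECT filter(ARRAY[1, 2, 3], x -> x > 1);       -- [2, 3]

-- transform: apply the lambda to every element
SELECT transform(ARRAY[1, 2, 3], x -> x * 10);   -- [10, 20, 30]
```

In the full query the same three steps run on rows instead of integers: flatten merges the docs arrays, filter keeps the docs whose source is 'dd', and transform projects each remaining doc to its id.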