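I have the following query, which extracts several columns of data from a JSON object (facebook_results, a Postgres 10 column of data type json).
Sometimes the arrays in this object contain more than 10,000 items.
The goal is to get a flat, denormalized map of every column in the object. Where there is an array, I also want all the columns of the objects it contains (obviously just duplicating the outer keys' data down each row).
None of the innermost keys contain arrays, so I don't need to worry about that. I only care about the matches and nodes arrays, which should be "expanded".
The query works right now, but it is very slow. I assume this is because the query is poorly written, with recursion or unnecessary complexity slowing it down.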
SELECT
id AS slice_id,
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'size' AS match_size,
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'score' AS match_score,
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'width' AS match_width,
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'format' AS match_format,
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'domain' AS match_domain,
json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'crawl_date' AS node_crawl_date,
json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'url' AS node_url
FROM slices
WHERE id = 169
Here is an example of what the facebook_results column contains:
{
  "table": {
    "matches": [
      {
        "table": {
          "nodes": [
            {
              "table": {
                "crawl_date": "2013-06-21",
                "url": "http://example.com"
              }
            }
          ],
          "size": 7962624,
          "score": 47.059,
          "width": 3456,
          "format": "MP4",
          "domain": "example.com"
        }
      }
    ]
  }
}
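To experiment with the query, a minimal setup along the following lines may help. Only the table name slices and the columns id and facebook_results come from the question; the column types, the primary key, and the single sample row are assumptions made for illustration.

```sql
-- Hypothetical minimal schema: the real slices table probably has more columns.
CREATE TABLE slices (
    id               integer PRIMARY KEY,
    facebook_results json
);

-- One row holding the sample document shown above (one match with one node).
INSERT INTO slices (id, facebook_results)
VALUES (
    169,
    '{
       "table": {
         "matches": [
           {
             "table": {
               "nodes": [
                 {"table": {"crawl_date": "2013-06-21", "url": "http://example.com"}}
               ],
               "size": 7962624,
               "score": 47.059,
               "width": 3456,
               "format": "MP4",
               "domain": "example.com"
             }
           }
         ]
       }
     }'
);
```

With this single row, the query above should produce one output row, since there is exactly one match containing exactly one node.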
Does anyone know how I can optimize this?
Answer 0 (score: 3)
You can rewrite the query using LATERAL:
SELECT
id AS slice_id,
s.t -> 'size' AS match_size,
s.t -> 'score' AS match_score,
s.t -> 'width' AS match_width,
s.t -> 'format' AS match_format,
s.t -> 'domain' AS match_domain,
s.t2-> 'crawl_date' AS node_crawl_date,
s.t2-> 'url' AS node_url
FROM slices
,LATERAL (
SELECT json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table',
json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches')
-> 'table' -> 'nodes') -> 'table') s(t,t2)
WHERE id = 169;
Or, more concisely:
SELECT
id AS slice_id,
s.t -> 'size' AS match_size,
s.t -> 'score' AS match_score,
s.t -> 'width' AS match_width,
s.t -> 'format' AS match_format,
s.t -> 'domain' AS match_domain,
s2.t2 -> 'crawl_date' AS node_crawl_date,
s2.t2 -> 'url' AS node_url
FROM slices
,LATERAL(SELECT
json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' ) s(t)
,LATERAL(SELECT json_array_elements(s.t -> 'nodes') -> 'table') s2(t2)
WHERE id = 169;
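As a further variation on the same idea, the set-returning function can be called directly in the FROM clause instead of inside a subquery. The sketch below assumes the same slices table and JSON layout as in the question. The aliases s, m, n, and elem are made up for illustration. It also uses ->> so the leaf values come back as text rather than json, which is a change from the queries above.

```sql
-- Sketch only: expands matches once, then expands each match's nodes.
SELECT
    s.id                                 AS slice_id,
    m.elem -> 'table' ->> 'size'         AS match_size,
    m.elem -> 'table' ->> 'score'        AS match_score,
    m.elem -> 'table' ->> 'width'        AS match_width,
    m.elem -> 'table' ->> 'format'       AS match_format,
    m.elem -> 'table' ->> 'domain'       AS match_domain,
    n.elem -> 'table' ->> 'crawl_date'   AS node_crawl_date,
    n.elem -> 'table' ->> 'url'          AS node_url
FROM slices AS s
-- One row per element of the matches array.
CROSS JOIN LATERAL json_array_elements(s.facebook_results -> 'table' -> 'matches') AS m(elem)
-- One row per element of that match's nodes array.
CROSS JOIN LATERAL json_array_elements(m.elem -> 'table' -> 'nodes') AS n(elem)
WHERE s.id = 169;
```

Note that CROSS JOIN LATERAL drops any match whose nodes array is empty. If those matches should still appear, with NULL node columns, LEFT JOIN LATERAL ... ON true can be used for the second join instead. Because the json type stores the document as text and re-parses it on each operation, converting the column to jsonb may also help, though that is worth measuring with EXPLAIN ANALYZE rather than assuming.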