Postgres: How to optimize a query that uses multiple json_array_elements() calls

Time: 2018-08-28 03:12:46

Tags: postgresql

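I have the following query, which extracts several columns of data from a JSON object (facebook_results, a Postgres 10 column of data type json).

Sometimes the arrays in this object contain more than 10,000 items.

The goal is to get a flat, denormalized view of every column in the object. Where there is an array, I also want all the columns of the objects it contains (which obviously just duplicates the outer keys' data down onto each row).

None of the innermost keys contain arrays, so I don't need to worry about that. I only care about the matches and nodes arrays, which should be "expanded".

The query works right now, but it is very slow. I assume that is because it is poorly written: it may be recursing or carrying unnecessary complexity that slows it down.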

SELECT
  id AS slice_id,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'size'       AS match_size,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'score'      AS match_score,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'width'      AS match_width,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'format'     AS match_format,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'domain'     AS match_domain,
  json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'crawl_date' AS node_crawl_date,
  json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'url'        AS node_url
FROM slices
WHERE id = 169
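
As a rough way to see where the time goes, the statement could be wrapped in EXPLAIN (ANALYZE, BUFFERS), and the array size checked with json_array_length(). This is only a sketch against the same slices table and row; the match_count alias is made up for illustration. One likely factor, though I have not measured it here, is that a json column is stored as text and has to be re-parsed by every json_array_elements() call.

```sql
-- How many matches does this row hold? (match_count is an illustrative alias)
SELECT id,
       json_array_length(facebook_results -> 'table' -> 'matches') AS match_count
FROM slices
WHERE id = 169;

-- Run a cut-down version of the query and report actual timings per plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT id AS slice_id,
       json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'size' AS match_size
FROM slices
WHERE id = 169;
```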

Here is an example of what the facebook_results column contains:

{
  "table":{
    "matches": [
      {  
        "table":{  
          "nodes":[  
            {  
              "table":{  
                "crawl_date":"2013-06-21",
                "url":"http://example.com"
              }
            }
          ],
          "size":7962624,
          "score":47.059,
          "width":3456,
          "format":"MP4",
          "domain":"example.com"
        }
      }
    ]
  }
}
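
To try the queries against this sample, a minimal setup might look like the following. The table and column names come from the query above; the integer primary key is an assumption, since the real table definition is not shown.

```sql
-- Assumed minimal schema: only the two columns the query touches
CREATE TABLE slices (
  id               integer PRIMARY KEY,
  facebook_results json
);

-- One row holding the sample document, using the id from the WHERE clause
INSERT INTO slices (id, facebook_results)
VALUES (169, '{
  "table": {
    "matches": [
      {
        "table": {
          "nodes": [
            { "table": { "crawl_date": "2013-06-21", "url": "http://example.com" } }
          ],
          "size": 7962624,
          "score": 47.059,
          "width": 3456,
          "format": "MP4",
          "domain": "example.com"
        }
      }
    ]
  }
}');
```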

Does anyone know how I could optimize this?

1 answer:

Answer 0 (score: 3)

You can rewrite the query using LATERAL:

SELECT
  id AS slice_id,
  s.t -> 'size'       AS match_size,
  s.t -> 'score'      AS match_score,
  s.t -> 'width'      AS match_width,
  s.t -> 'format'     AS match_format,
  s.t -> 'domain'     AS match_domain,
  s.t2-> 'crawl_date' AS node_crawl_date,
  s.t2-> 'url'        AS node_url
FROM slices
-- s.t  is a match's inner "table" object;
-- s.t2 is the inner "table" object of one of its nodes
, LATERAL (
    SELECT
      json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table',
      json_array_elements(
        json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes'
      ) -> 'table'
  ) s(t, t2)
WHERE id = 169;

DBFiddle Demo

Or shorter:

SELECT
  id AS slice_id,
  s.t   -> 'size'       AS match_size,
  s.t   -> 'score'      AS match_score,
  s.t   -> 'width'      AS match_width,
  s.t   -> 'format'     AS match_format,
  s.t   -> 'domain'     AS match_domain,
  s2.t2 -> 'crawl_date' AS node_crawl_date,
  s2.t2 -> 'url'        AS node_url
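FROM slices
-- s.t: one row per match (its inner "table" object)
, LATERAL (
    SELECT json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table'
  ) s(t)
-- s2.t2: one row per node of that match (its inner "table" object)
, LATERAL (
    SELECT json_array_elements(s.t -> 'nodes') -> 'table'
  ) s2(t2)
WHERE id = 169;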

DBFiddle Demo2
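
A further variant, offered as a sketch rather than as part of the answer above: because json_array_elements() is a set-returning function, it can also be called directly in the FROM clause. The aliases m, n and elem are made up for illustration. The ->> operator returns text instead of json, and the casts assume the field types seen in the sample document.

```sql
SELECT
  s.id                                        AS slice_id,
  (m.elem -> 'table' ->> 'size')::bigint      AS match_size,
  (m.elem -> 'table' ->> 'score')::numeric    AS match_score,
  (m.elem -> 'table' ->> 'width')::integer    AS match_width,
  m.elem -> 'table' ->> 'format'              AS match_format,
  m.elem -> 'table' ->> 'domain'              AS match_domain,
  (n.elem -> 'table' ->> 'crawl_date')::date  AS node_crawl_date,
  n.elem -> 'table' ->> 'url'                 AS node_url
FROM slices AS s
-- One row per element of the matches array
CROSS JOIN LATERAL json_array_elements(s.facebook_results -> 'table' -> 'matches') AS m(elem)
-- One row per element of that match's nodes array
CROSS JOIN LATERAL json_array_elements(m.elem -> 'table' -> 'nodes') AS n(elem)
WHERE s.id = 169;
```

Note that CROSS JOIN LATERAL drops any match whose nodes array is empty or missing. If those matches should still appear with NULL node columns, LEFT JOIN LATERAL ... ON true could be used for the second join instead.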