PostgreSQL - Multiple aggregate queries from the same function call

Date: 2017-11-15 21:59:28

Tags: sql postgresql

I have a function that returns a SETOF rows from a table:

CREATE OR REPLACE FUNCTION get_assoc_addrs_from_bbl(_bbl text)
RETURNS SETOF wow_bldgs AS $$
  SELECT bldgs.* FROM wow_bldgs AS bldgs
  ...
$$ LANGUAGE SQL STABLE;

Here is a sample of what the function returns:

[Image: sample of the rows returned by the function]

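Now I'm writing an "aggregate" function that returns a single row containing various (aggregate) data points about the rows this function returns. Here is my current working (and naive) example: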

SELECT 
  count(distinct registrationid) as bldgs,
  sum(unitsres) as units,
  round(avg(yearbuilt), 1) as age,
  (SELECT first(corpname) FROM (
    SELECT unnest(corpnames) as corpname
    FROM get_assoc_addrs_from_bbl('3012380016')
    GROUP BY corpname ORDER BY count(*) DESC LIMIT 1
  ) corps) as topcorp,
  (SELECT first(businessaddr) FROM (
    SELECT unnest(businessaddrs) as businessaddr
    FROM get_assoc_addrs_from_bbl('3012380016')
    GROUP BY businessaddr ORDER BY count(*) DESC LIMIT 1
  ) rbas) as topbusinessaddr
FROM get_assoc_addrs_from_bbl('3012380016') assocbldgs

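As you can see, the two "subqueries" each need their own grouping and ordering, so I have to call get_assoc_addrs_from_bbl() repeatedly. Ideally, I'm looking for a structure that avoids the repeated calls, because the function is expensive to run and I'd like to be able to accommodate any number of subqueries. I've looked into CTEs, window expressions, and so on, but with no luck.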

Any tips? Thanks!

1 Answer:

Answer 0 (score: 0)

Create a simple aggregate function:

create aggregate array_agg2(anyarray) (
  sfunc=array_cat,
  stype=anyarray);

It concatenates the array values of all input rows into a single one-dimensional array. For example:

# with t(x) as (values(array[1,2]),(array[2,3,4])) select array_agg2(x) from t;
┌─────────────┐
│ array_agg2  │
╞═════════════╡
│ {1,2,2,3,4} │
└─────────────┘
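One caveat, offered as a hedged note rather than part of the original answer: as far as I know, PostgreSQL 14 changed array_cat to take anycompatiblearray arguments, so the definition above may fail there with a "function array_cat(anyarray, anyarray) does not exist" error. A minimal sketch of the equivalent definition, assuming PostgreSQL 14 or later:

```sql
-- Assumption: PostgreSQL 14+, where array_cat is declared as
-- array_cat(anycompatiblearray, anycompatiblearray).
-- On older versions, keep the anyarray definition shown above.
CREATE AGGREGATE array_agg2(anycompatiblearray) (
  sfunc = array_cat,
  stype = anycompatiblearray
);
```

The aggregate is used in exactly the same way in either case.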

After that, your query can be rewritten as:

SELECT 
  count(distinct registrationid) as bldgs,
  sum(unitsres) as units,
  round(avg(yearbuilt), 1) as age,
  (SELECT first(corpname) FROM (
    SELECT * FROM unnest(array_agg2(corpnames)) as corpname
    GROUP BY corpname ORDER BY count(*) DESC LIMIT 1
  ) corps) as topcorp,
  (SELECT first(businessaddr) FROM (
    SELECT * FROM unnest(array_agg2(businessaddrs)) as businessaddr
    GROUP BY businessaddr ORDER BY count(*) DESC LIMIT 1
  ) rbas) as topbusinessaddr
FROM get_assoc_addrs_from_bbl('3012380016') assocbldgs

(Assuming, of course, that I have understood your goal correctly.)
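To try the pattern without the real function, here is a minimal sketch that assumes the array_agg2 aggregate defined above already exists. The assocbldgs CTE and its rows are invented stand-ins for the output of get_assoc_addrs_from_bbl(), and the subquery selects the top value directly instead of going through first(), which is not a built-in PostgreSQL aggregate:

```sql
-- Hypothetical stand-in for the rows returned by get_assoc_addrs_from_bbl();
-- the values below are made up purely for illustration.
WITH assocbldgs(registrationid, unitsres, corpnames) AS (
  VALUES
    (1, 10, ARRAY['ACME LLC', 'FOO CORP']),
    (2, 20, ARRAY['ACME LLC']),
    (3,  5, ARRAY['BAR INC'])
)
SELECT
  count(DISTINCT registrationid) AS bldgs,
  sum(unitsres) AS units,
  -- array_agg2(corpnames) references only outer-level columns, so it is
  -- evaluated over the rows of the outer query; the subquery then just
  -- unnests the combined array and picks the most frequent value.
  (SELECT corpname
   FROM unnest(array_agg2(corpnames)) AS corpname
   GROUP BY corpname
   ORDER BY count(*) DESC
   LIMIT 1) AS topcorp
FROM assocbldgs;
```

Because every subquery works on an aggregate of the single outer scan, the source rows are read only once, and further "top value" columns can be added in the same shape.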