I have a function that returns a SETOF rows from a table:
CREATE OR REPLACE FUNCTION get_assoc_addrs_from_bbl(_bbl text)
RETURNS SETOF wow_bldgs AS $$
SELECT bldgs.* FROM wow_bldgs AS bldgs
...
$$ LANGUAGE SQL STABLE;
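Here is a sample of what the function returns:
Now I'm writing an "aggregate" function that returns a single row containing various (aggregated) data points about the table this function returns. Here is my current working (and naive) example: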
SELECT
count(distinct registrationid) as bldgs,
sum(unitsres) as units,
round(avg(yearbuilt), 1) as age,
(SELECT first(corpname) FROM (
SELECT unnest(corpnames) as corpname
FROM get_assoc_addrs_from_bbl('3012380016')
GROUP BY corpname ORDER BY count(*) DESC LIMIT 1
) corps) as topcorp,
(SELECT first(businessaddr) FROM (
SELECT unnest(businessaddrs) as businessaddr
FROM get_assoc_addrs_from_bbl('3012380016')
GROUP BY businessaddr ORDER BY count(*) DESC LIMIT 1
) rbas) as topbusinessaddr
FROM get_assoc_addrs_from_bbl('3012380016') assocbldgs
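As you can see, the two subqueries that need their own grouping/ordering each force another call to get_assoc_addrs_from_bbl(). Ideally, I'm looking for a structure that avoids these repeated calls, because the function is expensive to run, and that can accommodate any number of such subqueries. I've looked into CTEs, window expressions, and the like, but with no luck.
Any tips? Thanks!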
Answer 0 (score: 0)
Create a simple aggregate function:
create aggregate array_agg2(anyarray) (
sfunc=array_cat,
stype=anyarray);
It concatenates the array values into a single one-dimensional array. For example:
# with t(x) as (values(array[1,2]),(array[2,3,4])) select array_agg2(x) from t;
┌─────────────┐
│ array_agg2 │
╞═════════════╡
│ {1,2,2,3,4} │
└─────────────┘
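One caveat, offered as a hedged note rather than part of the original answer: as far as I know, PostgreSQL 14 changed array_cat to take anycompatiblearray instead of anyarray, so the definition above may be rejected there. A minimal sketch of the equivalent definition, assuming PostgreSQL 14 or later:

```sql
-- Assumption: PostgreSQL 14+, where array_cat is declared on anycompatiblearray.
-- On older versions, keep the anyarray definition shown above.
CREATE AGGREGATE array_agg2(anycompatiblearray) (
  sfunc = array_cat,            -- state transition: append each input array to the state
  stype = anycompatiblearray    -- the running state is itself an array
);
```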
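After that, your query can be rewritten as shown here. Because array_agg2(corpnames) and array_agg2(businessaddrs) reference only columns of the outer query, PostgreSQL evaluates them as aggregates over the outer query's rows, so get_assoc_addrs_from_bbl() appears only once: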
SELECT
count(distinct registrationid) as bldgs,
sum(unitsres) as units,
round(avg(yearbuilt), 1) as age,
(SELECT first(corpname) FROM (
SELECT * FROM unnest(array_agg2(corpnames)) as corpname
GROUP BY corpname ORDER BY count(*) DESC LIMIT 1
) corps) as topcorp,
(SELECT first(businessaddr) FROM (
SELECT * FROM unnest(array_agg2(businessaddrs)) as businessaddr
GROUP BY businessaddr ORDER BY count(*) DESC LIMIT 1
) rbas) as topbusinessaddr
FROM get_assoc_addrs_from_bbl('3012380016') assocbldgs
(Assuming, of course, that I have understood your goal correctly.)
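To try the pattern without the real table, here is a small self-contained sketch. The CTE t and its columns id and tags are hypothetical stand-ins for the function's output, and it assumes array_agg2 has already been created as described in this answer. The inner array_agg2(t.tags) refers only to an outer-query column, so it is computed over the outer rows, while the inner count(*) belongs to the subquery and counts occurrences of each unnested value.

```sql
-- Hypothetical sample data standing in for get_assoc_addrs_from_bbl(...)
WITH t(id, tags) AS (
  VALUES (1, ARRAY['a', 'b']),
         (2, ARRAY['b', 'c']),
         (3, ARRAY['b'])
)
SELECT
  count(*) AS n,                              -- ordinary aggregate over the rows of t
  (SELECT tag
   FROM unnest(array_agg2(t.tags)) AS tag     -- outer-level aggregate, flattened back into rows
   GROUP BY tag
   ORDER BY count(*) DESC                     -- most frequent value first
   LIMIT 1) AS toptag
FROM t;
```

Since LIMIT 1 already leaves a single row, this sketch selects tag directly instead of wrapping it in a first() aggregate. Any number of such subqueries can be added to the select list, each with its own grouping and ordering.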