So, I'm trying to compute summary statistics as JSON, but I'm having trouble wrangling them into a query.
There are 2 tables:
appointments
- time timestamp
- patients int
assignments
- user_id int
- appointment_id int
I want to compute totals for each hour of the day, broken down by user. Ideally, it would look like this:
[
{hour: "2015-07-01T08:00:00.000Z", assignments: [
{user_id: 123, patients: 3},
{user_id: 456, patients: 10},
{user_id: 789, patients: 4},
]},
{hour: "2015-07-01T09:00:00.000Z", assignments: [
{user_id: 456, patients: 1},
{user_id: 789, patients: 6}
]},
{hour: "2015-07-01T10:00:00.000Z", assignments: []}
...
]
I'm fairly close:
with assignment_totals as (
select user_id,sum(patients) as patients,date_trunc('hour',appointments.time) as hour
from assignments
inner join appointments on appointments.id = assignments.appointment_id
group by date_trunc('hour',appointments.time),user_id
), hours as (
select to_char(date_trunc('hour',time),'YYYY-MM-DD"T"HH24:00:00.000Z') as hour, array_to_json(array_agg(DISTINCT assignment_totals)) as patients
from appointments
left join assignment_totals on date_trunc('hour',appointments.time) = assignment_totals.hour
where time >= '2015-07-01T07:00:00.000Z' and time < '2015-07-02T07:00:00.000Z'
group by date_trunc('hour',time)
order by date_trunc('hour',time)
)
select array_to_json(array_agg(hours)) as hours from hours;
Which outputs:
[
{hour: "2015-07-01T08:00:00.000Z", assignments: [
{user_id: 123, patients: 3, hour: "2015-07-01T08:00:00.000Z" },
{user_id: 456, patients: 10, hour: "2015-07-01T08:00:00.000Z"},
{user_id: 789, patients: 4, hour: "2015-07-01T08:00:00.000Z"},
]},
{hour: "2015-07-01T09:00:00.000Z", assignments: [
{user_id: 456, patients: 1, hour: "2015-07-01T09:00:00.000Z"},
{user_id: 789, patients: 6, hour: "2015-07-01T09:00:00.000Z"}
]},
{hour: "2015-07-01T10:00:00.000Z", assignments: [null]}
...
]
While this works, there are two problems, which may or may not be independent of each other:
I'd like to do something like this:
with hours as (
select to_char(date_trunc('hour',time),'YYYY-MM-DD"T"HH24:00:00.000Z') as hour, sum(appointments.patients) OVER(partition by assignments.user_id) as appointments
from appointments
left join assignments on appointments.id = assignments.appointment_id
where time >= '2015-07-01T07:00:00.000Z' and time < '2015-07-02T07:00:00.000Z'
group by date_trunc('hour',time)
order by date_trunc('hour',time)
)
select array_to_json(array_agg(hours)) as hours from hours
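But I can't get it to work without it giving me an "attribute must be in GROUP BY or an aggregate function" error.
Does anyone know how to solve these problems? Thanks in advance!
Answer 0: (score: 0)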
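The main problem with your last query seems to be that it conflates window functions with aggregate functions. Window functions use the OVER syntax and do not by themselves require a GROUP BY clause when there are other fields in the SELECT clause. Aggregate functions, on the other hand, need a GROUP BY clause when there are other (non-aggregate) fields in the SELECT clause. One practical consequence of this difference is that window functions do not collapse rows, so their results are not automatically DISTINCT.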
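To make the distinction concrete, here is a minimal sketch that runs the same SUM both ways. The inline sample table and its three rows are made up purely for illustration and are not taken from the question's data.

```sql
-- Hypothetical sample rows (user_id, patients), for illustration only.
-- Aggregate form: GROUP BY collapses the input to one row per user_id,
-- so this returns 2 rows.
WITH sample(user_id, patients) AS (
    VALUES (123, 3), (123, 2), (456, 10)
)
SELECT user_id, SUM(patients) AS total
FROM sample
GROUP BY user_id;

-- Window form: no GROUP BY, every input row is kept and the per-user
-- total is repeated on each of them, so this returns 3 rows.
WITH sample(user_id, patients) AS (
    VALUES (123, 3), (123, 2), (456, 10)
)
SELECT user_id, patients, SUM(patients) OVER (PARTITION BY user_id) AS total
FROM sample;
```

Because the window form repeats the total on every row of a partition, a query that only wants one row per partition has to add DISTINCT or switch to the aggregate form.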
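The NULL values produced by the window function can be handled with a simple COALESCE, so that zero is used instead of null.
So, to write the query using a window function, use something like the following: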
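WITH hours AS
(
SELECT DISTINCT to_char(date_trunc('hour', ap.time), 'YYYY-MM-DD"T"HH24:00:00.000Z') AS hour,
asgn.user_id,
-- Partition by hour as well as user so each total covers one user within one hour
COALESCE(SUM(ap.patients) OVER (PARTITION BY asgn.user_id, date_trunc('hour', ap.time)), 0) AS appointment_count
FROM appointments ap
LEFT JOIN assignments asgn ON ap.id = asgn.appointment_id
WHERE ap.time >= '2015-07-01T07:00:00.000Z'
AND ap.time < '2015-07-02T07:00:00.000Z'
)
-- The ordering goes inside array_agg; a top-level ORDER BY on a non-grouped column is rejected here
SELECT array_to_json(array_agg(hours ORDER BY hours.hour)) AS hours
FROM hours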
Using an aggregate function:
WITH hours AS
(
SELECT to_char(date_trunc('hour', ap.time), 'YYYY-MM-DD"T"HH24:00:00.000Z') AS hour,
SUM(COALESCE(ap.patients, 0)) AS appointment_count,
asgn.user_id
FROM appointments ap
LEFT JOIN assignments asgn ON ap.id = asgn.appointment_id
WHERE ap.time >= '2015-07-01T07:00:00.000Z'
AND ap.time < '2015-07-02T07:00:00.000Z'
GROUP BY asgn.user_id, to_char(date_trunc('hour', ap.time), 'YYYY-MM-DD"T"HH24:00:00.000Z')
)
-- The ordering goes inside array_agg; a top-level ORDER BY on a non-grouped column is rejected here
SELECT array_to_json(array_agg(hours ORDER BY hours.hour)) AS hours
FROM hours
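My syntax may not be quite right, so double-check it before using this solution (and feel free to edit to correct any errors).
Answer 1: (score: 0)
I was getting very frustrated with this because I hadn't looked at the Postgres 9.4 documentation, which has new functions for working with json.
The solution I found builds on the original query, but then uses json_array_elements to break the assignments array apart, filters it with a where clause, and then rebuilds it again. On the face of it, this seems pointless: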
json_agg(json_array_elements(json_agg(*)))
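But the performance difference is small, and it got me where I needed to go. If you find a better solution, feel free to comment! It should also be possible in < 9.4 using array_agg and unnest, but I ran into trouble because I was trying to unnest a record type returned from a CTE rather than an actual row type with column definitions.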
with assignment_totals as (
select
date_trunc('hour',appointments.time) as hour,
user_id,
coalesce(sum(patients),0) as patients
from appointments
left outer join assignments on appointments.id = assignments.appointment_id
where time >= '2015-07-01T07:00:00.000Z' and time < '2015-07-02T07:00:00.000Z'
group by date_trunc('hour',appointments.time),user_id
), hours as (
select
to_char(assignment_totals.hour,'YYYY-MM-DD"T"HH24:00:00.000Z') as hour,
(
select coalesce(json_agg(json_build_object('user_id',(t->'user_id'),'patients',(t->'patients')) order by (t->>'user_id')),'[]'::json)
from json_array_elements(json_agg(assignment_totals)) t
where (t->>'patients') != '0'
) as patients
from assignment_totals
group by assignment_totals.hour
order by assignment_totals.hour
)
select array_to_json(array_agg(hours)) as hours from hours
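Thanks to Andrew for pointing out that I could coalesce the null values to 0. However, I still wanted to filter out the entries where patients = 0. This solves all my problems: it lets me filter them with a where clause, and then build a new json object with json_build_object that contains only the keys I want.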
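One possible simplification, offered as an unverified sketch rather than the approach used above: PostgreSQL 9.4 also added the FILTER clause for aggregates, which can drop the zero-patient rows during aggregation and so avoid the json_array_elements round trip. The table and column names follow the question; the assignments alias for the nested array is my own choice, and the filter condition mirrors the where clause in the query above.

```sql
with assignment_totals as (
  select
    date_trunc('hour', appointments.time) as hour,
    assignments.user_id,
    coalesce(sum(appointments.patients), 0) as patients
  from appointments
  left outer join assignments on appointments.id = assignments.appointment_id
  where appointments.time >= '2015-07-01T07:00:00.000Z'
    and appointments.time <  '2015-07-02T07:00:00.000Z'
  group by date_trunc('hour', appointments.time), assignments.user_id
), hours as (
  select
    to_char(assignment_totals.hour, 'YYYY-MM-DD"T"HH24:00:00.000Z') as hour,
    -- FILTER removes zero-patient rows before they reach json_agg.
    -- If every row in an hour is filtered out, json_agg yields null,
    -- so coalesce turns that into an empty JSON array.
    coalesce(
      json_agg(
        json_build_object('user_id', user_id, 'patients', patients)
        order by user_id
      ) filter (where patients <> 0),
      '[]'::json
    ) as assignments
  from assignment_totals
  group by assignment_totals.hour
)
-- Order inside the aggregate so the hours come out chronologically;
-- the formatted hour strings sort the same way as the timestamps.
select json_agg(hours order by hours.hour) as hours
from hours;
```

Since json_build_object lists only user_id and patients, the nested objects carry no hour key, which matches the shape asked for in the question.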