I have the following schema representing a simple metrics store:
CREATE TABLE targets (
target varchar
);
CREATE TABLE reads (
at timestamp without time zone,
target varchar
);
CREATE TABLE updates (
at timestamp without time zone,
target varchar
);
The relations reads and updates store events that occurred on a particular target at a particular time.
Here is some sample data:
COPY targets (target) FROM stdin;
A
B
C
\.
COPY reads (at, target) FROM stdin;
1970-01-01 03:40:00 A
1970-01-01 06:00:00 B
1970-01-01 05:00:00 A
1970-01-03 05:00:00 A
1970-01-04 01:00:00 B
\.
COPY updates (at, target) FROM stdin;
1970-01-01 01:00:00 A
1970-01-01 01:00:00 B
1970-01-01 02:00:00 A
1970-01-01 04:00:00 A
1970-01-02 01:00:00 A
1970-01-02 01:00:00 B
1970-01-04 01:00:00 B
\.
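I would like to produce a report that computes all metrics per target per day, similar to the following query (ideally also without the "zero" rows), but in a more efficient way: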
select t.target, day::date,
coalesce((select count(*) from updates where target = t.target and at::date = day), 0) updates,
coalesce((select count(*) from reads where target = t.target and at::date = day), 0) reads
from
generate_series('1970-01-01'::date, '1970-01-04'::date, '1 day'::interval) day,
targets t
order by target, day;
 target |    day     | updates | reads
--------+------------+---------+-------
 A      | 1970-01-01 |       3 |     2
 A      | 1970-01-02 |       1 |     0
 A      | 1970-01-03 |       0 |     1
 A      | 1970-01-04 |       0 |     0
 B      | 1970-01-01 |       1 |     1
 B      | 1970-01-02 |       1 |     0
 B      | 1970-01-03 |       0 |     0
 B      | 1970-01-04 |       1 |     1
 C      | 1970-01-01 |       0 |     0
 C      | 1970-01-02 |       0 |     0
 C      | 1970-01-03 |       0 |     0
 C      | 1970-01-04 |       0 |     0
Any suggestions?
Answer 0 (score: 1)
You can solve this with a FULL JOIN between two sub-queries that do the counting:
SELECT target, day, updates, reads
FROM (
SELECT target, at::date AS day, count(*) AS updates FROM updates GROUP BY 1, 2
) num_updates
FULL JOIN (
SELECT target, at::date AS day, count(*) AS reads FROM reads GROUP BY 1, 2
) num_reads USING (target, day)
WHERE day BETWEEN '1970-01-01'::date AND '1970-01-04'::date
ORDER BY 1, 2;
This does not produce any rows where both updates and reads are 0, and where only one of the two has no events it yields NULL instead of 0:
 target |    day     | updates | reads
--------+------------+---------+-------
 A      | 1970-01-01 |       3 |     2
 A      | 1970-01-02 |       1 |
 A      | 1970-01-03 |         |     1
 B      | 1970-01-01 |       1 |     1
 B      | 1970-01-02 |       1 |
 B      | 1970-01-04 |       1 |     1
If you do want 0 instead of NULL, but still do not want the rows where updates = 0 AND reads = 0, apply a simple coalesce() to the two columns in the select list:
SELECT target, day, coalesce(updates, 0) AS updates, coalesce(reads, 0) AS reads
...
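For reference, a minimal sketch of what the full statement might look like once coalesce() is applied to the FULL JOIN query shown earlier. It assumes PostgreSQL and the schema and date range from the question; nothing else is changed.

```sql
-- Sketch: the FULL JOIN query with NULL counts replaced by 0.
-- Rows where both counts would be 0 are still absent, because
-- neither sub-query produces a row for that (target, day) pair.
SELECT target, day,
       coalesce(updates, 0) AS updates,
       coalesce(reads, 0)   AS reads
FROM (
    SELECT target, at::date AS day, count(*) AS updates
    FROM updates
    GROUP BY 1, 2
) num_updates
FULL JOIN (
    SELECT target, at::date AS day, count(*) AS reads
    FROM reads
    GROUP BY 1, 2
) num_reads USING (target, day)
WHERE day BETWEEN '1970-01-01'::date AND '1970-01-04'::date
ORDER BY 1, 2;
```

With the sample data this should return the same six rows as above, with 0 in place of each empty cell.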
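If you also want the rows where both counts are NULL (or 0), then you should generate_series() the date range, CROSS JOIN it with targets to get the full Cartesian product, and then LEFT JOIN the sub-queries: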
SELECT target, day::date AS day, updates, reads
FROM generate_series('1970-01-01'::date, '1970-01-04'::date, interval '1 day') AS d(day)
CROSS JOIN targets  -- a plain JOIN without ON/USING is a syntax error in PostgreSQL
LEFT JOIN (
SELECT target, at::date AS day, count(*) AS updates FROM updates GROUP BY 1, 2
) num_updates USING (target, day)
LEFT JOIN (
SELECT target, at::date AS day, count(*) AS reads FROM reads GROUP BY 1, 2
) num_reads USING (target, day)
ORDER BY 1, 2;
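To get the report from the question with 0 instead of NULL, the same idea can be combined with coalesce(). The following is only a sketch, assuming PostgreSQL and the question's schema. The aliases d, t, u and r are introduced here for illustration. It writes the Cartesian product as an explicit CROSS JOIN and casts the generated value to date before comparing it with the sub-query's day column.

```sql
-- Sketch: full (day x target) grid with per-day counts, 0 where no events exist.
SELECT t.target,
       d.day::date            AS day,
       coalesce(u.updates, 0) AS updates,
       coalesce(r.reads, 0)   AS reads
FROM generate_series('1970-01-01'::date, '1970-01-04'::date, interval '1 day') AS d(day)
CROSS JOIN targets t
LEFT JOIN (
    -- one row per (target, day) that has at least one update
    SELECT target, at::date AS day, count(*) AS updates
    FROM updates
    GROUP BY 1, 2
) u ON u.target = t.target AND u.day = d.day::date
LEFT JOIN (
    -- one row per (target, day) that has at least one read
    SELECT target, at::date AS day, count(*) AS reads
    FROM reads
    GROUP BY 1, 2
) r ON r.target = t.target AND r.day = d.day::date
ORDER BY 1, 2;
```

With the sample data this should return the same twelve rows as the query in the question, but each event table is aggregated once instead of being scanned by a correlated sub-query for every (target, day) pair.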