I have the following table, log:

event_time       | name
-----------------+------
2014-07-16 11:40 | Bob
2014-07-16 10:00 | John
2014-07-16 09:20 | Bob
2014-07-16 08:20 | Bob
2014-07-15 11:20 | Bob
2014-07-15 10:20 | John
2014-07-15 09:00 | Bob
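I want to generate a report that groups the data by entry date and by the number of entries each name has per day. For the table above, the resulting report would look like this: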
event_date | 0-2 | 3 | 4-99
-----------+-----+---+------
2014-07-16 |   1 | 1 |    0
2014-07-15 |   2 | 0 |    0
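This is the approach I have in mind: for each name, I count its entries per day. Then I check which column that count falls into and add 1 to that column.
If I find the answer before anyone posts one, I will share it.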
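Answer 0 (score: 2)
I did this in two steps. The inner query gets the base counts per day and name. The outer query uses CASE expressions to tally those counts into the report columns.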
select event_date,
       sum(case when cnt between 0 and 2 then 1 else 0 end) as "0-2",
       sum(case when cnt = 3 then 1 else 0 end) as "3",
       sum(case when cnt between 4 and 99 then 1 else 0 end) as "4-99"
from
  (select cast(event_time as date) as event_date,
          name,
          count(1) as cnt
   from log
   group by cast(event_time as date), name) baseCnt
group by event_date
order by event_date
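To check the two-step aggregation against the sample data, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for PostgreSQL here: it has no real DATE type, so date(event_time) replaces cast(event_time as date), and the helper name build_report is made up for this illustration.

```python
import sqlite3

# Sample rows copied from the question's log table.
ROWS = [
    ("2014-07-16 11:40", "Bob"),
    ("2014-07-16 10:00", "John"),
    ("2014-07-16 09:20", "Bob"),
    ("2014-07-16 08:20", "Bob"),
    ("2014-07-15 11:20", "Bob"),
    ("2014-07-15 10:20", "John"),
    ("2014-07-15 09:00", "Bob"),
]

# Assumption: date(event_time) is SQLite's substitute for
# PostgreSQL's cast(event_time as date).
REPORT_SQL = """
SELECT event_date,
       SUM(CASE WHEN cnt BETWEEN 0 AND 2 THEN 1 ELSE 0 END) AS "0-2",
       SUM(CASE WHEN cnt = 3 THEN 1 ELSE 0 END) AS "3",
       SUM(CASE WHEN cnt BETWEEN 4 AND 99 THEN 1 ELSE 0 END) AS "4-99"
FROM (SELECT date(event_time) AS event_date, name, COUNT(*) AS cnt
      FROM log
      GROUP BY date(event_time), name) AS base_cnt
GROUP BY event_date
ORDER BY event_date
"""


def build_report(rows):
    """Load the rows into a throwaway database and return the report rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE log (event_time TEXT, name TEXT)")
        conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
        return conn.execute(REPORT_SQL).fetchall()
    finally:
        conn.close()


report = build_report(ROWS)
for row in report:
    print(row)
```

The rows come back in ascending date order and carry the same numbers as the report requested in the question.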
Answer 1 (score: 1)
Try this:
select da,
       sum(case when c < 3 then 1 else 0 end) as "0-2",
       sum(case when c = 3 then 1 else 0 end) as "3",
       sum(case when c > 3 then 1 else 0 end) as "4-99"
from (select cast(event_time as date) as da,
             count(*) as c
      from log
      group by cast(event_time as date), name) as aa
group by da
Answer 2 (score: 1)
First, aggregate in two steps:
SELECT day, CASE
              WHEN ct < 3 THEN '0-2'
              WHEN ct > 3 THEN '4_or_more'
              ELSE '3'
            END AS cat
     , count(*)::int AS val
FROM  (
   SELECT event_time::date AS day, count(*) AS ct
   FROM   tbl
   GROUP  BY 1
   ) sub
GROUP  BY 1,2
ORDER  BY 1,2;
Going by your description, name should be completely irrelevant.
Then take that query and feed it to crosstab():
SELECT *
FROM   crosstab(
   $$SELECT day, CASE
                   WHEN ct < 3 THEN '0-2'
                   WHEN ct > 3 THEN '4_or_more'
                   ELSE '3'
                 END AS cat
          , count(*)::int AS val
     FROM  (
        SELECT event_time::date AS day, count(*) AS ct
        FROM   tbl
        GROUP  BY 1
        ) sub
     GROUP  BY 1,2
     ORDER  BY 1,2$$
  ,$$VALUES ('0-2'::text), ('3'), ('4_or_more')$$
   ) AS f (day date, "0-2" int, "3" int, "4_or_more" int);
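crosstab() is provided by the additional module tablefunc (installed once per database with CREATE EXTENSION tablefunc). Details and explanation are in this related answer:
PostgreSQL Crosstab Query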
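For readers who have not used crosstab(), the reshaping step can be sketched in plain Python. This is only a rough model of the two-parameter form, not tablefunc's implementation: the function name pivot is invented for this example, and the input rows are what the first query above would return for the question's sample data (three entries on 2014-07-15, four on 2014-07-16, counted per day regardless of name).

```python
def pivot(rows, categories):
    """Reshape (row_name, category, value) tuples into one wide row per row_name.

    Categories that never occur for a row_name stay None, which models the
    NULLs that crosstab() leaves in unfilled columns. Rows whose category is
    not in the list are skipped.
    """
    wide = {}
    for row_name, category, value in rows:
        if category not in categories:
            continue
        cells = wide.setdefault(row_name, dict.fromkeys(categories))
        cells[category] = value
    # dicts keep insertion order, so the input's row order is preserved.
    return [(name, *(cells[c] for c in categories)) for name, cells in wide.items()]


# (day, cat, val) rows in long format, ordered by day and category.
long_rows = [
    ("2014-07-15", "3", 1),
    ("2014-07-16", "4_or_more", 1),
]

wide_rows = pivot(long_rows, ["0-2", "3", "4_or_more"])
for row in wide_rows:
    print(row)
```

Each day becomes a single row with one cell per listed category, so a fixed column list in the outer query always lines up with the data.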
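Answer 3 (score: 1)
This is a variant of a PIVOT query (although PostgreSQL supports this through the crosstab(...) table functions). The existing answers cover the basic technique; I simply prefer to build queries without CASE where possible.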
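To get started, we need a couple of things. The first is essentially a Calendar Table, or entries from one (if you don't have one yet, they are among the most useful dimension tables). If you don't have one, the entries for the dates you need are easy to generate: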
WITH Calendar_Range AS (SELECT startOfDay, startOfDay + INTERVAL '1 DAY' AS nextDay
                        FROM GENERATE_SERIES(CAST('2014-07-01' AS DATE),
                                             CAST('2014-08-01' AS DATE),
                                             INTERVAL '1 DAY') AS dr(startOfDay))
This is mainly used to build the first step of the double aggregation, like so:
SELECT Calendar_Range.startOfDay, COUNT(Log.name)
FROM Calendar_Range
LEFT JOIN Log
       ON Log.event_time >= Calendar_Range.startOfDay
          AND Log.event_time < Calendar_Range.nextDay
GROUP BY Calendar_Range.startOfDay, Log.name
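Keep in mind that most aggregate functions applied to a nullable expression (here, COUNT(Log.name)) ignore null values (they are not counted). This is also one of the few times when it is fine to leave a grouping column out of the SELECT list (normally that makes the results ambiguous). For the actual queries I put this into a subquery, but it would work as a CTE too.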
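The null-ignoring behaviour is easy to see side by side. The sketch below again uses SQLite through Python's sqlite3 as a stand-in for PostgreSQL, with a hand-written two-day calendar_range table in place of GENERATE_SERIES; the snake_case table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar_range (start_of_day TEXT, next_day TEXT);
CREATE TABLE log (event_time TEXT, name TEXT);
-- 2014-07-14 has no log entries; 2014-07-15 has three.
INSERT INTO calendar_range VALUES ('2014-07-14', '2014-07-15'),
                                  ('2014-07-15', '2014-07-16');
INSERT INTO log VALUES ('2014-07-15 11:20', 'Bob'),
                       ('2014-07-15 10:20', 'John'),
                       ('2014-07-15 09:00', 'Bob');
""")

# COUNT(l.name) skips the NULL produced by the outer join for the empty day,
# while COUNT(*) counts that padded row as one.
counts = conn.execute("""
SELECT c.start_of_day, l.name, COUNT(l.name) AS by_column, COUNT(*) AS by_row
FROM calendar_range AS c
LEFT JOIN log AS l
       ON l.event_time >= c.start_of_day
      AND l.event_time <  c.next_day
GROUP BY c.start_of_day, l.name
ORDER BY c.start_of_day, l.name
""").fetchall()
conn.close()

for row in counts:
    print(row)
```

The day without entries still produces one row, with a column count of 0 and a row count of 1, which is why the queries here count Log.name rather than using COUNT(*).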
We also need a way to construct our COUNT ranges. That is pretty easy as well:
Count_Range AS (SELECT text, start, LEAD(start) OVER(ORDER BY start) as next
                FROM (VALUES('0 - 2', 0),
                            ('3', 3),
                            ('4+', 4)) e(text, start))
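We will treat the next values as exclusive upper bounds: a count belongs to a range when start <= count < next, and the last range, whose next is null, has no upper bound.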
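As a quick check of the half-open ranges, this sketch builds the same lookup in SQLite (window functions need SQLite 3.25 or later) and asks which label a given count falls into. The columns are renamed to label, lower_bound and upper_bound for the example, and bucket_for is a hypothetical helper, not something from the answer.

```python
import sqlite3

# LEAD() takes the next range's lower bound as this range's exclusive upper
# bound; the last range gets NULL, meaning "no upper limit".
RANGE_SQL = """
WITH e(label, lower_bound) AS (VALUES ('0 - 2', 0), ('3', 3), ('4+', 4)),
     count_range AS (
         SELECT label,
                lower_bound,
                LEAD(lower_bound) OVER (ORDER BY lower_bound) AS upper_bound
         FROM e)
SELECT label
FROM count_range
WHERE ? >= lower_bound
  AND (? < upper_bound OR upper_bound IS NULL)
"""


def bucket_for(count):
    """Return the label of the range containing count, or None if none matches."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(RANGE_SQL, (count, count)).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


labels = [bucket_for(n) for n in (0, 2, 3, 4, 10)]
print(labels)
```

Because each upper bound is exclusive, the boundary values 3 and 4 each land in exactly one range, and adding a new range only requires one more VALUES row.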
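We now have all the pieces we need for the query. We can use these virtual tables to write it in either of the styles shown in the existing answers.
First, the SUM(CASE...) style.
For this query, we will again take advantage of the null-ignoring behaviour of aggregate functions: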
WITH Calendar_Range AS (SELECT startOfDay, startOfDay + INTERVAL '1 DAY' AS nextDay
                        FROM GENERATE_SERIES(CAST('2014-07-14' AS DATE),
                                             CAST('2014-07-17' AS DATE),
                                             INTERVAL '1 DAY') AS dr(startOfDay)),
     Count_Range AS (SELECT text, start, LEAD(start) OVER(ORDER BY start) as next
                     FROM (VALUES('0 - 2', 0),
                                 ('3', 3),
                                 ('4+', 4)) e(text, start))
SELECT startOfDay,
       COUNT(Zero_To_Two.text) AS Zero_To_Two,
       COUNT(Three.text) AS Three,
       COUNT(Four_And_Up.text) AS Four_And_Up
FROM (SELECT Calendar_Range.startOfDay, COUNT(Log.name) AS count
      FROM Calendar_Range
      LEFT JOIN Log
             ON Log.event_time >= Calendar_Range.startOfDay
                AND Log.event_time < Calendar_Range.nextDay
      GROUP BY Calendar_Range.startOfDay, Log.name) Entry_Count
LEFT JOIN Count_Range Zero_To_Two
       ON Zero_To_Two.text = '0 - 2'
          AND Entry_Count.count >= Zero_To_Two.start
          AND Entry_Count.count < Zero_To_Two.next
LEFT JOIN Count_Range Three
       ON Three.text = '3'
          AND Entry_Count.count >= Three.start
          AND Entry_Count.count < Three.next
LEFT JOIN Count_Range Four_And_Up
       ON Four_And_Up.text = '4+'
          AND Entry_Count.count >= Four_And_Up.start
GROUP BY startOfDay
ORDER BY startOfDay
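The CASE-free counting can be tried out on the question's sample data with the sketch below. It is an adaptation for SQLite via Python's sqlite3, not the PostgreSQL query itself: the calendar covers only the two days that have data, the range bounds are written out by hand instead of using LEAD(), and all snake_case names are assumptions made for the example.

```python
import sqlite3

ROWS = [
    ("2014-07-16 11:40", "Bob"), ("2014-07-16 10:00", "John"),
    ("2014-07-16 09:20", "Bob"), ("2014-07-16 08:20", "Bob"),
    ("2014-07-15 11:20", "Bob"), ("2014-07-15 10:20", "John"),
    ("2014-07-15 09:00", "Bob"),
]

# Each LEFT JOIN matches at most one range row; COUNT(label) then counts only
# the entries whose count fell inside that range, because unmatched joins
# yield NULL labels.
PIVOT_SQL = """
WITH calendar_range(start_of_day, next_day) AS (
         VALUES ('2014-07-15', '2014-07-16'), ('2014-07-16', '2014-07-17')),
     count_range(label, lower_bound, upper_bound) AS (
         VALUES ('0 - 2', 0, 3), ('3', 3, 4), ('4+', 4, NULL)),
     entry_count AS (
         SELECT c.start_of_day, COUNT(l.name) AS cnt
         FROM calendar_range AS c
         LEFT JOIN log AS l
                ON l.event_time >= c.start_of_day
               AND l.event_time <  c.next_day
         GROUP BY c.start_of_day, l.name)
SELECT e.start_of_day,
       COUNT(z.label) AS zero_to_two,
       COUNT(t.label) AS three,
       COUNT(f.label) AS four_and_up
FROM entry_count AS e
LEFT JOIN count_range AS z
       ON z.label = '0 - 2' AND e.cnt >= z.lower_bound AND e.cnt < z.upper_bound
LEFT JOIN count_range AS t
       ON t.label = '3' AND e.cnt >= t.lower_bound AND e.cnt < t.upper_bound
LEFT JOIN count_range AS f
       ON f.label = '4+' AND e.cnt >= f.lower_bound
GROUP BY e.start_of_day
ORDER BY e.start_of_day
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (event_time TEXT, name TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?)", ROWS)
pivoted = conn.execute(PIVOT_SQL).fetchall()
conn.close()

for row in pivoted:
    print(row)
```

One thing to keep in mind with the calendar approach: a day with no log rows still yields a single entry with a count of 0, so it would show up as 1 in the "0 - 2" column rather than as an empty row.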
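The other option is of course the crosstab query, where the other answers used CASE to segment the results. Here we use the Count_Range table to decode the values for us: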
SELECT startOfDay, "0 - 2", "3", "4+"
FROM CROSSTAB($$WITH Calendar_Range AS (SELECT startOfDay, startOfDay + INTERVAL '1 DAY' AS nextDay
                                        FROM GENERATE_SERIES(CAST('2014-07-14' AS DATE),
                                                             CAST('2014-07-17' AS DATE),
                                                             INTERVAL '1 DAY') AS dr(startOfDay)),
                     Count_Range AS (SELECT text, start, LEAD(start) OVER(ORDER BY start) as next
                                     FROM (VALUES('0 - 2', 0),
                                                 ('3', 3),
                                                 ('4+', 4)) e(text, start))
                -- Calendar_Range is only visible inside the subquery, so the
                -- outer level refers to Entry_Count.startOfDay.
                SELECT Entry_Count.startOfDay, Count_Range.text, COUNT(*) AS count
                FROM (SELECT Calendar_Range.startOfDay, COUNT(Log.name) AS count
                      FROM Calendar_Range
                      LEFT JOIN Log
                             ON Log.event_time >= Calendar_Range.startOfDay
                                AND Log.event_time < Calendar_Range.nextDay
                      GROUP BY Calendar_Range.startOfDay, Log.name) Entry_Count
                JOIN Count_Range
                  ON Entry_Count.count >= Count_Range.start
                     -- The upper-bound column is named "next", not "end".
                     AND (Entry_Count.count < Count_Range.next OR Count_Range.next IS NULL)
                GROUP BY Entry_Count.startOfDay, Count_Range.text
                ORDER BY Entry_Count.startOfDay, Count_Range.text$$,
              -- The category query must return one category per row.
              $$VALUES ('0 - 2'::text), ('3'), ('4+')$$) Data(startOfDay DATE, "0 - 2" INT, "3" INT, "4+" INT)
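(I believe this is correct, but I have no way to test it: Fiddle doesn't seem to have the crosstab functionality loaded. In particular, the CTEs probably have to go inside the function itself, but I'm not sure...)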