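I'm new to Postgres, coming from MySQL, and I'm hoping one of you can help me out.
I have a table with three columns: name, week, and value. The table records the name, the week in which a height was recorded, and the height value.
Something like this: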
Name | Week | Value
------+--------+-------
John | 1 | 9
Cassie| 2 | 5
Luke | 6 | 3
John | 8 | 14
Cassie| 5 | 7
Luke | 9 | 5
John | 2 | 10
Cassie| 4 | 4
Luke | 7 | 4
What I want is a list, per user, of the value at the minimum week and the value at the maximum week. Something like this:
Name |minWeek | Value |maxWeek | value
------+--------+-------+--------+-------
John | 1 | 9 | 8 | 14
Cassie| 2 | 5 | 5 | 7
Luke | 6 | 3 | 9 | 5
In Postgres, I use this query:
select name, week, value
from table t
inner join(
select name, min(week) as minweek
from table
group by name)
ss on t.name = ss.name and t.week = ss.minweek
group by t.name
;
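However, I get an error:
column "w.week" must appear in the GROUP BY clause or be used in an aggregate function
Position: 20
This worked fine for me in MySQL, so I'm wondering what I'm doing wrong here?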
Answer 0 (score: 11)
There are various simpler and faster ways.
DISTINCT ON
SELECT *
FROM (
SELECT DISTINCT ON (name)
name, week AS first_week, value AS first_val
FROM tbl
ORDER BY name, week
) f
JOIN (
SELECT DISTINCT ON (name)
name, week AS last_week, value AS last_val
FROM tbl
ORDER BY name, week DESC
) l USING (name);
Or shorter:
SELECT *
FROM (SELECT DISTINCT ON (1) name, week AS first_week, value AS first_val
FROM tbl ORDER BY 1,2) f
JOIN (SELECT DISTINCT ON (1) name, week AS last_week, value AS last_val
FROM tbl ORDER BY 1,2 DESC) l USING (name);
Simple and easy to understand. It was also the fastest in my tests.
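To make the behavior of DISTINCT ON concrete, here is a minimal sketch against the sample rows from the question. The temporary table tbl and its column types are assumptions made for illustration; only the sample values come from the question.

```sql
-- Hypothetical scratch table holding the sample rows from the question.
CREATE TEMP TABLE tbl (name text, week int, value int);
INSERT INTO tbl (name, week, value) VALUES
  ('John', 1, 9),  ('Cassie', 2, 5), ('Luke', 6, 3),
  ('John', 8, 14), ('Cassie', 5, 7), ('Luke', 9, 5),
  ('John', 2, 10), ('Cassie', 4, 4), ('Luke', 7, 4);

-- DISTINCT ON (name) keeps only the first row of each name group.
-- "First" is decided by the ORDER BY: here, the row with the lowest week.
SELECT DISTINCT ON (name)
       name, week AS first_week, value AS first_val
FROM   tbl
ORDER  BY name, week;
```

With this data the query should return one row per name: (Cassie, 2, 5), (John, 1, 9) and (Luke, 6, 3). The leading ORDER BY expressions have to match the DISTINCT ON expressions, which is why name comes first in the sort.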
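first_value() of composite type
The aggregate functions min() or max() do not accept composite types as input. You would have to create custom aggregate functions (which is not that hard).
But the window functions first_value() and last_value() do. Building on that, we can devise a very simple solution: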
SELECT DISTINCT ON (name)
name, week AS first_week, value AS first_value
,(first_value((week, value)) OVER (PARTITION BY name
ORDER BY week DESC))::text AS l
FROM tbl t
ORDER BY name, week;
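The output has all the data, but the values for the last week are stuffed into an anonymous record. You may need decomposed values.
For that, we need a well-known type that registers the types of its contained elements with the system. An adapted table definition would allow for the opportunistic use of the table type itself:
CREATE TABLE tbl (week int, value int, name text); -- note optimized column order
week and value come first.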
SELECT (l).name, first_week, first_val
, (l).week AS last_week, (l).value AS last_val
FROM (
SELECT DISTINCT ON (name)
week AS first_week, value AS first_val
,first_value(t) OVER (PARTITION BY name ORDER BY week DESC) AS l
FROM tbl t
ORDER BY name, week
) sub;
However, that's probably not possible in most cases. Just use a user-defined type from CREATE TYPE (permanent) or from CREATE TEMP TABLE (for ad-hoc use):
CREATE TEMP TABLE nv(last_week int, last_val int); -- register composite type
SELECT name, first_week, first_val, (l).last_week, (l).last_val
FROM (
SELECT DISTINCT ON (name)
name, week AS first_week, value AS first_val
,first_value((week, value)::nv) OVER (PARTITION BY name
ORDER BY week DESC) AS l
FROM tbl t
ORDER BY name, week
) sub;
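For the permanent variant, a sketch along the same lines might look like this. The type name nv_type is a hypothetical choice (any unused name works), and the table tbl(name, week, value) is assumed as above; the query is otherwise the same as the temp-table version.

```sql
-- Assumption: nv_type is a hypothetical name chosen for this sketch.
-- CREATE TYPE registers a standalone composite type that persists
-- until it is dropped.
CREATE TYPE nv_type AS (last_week int, last_val int);

SELECT name, first_week, first_val, (l).last_week, (l).last_val
FROM  (
   SELECT DISTINCT ON (name)
          name, week AS first_week, value AS first_val
          -- cast the row (week, value) to the registered type so that
          -- its fields can be addressed by name in the outer query
        , first_value((week, value)::nv_type) OVER (PARTITION BY name
                                                    ORDER BY week DESC) AS l
   FROM   tbl t
   ORDER  BY name, week
   ) sub;

-- DROP TYPE nv_type;  -- clean up once the type is no longer needed
```

Unlike the temporary table, the type stays around after the session ends, so it is probably only worth creating when the query is used regularly.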
In a local test on Postgres 9.3 with a similar table of 50k rows, each of these queries was substantially faster than the currently accepted answer. Test with EXPLAIN ANALYZE.
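As a rough sketch of how such a comparison might be run, prefix the query under test with EXPLAIN ANALYZE. This assumes the tbl(name, week, value) table used above; timings depend on your data, indexes and hardware, so none are shown here.

```sql
-- EXPLAIN ANALYZE actually executes the statement and reports the plan
-- together with measured execution times for each plan node.
EXPLAIN ANALYZE
SELECT *
FROM  (SELECT DISTINCT ON (name) name, week AS first_week, value AS first_val
       FROM   tbl ORDER BY name, week) f
JOIN  (SELECT DISTINCT ON (name) name, week AS last_week, value AS last_val
       FROM   tbl ORDER BY name, week DESC) l USING (name);
```

Run each candidate a few times and compare the reported execution times, since the first run may be slowed by a cold cache.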
SQL Fiddle showing all.
Answer 1 (score: 6)
This is a bit of a pain, because Postgres has the nice window functions first_value() and last_value(), but these are not aggregate functions. So, here is one way:
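select t.name, min(t.week) as minWeek, max(firstvalue) as firstvalue,
       max(t.week) as maxWeek, max(lastvalue) as lastValue
from (select t.*,
             first_value(value) over (partition by name order by week) as firstvalue,
             -- first_value() over the descending order yields the value at the last week;
             -- last_value() with the default frame would return the current row's value
             first_value(value) over (partition by name order by week desc) as lastvalue
      from tbl t  -- tbl stands in for the real table name; "table" is a reserved word
     ) t
group by t.name;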
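One thing to watch with last_value(): when the window has an ORDER BY and no explicit frame, the default frame ends at the current row (and its peers), so last_value() returns the current row's value rather than the last value of the partition. A sketch of a variant that spells out the frame, assuming a table tbl(name, week, value) as in the first answer:

```sql
SELECT DISTINCT
       name
     , min(week)          OVER w AS minweek
     , first_value(value) OVER w AS firstvalue
     , max(week)          OVER w AS maxweek
     , last_value(value)  OVER w AS lastvalue
FROM   tbl
-- The explicit frame makes every function see the whole partition,
-- so last_value() really picks the value at the highest week.
WINDOW w AS (PARTITION BY name ORDER BY week
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
```

Here DISTINCT collapses the identical per-row results into one row per name, which takes the place of the GROUP BY. For the sample data in the question this should give (John, 1, 9, 8, 14), (Cassie, 2, 5, 5, 7) and (Luke, 6, 3, 9, 5), in no particular order.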