How do I add a specific value from a column when using a GROUP BY statement and aggregate functions in a query?
Here is an example of my table:
id | year | quarter | wage | comp_id | comp_industry |
123 | 2012 | 1 | 1000 | 456 | abc |
123 | 2012 | 1 | 2000 | 789 | def |
123 | 2012 | 2 | 1500 | 789 | def |
456 | 2012 | 1 | 2000 | 321 | ghi |
456 | 2012 | 2 | 2000 | 321 | ghi |
To calculate the sum of wage for each person by year and quarter, I run the following query:
SELECT id, year, quarter, SUM(wage) AS sum_wage
FROM t1
GROUP BY id, year, quarter;
Result:
id | year | quarter | sum_wage |
123 | 2012 | 1 | 3000 |
123 | 2012 | 2 | 1500 |
456 | 2012 | 1 | 2000 |
456 | 2012 | 2 | 2000 |
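I would like to update the query to include the comp_industry column, showing the industry in which each person's wage was highest for each year and quarter. I'm not sure where to start so that the query returns only the industry in which each person earned the most in each quarter and year, like this: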
id | year | quarter | sum_wage | comp_industry
123 | 2012 | 1 | 3000 | def
123 | 2012 | 2 | 1500 | def
456 | 2012 | 1 | 2000 | ghi
456 | 2012 | 2 | 2000 | ghi
I have looked at "Get value based on max of a different column grouped by another column" and "Fetch the row which has the Max value for a column", but I'm not sure where to go from there.
Any help or advice would be greatly appreciated!
Answer 0 (score: 1)
You can try using window functions with SUM and ROW_NUMBER.
Number the rows within each id, year, quarter partition in descending order of wage, then keep only the rows where rn = 1.
Schema (PostgreSQL v9.6)
CREATE TABLE T (
id INT,
year INT,
quarter INT,
wage INT,
comp_id INT,
comp_industry VARCHAR(50)
);
INSERT INTO T VALUES (123 , 2012 , 1 , 1000 , 456 ,'abc');
INSERT INTO T VALUES (123 , 2012 , 1 , 2000 , 789 ,'def');
INSERT INTO T VALUES (123 , 2012 , 2 , 1500 , 789 ,'def');
INSERT INTO T VALUES (456 , 2012 , 1 , 2000 , 321 ,'ghi');
INSERT INTO T VALUES (456 , 2012 , 2 , 2000 , 321 ,'ghi');
Query #1
SELECT id, year, quarter, sum_wage, comp_industry
FROM (
    SELECT *,
           SUM(wage) OVER (PARTITION BY id, year, quarter) AS sum_wage,
           ROW_NUMBER() OVER (PARTITION BY id, year, quarter ORDER BY wage DESC) AS rn
    FROM T
) t1
WHERE rn = 1;
| id | year | quarter | sum_wage | comp_industry |
| --- | ---- | ------- | -------- | ------------- |
| 123 | 2012 | 1 | 3000 | def |
| 123 | 2012 | 2 | 1500 | def |
| 456 | 2012 | 1 | 2000 | ghi |
| 456 | 2012 | 2 | 2000 | ghi |
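Note that ROW_NUMBER picks an arbitrary row when two employers paid the same top wage in a quarter. Here is a minimal sketch of one way to make that choice deterministic, assuming that breaking ties by the lower comp_id is acceptable (that rule is an assumption for illustration, not something the question specifies):

```sql
SELECT id, year, quarter, sum_wage, comp_industry
FROM (
    SELECT T.*,
           -- total wage per person, year and quarter
           SUM(wage) OVER (PARTITION BY id, year, quarter) AS sum_wage,
           -- highest wage first; comp_id is an assumed tie-breaker
           ROW_NUMBER() OVER (PARTITION BY id, year, quarter
                              ORDER BY wage DESC, comp_id) AS rn
    FROM T
) ranked
WHERE rn = 1
ORDER BY id, year, quarter;
```

The sample data contains no ties, so this returns the same rows as the result above.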
Answer 1 (score: 1)
I'm not 100% sure I understand the question; does this help?
SELECT id,
       year,
       quarter,
       comp_industry,
       wage_sum
FROM (SELECT TMP.*,
             -- rank each person's rows within a quarter, highest wage first;
             -- RANK keeps every tied row, so ties return more than one industry
             RANK() OVER
             ( PARTITION BY id,
                            year,
                            quarter
               ORDER BY wage DESC
             ) wage_rnk
      FROM (SELECT t1.*,
                   -- total wage per person, year and quarter
                   SUM(wage) OVER
                   ( PARTITION BY id,
                                  year,
                                  quarter
                   ) wage_sum
            FROM t1
           ) TMP
     ) TMP2
WHERE wage_rnk = 1;
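If you would rather keep the original GROUP BY query, PostgreSQL also lets you pull the industry out with an ordered aggregate. This is only a sketch, assuming PostgreSQL (array_agg with ORDER BY is not portable to every database) and the table name t1 from the question:

```sql
SELECT id,
       year,
       quarter,
       SUM(wage) AS sum_wage,
       -- collect the industries ordered by wage, highest first,
       -- and keep the first element of the resulting array
       (array_agg(comp_industry ORDER BY wage DESC))[1] AS comp_industry
FROM t1
GROUP BY id, year, quarter
ORDER BY id, year, quarter;
```

As with ROW_NUMBER, the industry chosen is arbitrary when two rows share the top wage, unless you add a further sort key inside the ORDER BY.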