Select the first row in a group where multiple columns define the group

Time: 2018-02-12 18:02:08

Tags: sql postgresql greatest-n-per-group

Here is a dummy table to describe what I am trying to do:

ID_1      | ID_2     | ID_3       | Day   | Energy_Costs  |
----------+----------+------------+-------+---------------+
State_1   | County_1 | Building_1 |  1    | 48.8          |
State_1   | County_1 | Building_1 |  2    | 31.3          |
State_1   | County_1 | Building_2 |  1    | 20.5          |
State_1   | County_2 | Building_1 |  1    |  1.9          |
State_2   | County_1 | Building_1 |  1    |  6.6          |
State_2   | County_2 | Building_2 |  1    | 38.2          |
State_2   | County_2 | Building_2 |  2    | 12.0          |

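In the table above, three columns (ID_1, ID_2, ID_3) are needed to identify a unique record (a building, in this case). I would like to return a table that contains the first row of a given day for each building.

Here is how the query looks in my head: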

SELECT FIRST(ID_1), FIRST(ID_2), FIRST(ID_3), FIRST(Energy_Costs), FIRST(DAY)
FROM buildings_db
GROUP BY ID_1, ID_2, ID_3
ORDER BY DAY

This would return:

ID_1      | ID_2     | ID_3       | Day   | Energy_Costs  |
----------+----------+------------+-------+---------------+
State_1   | County_1 | Building_1 |  1    | 48.8          |
State_1   | County_1 | Building_2 |  1    | 20.5          |
State_1   | County_2 | Building_1 |  1    |  1.9          |
State_2   | County_1 | Building_1 |  1    |  6.6          |
State_2   | County_2 | Building_2 |  1    | 38.2          |

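I have seen other questions about similar problems, but they usually do not have multiple columns defining the group. I am very new to SQL, so my attempts to translate them to my example have been unsuccessful; if anyone can explain why their solution works, that would be very helpful.

2 Answers:

Answer 0: (score: 1)

You can use DISTINCT ON (). It works with any number of columns defining the group: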

SELECT DISTINCT ON (ID_1, ID_2, ID_3)
       ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM   buildings_db
ORDER  BY ID_1, ID_2, ID_3, DAY, Energy_Costs;

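This returns the first row for each distinct combination of (ID_1, ID_2, ID_3), where "first" is defined by the remaining ORDER BY expressions (here DAY, then Energy_Costs). The leading ORDER BY expressions must match the DISTINCT ON expressions.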
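For comparison, here is a minimal sketch of the same "first row per group" logic written with the standard row_number() window function, which may help explain what DISTINCT ON does. It assumes the buildings_db table from the question; the alias sub and the column name rn are made up for illustration.

```sql
-- Sketch: number the rows within each (ID_1, ID_2, ID_3) group,
-- earliest DAY first (ties broken by lowest Energy_Costs),
-- then keep only row number 1 of every group.
SELECT ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM  (
   SELECT ID_1, ID_2, ID_3, DAY, Energy_Costs,
          row_number() OVER (PARTITION BY ID_1, ID_2, ID_3
                             ORDER BY DAY, Energy_Costs) AS rn  -- rn: hypothetical name
   FROM   buildings_db
   ) sub                                                        -- sub: hypothetical alias
WHERE  rn = 1
ORDER  BY ID_1, ID_2, ID_3;
```

PARTITION BY plays the role of the DISTINCT ON list, and the window's ORDER BY plays the role of the trailing ORDER BY expressions. DISTINCT ON is PostgreSQL-specific, while the window-function form should also run on other databases that support window functions.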

To get the first row of a given day for the buildings:

SELECT DISTINCT ON (ID_1, ID_2, ID_3)
       ID_1, ID_2, ID_3, DAY, Energy_Costs
FROM   buildings_db
WHERE  DAY = 1  -- given day
ORDER  BY ID_1, ID_2, ID_3, Energy_Costs


Answer 1: (score: 0)

You can use a subquery with a JOIN for this:

select b.ID_1, b.ID_2, b.ID_3, b.Energy_Costs, b.DAY
from buildings_db b
join
(
  select ID_1, ID_2, ID_3, min(day) min_day
  from buildings_db 
  group by ID_1, ID_2, ID_3
) t on b.id_1 = t.id_1 and
       b.id_2 = t.id_2 and
       b.id_3 = t.id_3 and
       b.day = t.min_day
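
To try the join approach against the question's sample data, a setup along these lines could be used. This is only a sketch: the column types (text, integer, numeric) and the use of a temporary table are assumptions, since the question does not show the table definition.

```sql
-- Assumed schema; the question only shows the data, not the DDL.
CREATE TEMP TABLE buildings_db (
   ID_1         text,
   ID_2         text,
   ID_3         text,
   DAY          integer,
   Energy_Costs numeric
);

-- Sample rows copied from the question.
INSERT INTO buildings_db VALUES
   ('State_1', 'County_1', 'Building_1', 1, 48.8),
   ('State_1', 'County_1', 'Building_1', 2, 31.3),
   ('State_1', 'County_1', 'Building_2', 1, 20.5),
   ('State_1', 'County_2', 'Building_1', 1,  1.9),
   ('State_2', 'County_1', 'Building_1', 1,  6.6),
   ('State_2', 'County_2', 'Building_2', 1, 38.2),
   ('State_2', 'County_2', 'Building_2', 2, 12.0);

-- Earliest day per building, joined back on all three ID columns.
SELECT b.ID_1, b.ID_2, b.ID_3, b.DAY, b.Energy_Costs
FROM   buildings_db b
JOIN  (
   SELECT ID_1, ID_2, ID_3, min(DAY) AS min_day
   FROM   buildings_db
   GROUP  BY ID_1, ID_2, ID_3
   ) t ON  b.ID_1 = t.ID_1
       AND b.ID_2 = t.ID_2
       AND b.ID_3 = t.ID_3
       AND b.DAY  = t.min_day
ORDER  BY b.ID_1, b.ID_2, b.ID_3;
```

With this data the query returns five rows, one per building, each for Day 1. Note that if a building had several rows on its earliest day, the join would return all of them, whereas DISTINCT ON returns exactly one row per group.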