Question

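I want to return the last report for each unit in a given range of units. The last report is determined by its creation time, so the result would be a collection of the latest reports for the given units. I don't want to use a bunch of SELECT statements such as: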

SELECT * FROM reports WHERE unit_id = 9999 ORDER BY time desc LIMIT 1
SELECT * FROM reports WHERE unit_id = 9998 ORDER BY time desc LIMIT 1
...

I initially tried this (but already knew it wouldn't work, because it returns only 1 report):

'SELECT reports.* FROM reports INNER JOIN units ON reports.unit_id = units.id WHERE units.account_id IS NOT NULL AND units.account_id = 4 ORDER BY time desc LIMIT 1'

So I'm looking for some kind of solution using a subquery or derived table, but I can't seem to figure out how to do it properly:

SELECT reports.* FROM reports
WHERE id IN 
(
  SELECT id FROM reports
  INNER JOIN units ON reports.unit_id = units.id
  ORDER BY time desc
  LIMIT 1
)

Is there any solution for doing this with a subquery or derived table?

Answer 1

A simple way to do this in Postgres is to use distinct on:

select distinct on (unit_id) r.*
from reports r
order by unit_id, time desc;

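This construct is specific to Postgres and to databases built on its code base. The expression distinct on (unit_id) means "I want to keep only one row for each unit_id". The row chosen is the first one encountered for that unit_id according to the order by clause, which here is the row with the latest time.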
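To restrict the result to the units of one account, as the question's first attempt does, the same construct can be combined with a join. This is only a sketch: it assumes the reports and units tables and the account_id = 4 filter from the question, and has not been run against the asker's schema.

```sql
-- Latest report per unit, limited to units belonging to account 4.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (r.unit_id) r.*
FROM reports r
JOIN units u ON u.id = r.unit_id
WHERE u.account_id = 4
ORDER BY r.unit_id, r.time DESC;
```

If two reports for the same unit can share the same time, adding r.id DESC as a final sort key would make the choice between them deterministic.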

Edit:

Assuming that id increases along with the time field, your original query would be:

SELECT r.*
FROM reports r
WHERE id IN (SELECT max(id)
             FROM reports
             GROUP BY unit_id
            );

You can also try not exists:

select r.*
from reports r
where not exists (select 1
                  from reports r2
                  where r2.unit_id = r.unit_id and
                        r2.time > r.time
                 );
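As a quick check of the not exists form, here is a small self-contained example. The reports_demo table and its rows are made up for illustration and stand in for the real reports table.

```sql
-- Hypothetical demo table with only the columns the query needs.
CREATE TEMP TABLE reports_demo (
    id      integer PRIMARY KEY,
    unit_id integer NOT NULL,
    time    timestamp NOT NULL
);

INSERT INTO reports_demo (id, unit_id, time) VALUES
    (1, 9998, '2024-01-01 10:00'),
    (2, 9998, '2024-01-02 10:00'),
    (3, 9999, '2024-01-01 09:00');

-- Keep a report only if no later report exists for the same unit.
SELECT r.*
FROM reports_demo r
WHERE NOT EXISTS (SELECT 1
                  FROM reports_demo r2
                  WHERE r2.unit_id = r.unit_id
                    AND r2.time > r.time)
ORDER BY r.unit_id;
-- Returns the rows with id 2 (unit 9998) and id 3 (unit 9999).
```

Note that if two reports for a unit share the same latest time, this form returns both of them, whereas distinct on keeps exactly one row per unit.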

I think distinct on should perform well. The last version (and possibly the previous one) would really benefit from an index on reports(unit_id, time).
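A minimal sketch of that index follows. The index name is an arbitrary choice, and whether the planner actually uses the index depends on the data, so it is worth checking with EXPLAIN.

```sql
-- Composite index on the columns used for grouping and ordering.
CREATE INDEX reports_unit_id_time_idx ON reports (unit_id, time);

-- Inspect the plan to see whether the index is picked up.
EXPLAIN
SELECT DISTINCT ON (unit_id) r.*
FROM reports r
ORDER BY unit_id, time DESC;
```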
