Question

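I have a table which stores a product-identifier for each reference, but there are some duplicate records, i.e. a product may have been submitted more than once and so has more than one reference. Each record is timestamped in the updated column.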

I need a query that gives me just one (non-empty) reference per product-identifier, but crucially selects only the LATEST record for each product.

So if the original table is this:

id    updated              product-identifier   reference
------------------------------------------------------------
1     2014-11-10 07:47:02  9876543210123        98043hjdww98324322
2     2014-11-10 07:53:24  9897434243242        89f7e9wew329f080re
3     2014-11-12 10:51:10  9876543210123        48308402jfjewkfwek
4     2014-11-12 12:53:24  9876543210123        89739432bkjfekwjfk
5     2014-11-12 12:55:16  9876543210321        21321hhfioefhewfoe
6     2014-11-13 01:01:10  9897434243242      
7     2014-11-13 01:05:24  9897434243242        1232423jhdksffewfe

The query should return only these records:

id    updated              product-identifier   reference
------------------------------------------------------------
4     2014-11-12 12:53:24  9876543210123        89739432bkjfekwjfk
5     2014-11-12 12:55:16  9876543210321        21321hhfioefhewfoe
7     2014-11-13 01:05:24  9897434243242        1232423jhdksffewfe

I've tried

SELECT * FROM tablename WHERE reference !='' GROUP BY product-identifier ORDER BY updated DESC

This gives only one record per product, but not the latest one - it is grouping before sorting.

Many thanks!

Answer 1

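There are many ways to do this. If you want the latest record for each product, use not exists to keep only the rows for which no newer row with the same product identifier exists: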

select t.*
from tablename t
where not exists (select 1
                  from tablename t2
                  where t2.product_identifier = t.product_identifier and
                        t2.updated > t.updated
                 );
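The query above returns the newest row per product whether or not its reference is empty. If rows with an empty reference must be skipped, as the question asks, one possible extension is to filter both the outer query and the subquery. The following is a minimal sketch rather than a definitive implementation: it runs the SQL through Python's built-in sqlite3 module so it can be tried as-is, the column is assumed to be named product_identifier (with an underscore, as in the query above), the rows are the sample data from the question, and the helper name latest_non_empty is made up for illustration.

```python
import sqlite3

# Sample rows from the question (row 6 has an empty reference).
ROWS = [
    (1, "2014-11-10 07:47:02", "9876543210123", "98043hjdww98324322"),
    (2, "2014-11-10 07:53:24", "9897434243242", "89f7e9wew329f080re"),
    (3, "2014-11-12 10:51:10", "9876543210123", "48308402jfjewkfwek"),
    (4, "2014-11-12 12:53:24", "9876543210123", "89739432bkjfekwjfk"),
    (5, "2014-11-12 12:55:16", "9876543210321", "21321hhfioefhewfoe"),
    (6, "2014-11-13 01:01:10", "9897434243242", ""),
    (7, "2014-11-13 01:05:24", "9897434243242", "1232423jhdksffewfe"),
]

# Keep a row only if it has a reference and no newer row with a
# reference exists for the same product.
QUERY = """
select t.id, t.updated, t.product_identifier, t.reference
from tablename t
where t.reference <> ''
  and not exists (select 1
                  from tablename t2
                  where t2.product_identifier = t.product_identifier
                    and t2.reference <> ''
                    and t2.updated > t.updated)
order by t.id
"""


def latest_non_empty(rows):
    """Load rows into an in-memory table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tablename ("
        "id integer primary key, updated text, "
        "product_identifier text, reference text)"
    )
    conn.executemany("insert into tablename values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_non_empty(ROWS):
        print(row)
```

Because a comparison with NULL is never true, the `<> ''` filter drops NULL references as well as empty strings. The filter has to appear in the subquery too; otherwise a newer row with an empty reference would hide the older row that does have one.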

Answer 2

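I usually do this with a subquery that selects the highest timestamp for each group (product_identifier in your case) and then use that value to pick the rows I want. Like this: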

select * 
  from tablename a
 where a.updated = (select max(updated) 
                      from tablename b
                     where a.product_identifier = b.product_identifier)
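The same idea can also be written as a join against a grouped subquery, which ties back to the GROUP BY in the question. This is a variant, not the query above, and only a sketch under stated assumptions: the SQL is run through Python's built-in sqlite3 module so it can be tried directly, the column name product_identifier follows the query above, the non-empty reference filter is added to match the question, and the helper name latest_per_product is made up for illustration.

```python
import sqlite3

# Sample rows from the question (row 6 has an empty reference).
ROWS = [
    (1, "2014-11-10 07:47:02", "9876543210123", "98043hjdww98324322"),
    (2, "2014-11-10 07:53:24", "9897434243242", "89f7e9wew329f080re"),
    (3, "2014-11-12 10:51:10", "9876543210123", "48308402jfjewkfwek"),
    (4, "2014-11-12 12:53:24", "9876543210123", "89739432bkjfekwjfk"),
    (5, "2014-11-12 12:55:16", "9876543210321", "21321hhfioefhewfoe"),
    (6, "2014-11-13 01:01:10", "9897434243242", ""),
    (7, "2014-11-13 01:05:24", "9897434243242", "1232423jhdksffewfe"),
]

# The subquery finds the newest timestamp per product among rows that
# have a reference; the join then pulls the full row for that timestamp.
QUERY = """
select a.id, a.updated, a.product_identifier, a.reference
from tablename a
join (select product_identifier, max(updated) as latest
      from tablename
      where reference <> ''
      group by product_identifier) b
  on b.product_identifier = a.product_identifier
 and b.latest = a.updated
where a.reference <> ''
order by a.id
"""


def latest_per_product(rows):
    """Load rows into an in-memory table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tablename ("
        "id integer primary key, updated text, "
        "product_identifier text, reference text)"
    )
    conn.executemany("insert into tablename values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_product(ROWS):
        print(row)
```

If two rows for the same product share the same latest timestamp, both are returned. When exactly one row per product is required, a tie-breaker such as the highest id would be needed.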

SQL GROUP BY: how to define which record of each group, e.g. the latest, is used
