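What is a solution for finding the IDs of the "max" n rows within a subset of groups?

Asked: 2013-03-21 15:37:15

Tags: sql-server sql-server-2008 tsql

We have a table T with the following data and structure: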

__________________________
ID  |  Grp   |     Dt     |
____|________|____________|
1   |   A    |  2007-11-22|  
2   |   A    |  2008-01-03|  
3   |   A    |  2008-01-03|  
4   |   A    |  2011-04-13|  
5   |   B    |  2007-11-22|  
6   |   B    |  2010-04-28|  
7   |   B    |  2009-03-19|  
8   |   B    |  2007-11-22|  
9   |   C    |  2010-04-28|  
10  |   C    |  2009-03-19|  
11  |   C    |  2011-04-13|  
12  |   C    |  2012-02-22|  
13  |   D    |  2007-11-22|  
14  |   D    |  2010-04-28|  
15  |   D    |  2009-03-19|  
16  |   E    |  2007-11-22|  
17  |   E    |  2010-04-28|  
18  |   E    |  2011-04-13|  
19  |   F    |  2007-11-22|  
20  |   G    |  2007-11-22|  
21  |   H    |  2007-11-22|  
22  |   H    |  2010-04-28|  
23  |   H    |  2009-03-19|  
24  |   H    |  2008-03-15|
____|________|____________|
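
To reproduce the problem, the table above can be created with a script along these lines. This is only a sketch: the column types (INT, VARCHAR(10), DATE) and the primary key are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question shows only column names, not types.
CREATE TABLE T (
    ID  INT         NOT NULL PRIMARY KEY,
    Grp VARCHAR(10) NOT NULL,
    Dt  DATE        NOT NULL
);

-- Sample rows copied from the table in the question.
INSERT INTO T (ID, Grp, Dt) VALUES
( 1, 'A', '2007-11-22'), ( 2, 'A', '2008-01-03'), ( 3, 'A', '2008-01-03'),
( 4, 'A', '2011-04-13'), ( 5, 'B', '2007-11-22'), ( 6, 'B', '2010-04-28'),
( 7, 'B', '2009-03-19'), ( 8, 'B', '2007-11-22'), ( 9, 'C', '2010-04-28'),
(10, 'C', '2009-03-19'), (11, 'C', '2011-04-13'), (12, 'C', '2012-02-22'),
(13, 'D', '2007-11-22'), (14, 'D', '2010-04-28'), (15, 'D', '2009-03-19'),
(16, 'E', '2007-11-22'), (17, 'E', '2010-04-28'), (18, 'E', '2011-04-13'),
(19, 'F', '2007-11-22'), (20, 'G', '2007-11-22'), (21, 'H', '2007-11-22'),
(22, 'H', '2010-04-28'), (23, 'H', '2009-03-19'), (24, 'H', '2008-03-15');
```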

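Given @date_from = '2007-01-01' and @date_to = '2008-06-01', write a query that first filters the rows to the range @date_from to @date_to and then returns, for each group, the rows with the maximum Dt in that filtered subset.

The result should be as follows: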

__________________________
ID  |  Grp   |     Dt     |
____|________|____________|
2   |   A    |  2008-01-03|  
3   |   A    |  2008-01-03|  
5   |   B    |  2007-11-22|  
8   |   B    |  2007-11-22|  
13  |   D    |  2007-11-22|  
16  |   E    |  2007-11-22|  
19  |   F    |  2007-11-22|  
20  |   G    |  2007-11-22|  
24  |   H    |  2008-03-15|  
____|________|____________|

One possible solution is:

DECLARE @date_from AS DATE = '2007-01-01';
DECLARE @date_to   AS DATE = '2008-06-01';  -- the statement before a CTE's WITH must end with a semicolon

WITH TFltr AS ( SELECT ID, Grp, Dt FROM T WHERE @date_from <= Dt AND Dt <= @date_to )
SELECT t1.ID, t1.Grp, t1.Dt 
FROM TFltr t1
LEFT OUTER JOIN TFltr t2 ON t1.Grp = t2.Grp AND t1.Dt < t2.Dt
WHERE t2.ID IS NULL

Do you know a better or faster way?

Thanks.

4 answers:

Answer 0 (score: 1)

select T1.*
from T T1
inner join
(
select max(dt) as max_dt, Grp
from T
where @date_from <= dt and dt <= @date_to
group by Grp
) X
on T1.Grp = X.Grp and T1.dt = X.max_dt

Answer 1 (score: 1)

select ID, Grp, Dt
from TFltr
where dt between @date_from  and @date_to
group by ID, Grp, Dt
having dt=max(dt)

I changed the query above to:

select ID, Grp, Dt from T where dt between @date_from and @date_to group by ID, Grp, Dt having dt = max(dt) and ID = max(ID)

I got a "better" result, but it is still not correct.

Answer 2 (score: 1)

I suggest using the RANK analytic function:

DECLARE @date_from AS DATE = '2007-01-01'
DECLARE @date_to   AS DATE = '2008-06-01'

SELECT * FROM (
  SELECT ID, Grp, Dt,
    RANK() OVER (PARTITION BY Grp ORDER BY Dt DESC) AS DateRank
  FROM T
  WHERE Dt BETWEEN  @date_from AND @date_to) InnerT
WHERE DateRank = 1

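The inner query ranks the dates within each Grp from latest to earliest, so the latest date in the range gets DateRank 1. Because RANK assigns the same rank to ties, every row that shares the latest date also gets DateRank 1. The outer query keeps only the rows where DateRank = 1. I ran this query against the data in your post and got the result you wanted.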
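
If exactly one row per group were wanted instead of all tied rows, a variation on the same idea could use ROW_NUMBER in place of RANK. This is only a sketch and not what the question asks for; the tie-breaker (highest ID wins) and the alias rn are assumptions chosen for illustration.

```sql
DECLARE @date_from AS DATE = '2007-01-01';
DECLARE @date_to   AS DATE = '2008-06-01';

SELECT ID, Grp, Dt
FROM (
  SELECT ID, Grp, Dt,
         -- ROW_NUMBER never repeats within a partition, so ties on Dt
         -- are broken by the assumed rule "highest ID first".
         ROW_NUMBER() OVER (PARTITION BY Grp ORDER BY Dt DESC, ID DESC) AS rn
  FROM T
  WHERE Dt BETWEEN @date_from AND @date_to
) InnerT
WHERE rn = 1;
```

With the sample data this would keep ID 3 for group A and ID 8 for group B, dropping IDs 2 and 5, which is why RANK is the better fit for the result requested in the question.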

Answer 3 (score: 1)

Another solution:

SQLFiddle example:

SELECT t.*
FROM TFltr t
WHERE t.Dt >='2007-01-01'
AND t.Dt <= '2008-06-01'
AND t.Dt = (SELECT MAX(h.Dt)
            FROM TFltr h
            WHERE h.Dt >='2007-01-01'
            AND h.Dt <= '2008-06-01' 
            AND h.Grp = t.Grp)
ORDER BY t.ID

Result:

| ID | GRP |                              DT |
----------------------------------------------
|  2 |   A |  January, 03 2008 00:00:00+0000 |
|  3 |   A |  January, 03 2008 00:00:00+0000 |
|  5 |   B | November, 22 2007 00:00:00+0000 |
|  8 |   B | November, 22 2007 00:00:00+0000 |
| 13 |   D | November, 22 2007 00:00:00+0000 |
| 16 |   E | November, 22 2007 00:00:00+0000 |
| 19 |   F | November, 22 2007 00:00:00+0000 |
| 20 |   G | November, 22 2007 00:00:00+0000 |
| 24 |   H |    March, 15 2008 00:00:00+0000 |

Your query:

DECLARE @date_from AS DATE = '2007-01-01'
DECLARE @date_to   AS DATE = '2008-06-01'

SELECT t.*
FROM TFltr t
WHERE t.Dt >=@date_from
AND t.Dt <= @date_to
AND t.Dt = (SELECT MAX(h.Dt)
            FROM TFltr h
            WHERE h.Dt >=@date_from
            AND h.Dt <= @date_to 
            AND h.Grp = t.Grp)
ORDER BY t.ID