Get first/last row of the n-th consecutive group

Asked: 2013-12-04 16:11:04

Tags: sql sql-server tsql sql-server-2005

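What's the easiest way to select a single record/value from the n-th group? A group is determined by a material and its price (prices can change). I need to find the first date of the last material-price group and the last date of the next-to-last one. In other words, I want to know exactly when a price changed.

I've tried the following query to get the first date of the current (last) price, but it can return the wrong date if that price was already used earlier: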

DECLARE @material VARCHAR(20)
SET @material = '1271-4303'

SELECT TOP 1 Claim_Submitted_Date 
FROM   tabdata
WHERE Material = @material 
AND Price = (SELECT TOP 1 Price FROM tabdata t2 
             WHERE Material = @material
             ORDER BY Claim_Submitted_Date DESC)
ORDER BY Claim_Submitted_Date ASC

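This also only covers the latest price; how do I get the previous one? That is, the date on which the previous price was last/first used?

I've simplified my schema and created this sql-fiddle with sample data. The rows below are in chronological order, newest first. The row with ID 7 is the one I need, because it is the most recent row carrying the next-to-last price.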

ID   CLAIM_SUBMITTED_DATE                   MATERIAL    PRICE
5   December, 04 2013 12:33:00+0000         1271-4303   20
4   December, 03 2013 12:33:00+0000         1271-4303   20   <-- current
3   November, 17 2013 10:13:00+0000         1271-4846   40
7   November, 08 2013 12:16:00+0000         1271-4303   18   <-- last(desired)
2   October, 17 2013 09:13:00+0000          1271-4303   18
1   September, 17 2013 08:13:00+0000        1271-4303   10
8   September, 16 2013 12:15:00+0000        1271-4303   17
6   June, 23 2013 14:22:00+0000             1271-4303   18
9   January, 11 2013 12:22:10+0000          1271-4303   20   <-- a problem since this is older than the desired row but will be returned by my simple sub-query approach above
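
For anyone who wants to reproduce the sample without the fiddle, a setup script along these lines should match the rows listed above. The column types are assumptions (the real schema is not shown), and the key column is named idData here because that is the name the answers' result sets use. The INSERT uses SELECT ... UNION ALL rather than a multi-row VALUES list so that it also runs on SQL Server 2005.

```sql
-- Hypothetical reconstruction of the sample table; column types are assumed.
CREATE TABLE tabdata (
    idData               INT         NOT NULL PRIMARY KEY,
    Claim_Submitted_Date DATETIME    NOT NULL,
    Material             VARCHAR(20) NOT NULL,
    Price                INT         NOT NULL
)

-- 'YYYYMMDD hh:mm:ss' literals are interpreted the same way regardless of language settings.
INSERT INTO tabdata (idData, Claim_Submitted_Date, Material, Price)
SELECT 1, '20130917 08:13:00', '1271-4303', 10 UNION ALL
SELECT 2, '20131017 09:13:00', '1271-4303', 18 UNION ALL
SELECT 3, '20131117 10:13:00', '1271-4846', 40 UNION ALL
SELECT 4, '20131203 12:33:00', '1271-4303', 20 UNION ALL
SELECT 5, '20131204 12:33:00', '1271-4303', 20 UNION ALL
SELECT 6, '20130623 14:22:00', '1271-4303', 18 UNION ALL
SELECT 7, '20131108 12:16:00', '1271-4303', 18 UNION ALL
SELECT 8, '20130916 12:15:00', '1271-4303', 17 UNION ALL
SELECT 9, '20130111 12:22:10', '1271-4303', 20
```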

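Would it even be possible to parameterize this value, e.g. as nthLatestPriceGroup, if I wanted to know the date of the 3rd-last price? Note that the query sits inside a scalar-valued function.

Edit: Many thanks to all. Unfortunately, a simple ROW_NUMBER doesn't seem to help, because I'm trying to get the row with the most recent price before the current price for a given material. GROUP BY / PARTITION BY material, price therefore pulls in rows that have the same price but do not belong to the last material-price group.

Consider that the price could change like this: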
Date             Price     Comment
5 months ago     20        original price, note that this is the same as the current which causes my query to fail!
3 months ago     18        price has changed, i might need the first and last date
2 months ago     20        price has changed, i might need the first and last date
1 month ago      18        previous price, i need the oldest and newest dates 
NOW              20        current price, i need the first/oldest date from this group

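So I want the date of the most recent row of the last 20-group; the oldest 20-group is irrelevant. That means I somehow have to group by consecutive prices, because a price can come back after it has already changed.

So actually I just need the most recent Claim_Submitted_Date from the price group that starts with "1 month ago ... previous price" in the list above, which is the date until which the previous price was valid. The other information listed in the comments is just nice to have (the nthLatestPriceGroup sub-question). In the sample data above, that is the row with ID=7. By the way, the oldest row of this price group is ID=2 (October 17) and not ID=6 (June 23), even though the latter is older, because a different price (10) came in between. That's why I can't use simple ranking functions.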

4 Answers:

Answer 0: (score: 4)

You will need to use the window function ROW_NUMBER in a subquery, ...

Something like this will get you there:

ROW_NUMBER() OVER(PARTITION BY Price ORDER BY Claim_Submitted_Date DESC) AS Row 

Here's an update based on your fiddle:

DECLARE @material VARCHAR(20)
SET @material = '1271-4303'


SELECT * FROM
(
SELECT  *,
        ROW_NUMBER() OVER(PARTITION BY Material ORDER BY Claim_Submitted_Date ASC) AS rn  
FROM tabdata t2 
WHERE Material = @material
) res
WHERE rn=2

If idData is incremental (and therefore chronological), you could use this:

SELECT * FROM
(
SELECT  *,
        ROW_NUMBER() OVER(PARTITION BY Material ORDER BY idData DESC) AS rn  
FROM tabdata t2 
WHERE Material = @material
) res

Looking at your latest requirements, we may all be overthinking this (if I understand you correctly):

DECLARE @MATERIAL AS VARCHAR(9)
SET @MATERIAL = '1271-4303'

SELECT  TOP 1 *
FROM tabdata t2 
WHERE Material = @material
AND PRICE <> (  SELECT TOP 1 Price
                FROM tabdata 
                WHERE Material = @material 
                ORDER BY CLAIM_SUBMITTED_DATE desc)
ORDER BY CLAIM_SUBMITTED_DATE desc

--results
idData  Claim_Submitted_Date        Material    Price
7       2013-11-08 12:16:00.000     1271-4303   18

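Here's a fiddle based on this.

Answer 1: (score: 2)

Based on your last comment, the solution I came up with computes the distinct price groups according to Claim_Submitted_Date and then uses the resulting group index as part of the grouping criteria. I'm not sure it will be very efficient. Hope it helps.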

declare @materialId nvarchar(max), @targetrank int
set @materialId = '1271-4303'
set @targetrank =2


;with grouped as (
    select *, 
              (select count( t.price)  -- don't put a DISTINCT here. (I know, I did)
               from tabdata as t 
               where t.Price <> tj.Price 
                 and t.Claim_Submitted_Date> tj.Claim_Submitted_Date 
                  and t.Material= @materialId
              )as group_indicator 
      from tabdata tj 
      where Material= @materialId
), 
rankedClaims as
(
    select grouped.*, row_number() over (PARTITION BY material,price,group_indicator  ORDER BY claim_submitted_date desc) as rank
    from grouped
),
numbered as
(
   select *, ROW_NUMBER() OVER (order by Claim_Submitted_Date desc) as RowNumber from
   rankedClaims 
   where rank =1
)
select Id, Claim_Submitted_Date, Material, Price from numbered
    where RowNumber=@targetrank

(I'm still not sure how two claims with the same date but different prices should be handled by t.Claim_Submitted_Date > tj.Claim_Submitted_Date.)
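
As a possible alternative to the correlated count above, the same "group by consecutive prices" idea can be sketched with the difference of two ROW_NUMBER values (often called the gaps-and-islands technique). This is only a sketch under assumptions: it reuses the tabdata columns from the question, borrows the parameter name @nthLatestPriceGroup from the question's wording, and assumes no two rows of one material share the same Claim_Submitted_Date (with ties the numbering is not deterministic). It uses only CTEs and ROW_NUMBER, so it should run on SQL Server 2005.

```sql
DECLARE @material VARCHAR(20), @nthLatestPriceGroup INT
SET @material = '1271-4303'
SET @nthLatestPriceGroup = 2   -- 1 = current price group, 2 = the one before it, ...

;WITH numbered AS (
    SELECT Claim_Submitted_Date, Price,
           -- position of the row among all rows of the material
           ROW_NUMBER() OVER (ORDER BY Claim_Submitted_Date) AS rn_all,
           -- position of the row among the rows with the same price
           ROW_NUMBER() OVER (PARTITION BY Price ORDER BY Claim_Submitted_Date) AS rn_price
    FROM tabdata
    WHERE Material = @material
),
islands AS (
    -- rn_all - rn_price stays constant while the price is unchanged between
    -- consecutive rows, so (Price, difference) identifies one consecutive group
    SELECT Price,
           MIN(Claim_Submitted_Date) AS FirstDate,
           MAX(Claim_Submitted_Date) AS LastDate
    FROM numbered
    GROUP BY Price, rn_all - rn_price
),
ranked AS (
    -- number the consecutive groups from newest to oldest
    SELECT Price, FirstDate, LastDate,
           ROW_NUMBER() OVER (ORDER BY LastDate DESC) AS GroupNo
    FROM islands
)
SELECT Price, FirstDate, LastDate
FROM ranked
WHERE GroupNo = @nthLatestPriceGroup
```

With the sample data from the question and @nthLatestPriceGroup = 2, this should return price 18 with FirstDate 2013-10-17 09:13 (ID=2) and LastDate 2013-11-08 12:16 (ID=7); the older 18 row from June (ID=6) ends up in its own group because the prices 17 and 10 lie in between.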

-------------------- Previous answer

Maybe you could try something like this:

SELECT ranked.[CLAIM_SUBMITTED_DATE]
FROM
(
  SELECT trimmed.*, ROW_NUMBER() OVER (ORDER BY claim_submitted_date) AS rank FROM
  (
    SELECT a.*
      ,row_number() over (PARTITION BY material,price ORDER BY claim_submitted_date) AS daterank
    FROM tabdata a
    WHERE a.material= '1271-4303'
  )
  AS trimmed
  WHERE daterank=1
) AS ranked
WHERE rank=2

Parameterizing the rank seems possible, since it only involves WHERE rank=2.

Answer 2: (score: 2)

Try this:

DECLARE @material VARCHAR(20), @Nth INT
SET @material = '1271-4303'
SET @Nth = 2

;with CTE1 ([idData],[Claim_Submitted_Date], [Material], [Price], Rn)
as
(
SELECT  *,
        DENSE_RANK() OVER(ORDER BY PRICE DESC) AS rn  
FROM tabdata  
WHERE Material = @material
)
,
CTE2 ([idData],  [Material], [Price], LastDate)
AS(
SELECT [idData],  [Material], [Price], MAX([Claim_Submitted_Date])
FROM CTE1
WHERE rn = @Nth
GROUP BY [idData],  [Material], [Price]
)
SELECT Top 1 [idData],  [Material], [Price], LastDate
FROM CTE2 
ORDER BY LastDate DESC

Result set

idData  Material    Price   LastDate
  7     1271-4303   18      2013-11-08 12:16:00.000

Answer 3: (score: 1)

Have you tried a window function such as row_number()?

 select a.[IDDATA]
, a.[CLAIM_SUBMITTED_DATE]
, a.[MATERIAL]
 , a.[PRICE]
 ,row_number() over (PARTITION by material,price order by claim_submitted_date) as seq
 from tabdata a
 where a.material= '1271-4303'

SQLFiddle