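Selecting the most recent value based on another field in a joined table

Time: 2016-10-05 17:51:02

Tags: sql sql-server

I'm using SQL Server 2008 and have run into a problem I haven't seen before. I have a data set in which some values are duplicated within a quarter. I'm trying to select the most recent value for each quarter.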

SELECT PPAV.BusinessID
                        , (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1))) AS Quarter
                        , PAV.PartnerAttributeValue
                    FROM Partner_PartnerAttributeValue PPAV
                    JOIN PartnerAttributeValue PAV
                        ON PAV.PartnerAttributeValueID = PPAV.PartnerAttributeValueID
                    WHERE PAV.PartnerAttributeID = 7
                        AND (PPAV.PartnerAttributeValueID = 22 OR PPAV.PartnerAttributeValueID = 795 OR PPAV.PartnerAttributeValueID = 796)

                    GROUP BY PPAV.BusinessID
                            , (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1)))
                            , PAV.PartnerAttributeValue

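This is the code that produces the problem. I only need one value per quarter. Sometimes a change happens mid-quarter and the information is duplicated. While trying to fix this I used the code below, which actually made things worse: the problem quarter now has 4 values.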

SELECT PPAV.BusinessID
                        , (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1))) AS Quarter
                        , CASE WHEN (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1))) = (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1))) 
                            THEN SubHist.PartnerAttributeValue
                            ELSE PAV.PartnerAttributeValue 
                            END AS PartnerAttributeValue
                FROM Partner_PartnerAttributeValue PPAV
                JOIN PartnerAttributeValue PAV
                    ON PAV.PartnerAttributeValueID = PPAV.PartnerAttributeValueID
                JOIN ( SELECT PPAV.BusinessID
                                    , MAX(PPAV.PartnerAttributeValueStartDate) AS MAX
                                    , PAV.PartnerAttributeValue
                                FROM Partner_PartnerAttributeValue PPAV
                                JOIN PartnerAttributeValue PAV
                                    ON PAV.PartnerAttributeValueID = PPAV.PartnerAttributeValueID
                                WHERE PAV.PartnerAttributeID = 7
                                AND (PPAV.PartnerAttributeValueID = 22 OR PPAV.PartnerAttributeValueID = 795 OR PPAV.PartnerAttributeValueID = 796)
                                GROUP BY PAV.PartnerAttributeValue
                                        ,PPAV.BusinessID
                        )SubHist
                    ON SubHist.BusinessID = PPAV.BusinessID
                WHERE PAV.PartnerAttributeID = 7
                    AND (PPAV.PartnerAttributeValueID = 22 OR PPAV.PartnerAttributeValueID = 795 OR PPAV.PartnerAttributeValueID = 796)
                GROUP BY PPAV.BusinessID
                        , (cast(year(PPAV.PartnerAttributeValueStartDate) as char(4)) + '0' + cast(datepart(qq, PPAV.PartnerAttributeValueStartDate) as char(1)))
                        , PAV.PartnerAttributeValue
                        , SubHist.PartnerAttributeValue

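I'm really not sure what I did to make the problem worse. I thought my CASE WHEN statement, drawing on the extra joined subquery, would fix it.

Any help is greatly appreciated!

Here is some sample data showing what I'm trying to eliminate: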

4356    201501  REGISTERED
4356    201502  REGISTERED
4356    201503  REGISTERED
4356    201504  REGISTERED
4356    201601  GOLD
4356    201601  REGISTERED
4356    201602  REGISTERED
4356    201603  REGISTERED
4356    201604  REGISTERED

The problem is that Q1 2016 has multiple values, which skews the data. There should only be the GOLD value, not both GOLD and REGISTERED.

Thanks!

1 Answer:

Answer 0: (score: 2)

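Use a window function to generate a row number within each quarter and business ID, ordered by start date descending. Then keep only row number (RN) 1 from each group...

Because the RN has to be generated before you can filter on it, wrap the query in a CTE or subquery and then apply RN = 1...

I also: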

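  1. Switched your OR conditions to IN, for readability and possibly performance.
  2. Changed the quarter calculation to use CONCAT instead of + string concatenation. (This relies on implicit conversion, which should be fine as long as the dates are valid. Note that CONCAT requires SQL Server 2012 or later; on SQL Server 2008, keep the CAST ... + form from your query.)
  3. Any of these extra changes may have introduced syntax errors.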
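
As a rough illustration of the quarter calculation on its own, the two forms can be compared side by side. This is only a sketch: the variable `@d` and its date are arbitrary examples, not taken from your data, and the second statement assumes SQL Server 2012 or later because CONCAT does not exist in SQL Server 2008.

```sql
-- Arbitrary example date (an assumption, not from the question's data).
DECLARE @d datetime;
SET @d = '20160220';

-- SQL Server 2008-compatible form, as written in the question.
SELECT CAST(YEAR(@d) AS char(4)) + '0'
       + CAST(DATEPART(qq, @d) AS char(1)) AS Quarter;

-- SQL Server 2012+ form; CONCAT converts the integer arguments to strings implicitly.
SELECT CONCAT(YEAR(@d), '0', DATEPART(qq, @d)) AS Quarter;
```

Both expressions build the same year + '0' + quarter key, so either can be used in the SELECT list as long as the server version supports it.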

UNTESTED: if the table structures and sample data were provided in a SQL Fiddle, I would test it.

    SELECT BusinessID, Quarter, PartnerAttributeValue
    FROM (
        SELECT PPAV.BusinessID
               -- CONCAT needs SQL Server 2012+; on 2008 use the CAST(...) + '0' + CAST(...) form.
             , concat(year(PPAV.PartnerAttributeValueStartDate)
                      , '0'
                      , datepart(qq, PPAV.PartnerAttributeValueStartDate)
                     ) AS Quarter
             , PAV.PartnerAttributeValue
               -- Number the rows in each business/quarter group, newest start date first.
             , row_number()
               OVER (PARTITION BY PPAV.BusinessID
                                , year(PPAV.PartnerAttributeValueStartDate)
                                , datepart(qq, PPAV.PartnerAttributeValueStartDate)
                     ORDER BY PPAV.PartnerAttributeValueStartDate DESC) AS RN
          FROM Partner_PartnerAttributeValue PPAV
          JOIN PartnerAttributeValue PAV
            ON PAV.PartnerAttributeValueID = PPAV.PartnerAttributeValueID
         WHERE PAV.PartnerAttributeID = 7
           AND PPAV.PartnerAttributeValueID IN (22, 795, 796)
           -- No GROUP BY: keeping only RN = 1 already leaves one row per business and quarter.
    ) ranked
    WHERE RN = 1
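
To try the ROW_NUMBER approach without the real tables, something like the following self-contained sketch may help. It is not your schema: the table variable `@History`, its `StartDate` column, and every date in it are invented for illustration, standing in for the already-joined and filtered rows. It uses only features available in SQL Server 2008.

```sql
-- Hypothetical, flattened stand-in for the joined tables; all dates are invented.
DECLARE @History TABLE (
    BusinessID            int,
    StartDate             datetime,
    PartnerAttributeValue varchar(20)
);

INSERT INTO @History (BusinessID, StartDate, PartnerAttributeValue)
VALUES (4356, '20151001', 'REGISTERED'),
       (4356, '20160105', 'REGISTERED'),  -- earlier row in Q1 2016
       (4356, '20160220', 'GOLD'),        -- later row in the same quarter
       (4356, '20160401', 'REGISTERED');

SELECT BusinessID, Quarter, PartnerAttributeValue
FROM (
    SELECT BusinessID
         , CAST(YEAR(StartDate) AS char(4)) + '0'
           + CAST(DATEPART(qq, StartDate) AS char(1)) AS Quarter
         , PartnerAttributeValue
           -- RN = 1 marks the most recent row within each business and quarter.
         , ROW_NUMBER() OVER (PARTITION BY BusinessID
                                         , YEAR(StartDate)
                                         , DATEPART(qq, StartDate)
                              ORDER BY StartDate DESC) AS RN
    FROM @History
) ranked
WHERE RN = 1
ORDER BY BusinessID, Quarter;
```

With these made-up rows, the two Q1 2016 entries fall into the same partition, and only the later one (GOLD) gets RN = 1, so each quarter comes back exactly once. If two rows in a quarter can share the same start date, the winner is arbitrary unless you add a tie-breaker column to the ORDER BY.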