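Eliminating duplicate rows with precedence

Date: 2014-06-10 16:36:00

Tags: sql sql-server sql-server-2012 duplicates

I'm working on a stored procedure that combines history with projections. I have a bit column (PHP) that specifies whether the projection takes precedence over the history. I also have a column that specifies whether the data came from the history table or the projection table. The output of my stored procedure looks like this: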

CaseId     Year     Projection    PHP    Gas   Oil
  1        2004         0          1    
  1        2005         0          1    
  1        2005         1          1    
  1        2006         1          1    
  1        2007         1          1    
  1        2008         1          1    
  1        2009         1          1    
  2        2003         0          0    
  2        2004         0          0    
  2        2005         0          0    
  2        2005         1          0    
  2        2006         1          0    
  2        2007         1          0    
  2        2008         1          0    
  2        2006         1          0    

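In this example, I need to eliminate the second row: for CaseId 1 the projection has precedence, so the overlapping history year should be removed. The fourth row for CaseId 2 should also be removed, because there the history has precedence.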

CaseId     Year     Projection    PHP    Gas   Oil  
  1        2004         0          1    
  1        2005         1          1    
  1        2006         1          1    
  1        2007         1          1    
  1        2008         1          1    
  1        2009         1          1    
  2        2003         0          0    
  2        2004         0          0    
  2        2005         0          0    
  2        2006         1          0    
  2        2007         1          0    
  2        2008         1          0    
  2        2006         1          0

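I need to flag duplicate years within a CaseId, then compare the Projection and PHP columns and remove the rows where they don't match.

Here is the query I'm using: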

SELECT      rcl.ReportRunCaseId AS CaseId, 
            year(rce.EcoDate) as Year,
            1 as Projection,
            cpq.ProjectionHasPrecedence as PHP,
            rce.GrossOil as Oil,                
            rce.GrossGas as Gas    
  from  phdreports.PhdRpt.ReportCaseList_28 rcl 
         inner join phdreports.PhdRpt.RptCaseEco_28 rce on
            rce.ReportRunCaseId = rcl.ReportRunCaseId
         inner join dbo.caseQualifier cq on 
            cq.CorpScenarioId = 1 and 
            cq.CaseCaseId = rcl.ReportRunCaseId and 
            cq.CorpQualifierTypeId = 1
         inner join dbo.caseProjectionQualifier cpq on 
            cpq.CaseCaseId = rcl.ReportRunCaseId and 
            cpq.CorpQualifierId = cq.QualifierHasData 
where rcl.ReportRunCaseId <=2
group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil

union all

select      rmp.ReportRunCaseId AS CaseId, 
            year(rmp.EcoDate) as Year,
            0 as Projection,
            cpq.ProjectionHasPrecedence as PHP,
            rmp.GrossOil as Oil,
            rmp.GrossGas as Gas              
from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp
        inner join dbo.caseQualifier cq on 
          cq.CorpScenarioId = 1 and 
          cq.CaseCaseId = rmp.ReportRunCaseId and 
          cq.CorpQualifierTypeId = 1
        inner join dbo.caseProjectionQualifier cpq on 
          cpq.CaseCaseId = rmp.ReportRunCaseId and 
          cpq.CorpQualifierId = cq.QualifierHasData 
where rmp.ReportRunCaseId <= 2
group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil 

How can I eliminate the duplicate years where Projection and PHP don't match?

2 Answers:

Answer 0: (score: 2)

The ROW_NUMBER() function should help you here:

WITH Data AS
(   SELECT      rcl.ReportRunCaseId AS CaseId, 
                year(rce.EcoDate) as Year,
                1 as Projection,
                cpq.ProjectionHasPrecedence as PHP,
                rce.GrossOil as Oil,                
                rce.GrossGas as Gas    
      from  phdreports.PhdRpt.ReportCaseList_28 rcl 
             inner join phdreports.PhdRpt.RptCaseEco_28 rce on
                rce.ReportRunCaseId = rcl.ReportRunCaseId
             inner join dbo.caseQualifier cq on 
                cq.CorpScenarioId = 1 and 
                cq.CaseCaseId = rcl.ReportRunCaseId and 
                cq.CorpQualifierTypeId = 1
             inner join dbo.caseProjectionQualifier cpq on 
                cpq.CaseCaseId = rcl.ReportRunCaseId and 
                cpq.CorpQualifierId = cq.QualifierHasData 
    where rcl.ReportRunCaseId <=2
    group by year(rce.EcoDate), rcl.ReportRunCaseId, cpq.ProjectionHasPrecedence, rce.GrossGas, rce.GrossOil

    union all

    select      rmp.ReportRunCaseId AS CaseId, 
                year(rmp.EcoDate) as Year,
                0 as Projection,
                cpq.ProjectionHasPrecedence as PHP,
                rmp.GrossOil as Oil,
                rmp.GrossGas as Gas              
    from PhdReports.PhdRpt.RptMonthlyProduction_50 rmp
            inner join dbo.caseQualifier cq on 
              cq.CorpScenarioId = 1 and 
              cq.CaseCaseId = rmp.ReportRunCaseId and 
              cq.CorpQualifierTypeId = 1
            inner join dbo.caseProjectionQualifier cpq on 
              cpq.CaseCaseId = rmp.ReportRunCaseId and 
              cpq.CorpQualifierId = cq.QualifierHasData 
    where rmp.ReportRunCaseId <= 2
    group by year(rmp.EcoDate), rmp.ReportRunCaseId, cpq.ProjectionHasPrecedence, rmp.GrossGas, rmp.GrossOil
), Data2 AS
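(   SELECT  *, 
            RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year 
                                        ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END, PHP DESC, Projection DESC)
    FROM    Data
)
SELECT  CaseId, Year, Projection, PHP, Oil, Gas
FROM    Data2
WHERE   RowNum = 1;

Only the last part needs explaining, since the first part is just your query wrapped in a common table expression:

RowNum = ROW_NUMBER() OVER(PARTITION BY CaseId, Year 
                            ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END, PHP DESC, Projection DESC)

Here we number the rows within each (CaseId, Year) pair, ordered by whether PHP equals Projection. The CASE expression returns 0 for a match and 1 for a mismatch, so in ascending order the matching row comes first. The final WHERE then limits the result to the first row of each pair: if a row exists where the two are equal, that row is used; if not, the row where they differ is used.

You may need to add more criteria to the ORDER BY to make the result deterministic, i.e. if there are two rows for the same CaseId / Year where PHP and Projection are both 1, make sure the same row is picked every time.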
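To see the ranking on the question's sample rows, here is a minimal sketch that runs the same ROW_NUMBER() pattern against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here (it needs a build with window-function support, 3.25 or later), and the table name `t`, the `dedupe` helper, and the omission of the Gas/Oil columns are assumptions made for illustration.

```python
import sqlite3

# Sample rows from the question: (CaseId, Year, Projection, PHP).
ROWS = [
    (1, 2004, 0, 1), (1, 2005, 0, 1), (1, 2005, 1, 1), (1, 2006, 1, 1),
    (1, 2007, 1, 1), (1, 2008, 1, 1), (1, 2009, 1, 1),
    (2, 2003, 0, 0), (2, 2004, 0, 0), (2, 2005, 0, 0), (2, 2005, 1, 0),
    (2, 2006, 1, 0), (2, 2007, 1, 0), (2, 2008, 1, 0), (2, 2006, 1, 0),
]

# Rows where PHP = Projection get sort key 0, so they are numbered first
# within each (CaseId, Year) partition.
QUERY = """
WITH Ranked AS (
    SELECT CaseId, Year, Projection, PHP,
           ROW_NUMBER() OVER (
               PARTITION BY CaseId, Year
               ORDER BY CASE WHEN PHP = Projection THEN 0 ELSE 1 END
           ) AS RowNum
    FROM t
)
SELECT CaseId, Year, Projection, PHP
FROM Ranked
WHERE RowNum = 1
ORDER BY CaseId, Year;
"""

def dedupe(rows):
    """Load rows into a hypothetical table t and keep one row per (CaseId, Year)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (CaseId INT, Year INT, Projection INT, PHP INT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

result = dedupe(ROWS)
for row in result:
    print(row)
```

Note that this keeps exactly one row per (CaseId, Year), so the repeated CaseId 2 / 2006 row in the sample data collapses to a single row, and a year with only one row (such as CaseId 1 / 2004) is kept even though PHP and Projection differ.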

Answer 1: (score: 1)

I don't see how your query relates to the question. So I'll assume you have a query like:

select CaseId, Year, Projection, PHP, Gas, Oil 
from t

With that, you can do what you want using row_number():

select CaseId, Year, Projection, PHP, Gas, Oil
from (select CaseId, Year, Projection, PHP, Gas, Oil,
             row_number() over (partition by CaseId, Year
                                order by Projection + PHP desc
                               ) as seqnum
      from t
     ) t
where seqnum = 1;

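This prioritizes rows by the number of flags that are set. In the example with CaseId = 2, two rows contain the same values. This will return one of those rows. If you want to choose between them, you need another column that specifies the priority.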
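It may be worth checking this ordering against the question's overlapping year before relying on it. The sketch below runs the `Projection + PHP desc` ordering on the four 2005 rows using an in-memory SQLite database from Python. SQLite (3.25 or later) is a stand-in for SQL Server, and the table name `t` and the subquery alias `s` are assumptions for illustration. Under these assumptions the projection row has the larger sum in both cases, so it is also picked for CaseId 2, where the question's expected output keeps the history row.

```python
import sqlite3

# The overlapping 2005 rows from the question: (CaseId, Year, Projection, PHP).
ROWS_2005 = [
    (1, 2005, 0, 1), (1, 2005, 1, 1),
    (2, 2005, 0, 0), (2, 2005, 1, 0),
]

# Same shape as the query in this answer: rank by the number of flags set.
QUERY = """
SELECT CaseId, Year, Projection, PHP
FROM (SELECT CaseId, Year, Projection, PHP,
             ROW_NUMBER() OVER (PARTITION BY CaseId, Year
                                ORDER BY Projection + PHP DESC) AS seqnum
      FROM t) s
WHERE seqnum = 1
ORDER BY CaseId, Year;
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (CaseId INT, Year INT, Projection INT, PHP INT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS_2005)
picked = conn.execute(QUERY).fetchall()
conn.close()

for row in picked:
    print(row)
```

If the history row should win when PHP = 0, ordering on whether Projection equals PHP, rather than on their sum, is one way to express that.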