Duplicates from a SQL query

Time: 2014-03-27 13:47:01

Tags: sql sql-server tsql join

I have a dataset that is retrieved through several joins. I use SELECT DISTINCT in my statement, but I still see duplicates in the result set. Here is the code:

SELECT DISTINCT Account
, PayoffAmtDOL as 'Payoff Amount DOL'
, PayoffAmtLOG as 'Payoff Amount LOG'
, PayoffAmountLive as 'Payoff Amount Live'
, [Difference]
, PrincipalBalance as 'Principal Balance'
, CreationDate as 'Date Entered System'
, CACSState as 'CACS State at Entry'
, PaymentsMade AS 'Payments Made'
, TotalPaymentAmount as 'Total Payment Amount'
, 'Liquidation Percentage' = CASE WHEN PayoffAmountLive = 0 THEN 1
                            WHEN ISNULL([Difference],0) = ISNULL(PayoffAmtDOL, 0) THEN 1
                            WHEN ISNULL([Difference],0) < 0 AND ISNULL(PayoffAmtDOL, 0) > 0 THEN 0
                            WHEN ISNULL([Difference],0) > 0 AND ISNULL(PayoffAmtDOL, 0) < 0 THEN 1
                            WHEN ISNULL([Difference],0) > ISNULL(PayoffAmtDOL, 0) THEN 1
                            WHEN [Difference] > 0 AND ISNULL(PayoffAmtDOL, 0) = 0 THEN 1
                            WHEN ISNULL(PayoffAmtDOL, 0) = 0 THEN 0
                            ELSE ISNULL([Difference],0)/ISNULL(PayoffAmtDOL, 0) END
          , Cnt = 1
FROM 
(
SELECT DISTINCT a.Account,
       c.PayoffAmtDOL,
       c.PayoffAmtLOG,
       (ISNULL(c.PayoffAmtCACS, cacs.payoff_amt)) as 'PayoffAmountLive',
       (ISNULL(c.PayoffAmtDOL, 0) - (ISNULL(c.PayoffAmtCACS , ISNULL(cacs.payoff_amt, 0)))) as 'Difference',
       c.PrincipalBalance,
       c.CreationDate,
       c.CACSState,
       (SELECT COUNT(PaymentID)
        FROM tblATLPaymentInfo p
        WHERE p.AccountID = a.AccountID
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) >=  '1/1/2014'
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) <= '3/27/2014'
        ) as 'PaymentsMade',
         (SELECT SUM(PaymentAmount)
        FROM tblATLPaymentInfo p
        WHERE p.AccountID = a.AccountID
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) >= '1/1/2014'
          AND CONVERT(DATETIME, CONVERT(VARCHAR(10), p.CreationDate, 101)) <= '3/27/2014'
        ) as 'TotalPaymentAmount'

FROM tblATLAcctInfo a
RIGHT JOIN tblATLClaimInfo c
    ON c.AccountID = a.AccountID
LEFT JOIN SCFLOKYDCMSQL03.CACS_DM.dbo.Cacs_Info cacs
    ON cacs.Account = a.Account
WHERE CONVERT(DATETIME, CONVERT(VARCHAR(10), c.CreationDate, 101)) >= '1/1/2014'
    AND CONVERT(DATETIME, CONVERT(VARCHAR(10), c.CreationDate, 101)) <=  '3/27/2014'
    AND c.ClaimTypeID = (SELECT  DISTINCT ClaimTypeID FROM tblATLClaimType WHERE ClaimType = 'N02 - Claims')
) a
ORDER BY Account

Here is an example of the duplicate rows:

AccountID   DateEntered
123     01/19/2014
123     01/21/2014
345     02/1/2014
345     02/10/2014

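The only difference between the rows seems to be the date entered. Perhaps selecting ROW_NUMBER() and then removing the rows with the later dates would be a solution.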

2 answers:

Answer 0 (score: 1)

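DISTINCT should not return duplicate rows. Each returned row must differ from the others in at least one column, right? With character data you can sometimes be fooled by invisible differences, such as trailing spaces, but I am not sure that is the case here.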

Can you give an example of the duplicate rows?

OK, I see your edit. You have to choose which date to display. Try this to get the earliest date for each AccountID:

 SELECT AccountID, MIN(DateEntered) AS DateEntered
 FROM ....
 GROUP BY AccountID
 ORDER BY AccountID

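You can add more columns to the SELECT (and to the GROUP BY). As long as their values do not differ within an AccountID, you will not get more rows.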

If needed, you can add COUNT(*) to the SELECT to get the number of rows in each group.
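As a rough illustration of the grouping approach, here is a minimal sketch that runs against the four sample rows from the question. The table variable @Claims and the alias RowsInGroup are hypothetical names made up for this example; in the real query the derived table from the question would take the place of @Claims.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @Claims TABLE (AccountID INT, DateEntered DATE);

INSERT INTO @Claims (AccountID, DateEntered)
VALUES (123, '20140119'),
       (123, '20140121'),
       (345, '20140201'),
       (345, '20140210');

-- One row per account: the earliest date plus the number of rows collapsed.
SELECT AccountID,
       MIN(DateEntered) AS DateEntered,
       COUNT(*)         AS RowsInGroup   -- hypothetical alias
FROM @Claims
GROUP BY AccountID
ORDER BY AccountID;
```

With this sample data the query returns two rows, one for account 123 (earliest date 2014-01-19) and one for account 345 (earliest date 2014-02-01), each with a count of 2.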

Answer 1 (score: 1)

DISTINCT only eliminates rows that are exact duplicates, and DateEntered differs for each ID. If you want the latest one, use MAX(DateEntered).
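To make the point concrete, the sketch below compares DISTINCT with MAX on the sample rows from the question, and then shows the ROW_NUMBER() idea the asker mentioned. This is an assumption-laden illustration, not the asker's actual schema: the table variable @Claims, the State column and its values, and the names Ranked and rn are all hypothetical.

```sql
-- Hypothetical sample data; State and its values are invented for illustration.
DECLARE @Claims TABLE (AccountID INT, DateEntered DATE, State VARCHAR(10));

INSERT INTO @Claims (AccountID, DateEntered, State)
VALUES (123, '20140119', 'A'),
       (123, '20140121', 'B'),
       (345, '20140201', 'A'),
       (345, '20140210', 'A');

-- DISTINCT keeps all four rows, because no two rows match in every column.
SELECT DISTINCT AccountID, DateEntered
FROM @Claims;

-- MAX with GROUP BY returns one row per account, carrying the latest date.
SELECT AccountID, MAX(DateEntered) AS DateEntered
FROM @Claims
GROUP BY AccountID;

-- ROW_NUMBER() keeps the whole latest row, including non-aggregated columns.
WITH Ranked AS
(
    SELECT AccountID,
           DateEntered,
           State,
           ROW_NUMBER() OVER (PARTITION BY AccountID
                              ORDER BY DateEntered DESC) AS rn
    FROM @Claims
)
SELECT AccountID, DateEntered, State
FROM Ranked
WHERE rn = 1
ORDER BY AccountID;
```

The ROW_NUMBER() variant is worth considering when other columns must come from the same row as the chosen date. Changing ORDER BY DateEntered DESC to ASC would keep the earliest row instead.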