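Finding duplicates in an MS SQL table

Asked: 2015-10-12 08:51:11

Tags: sql sql-server

I know this question has been asked many times, but I still can't figure out why my query returns values that are not duplicates. I want my query to return only the records that share the same value in the Credit column. The query executes without any errors, but it also returns values that are not duplicated. This is my query: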

Select
  _bvGLTransactionsFull.AccountDesc,
  _bvGLAccountsFinancial.Description,
  _bvGLTransactionsFull.TxDate,
  _bvGLTransactionsFull.Description,
  _bvGLTransactionsFull.Credit,
  _bvGLTransactionsFull.Reference,
  _bvGLTransactionsFull.UserName
From
  _bvGLAccountsFinancial Inner Join
  _bvGLTransactionsFull On _bvGLAccountsFinancial.AccountLink =
    _bvGLTransactionsFull.AccountLink
Where
  _bvGLTransactionsFull.Credit 

IN   
  (SELECT Credit AS NumOccurrences
FROM  _bvGLTransactionsFull
GROUP BY Credit
HAVING (COUNT(Credit) > 1 ) )


Group By
  _bvGLTransactionsFull.AccountDesc, _bvGLAccountsFinancial.Description,
  _bvGLTransactionsFull.TxDate, _bvGLTransactionsFull.Description,
  _bvGLTransactionsFull.Credit, _bvGLTransactionsFull.Reference,
  _bvGLTransactionsFull.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
  IsNumeric(_bvGLTransactionsFull.Reference), _bvGLTransactionsFull.TrCode
Having
  _bvGLTransactionsFull.TxDate > 01 / 11 / 2014 And
  _bvGLTransactionsFull.Reference Like '5_____' And
  _bvGLTransactionsFull.Credit > 0.01 And
  _bvGLAccountsFinancial.Master_Sub_Account = '90210'

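2 Answers:

Answer 0: (score: 0)

That's because you are matching on Credit back to your table, which contains the duplicates. You need to isolate the duplicate rows with ROW_NUMBER: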

;WITH CTE AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY CREDIT ORDER BY (SELECT NULL)) AS RN
FROM _bvGLTransactionsFull)
Select
  CTE.AccountDesc,
  _bvGLAccountsFinancial.Description,
  CTE.TxDate,
  CTE.Description,
  CTE.Credit,
  CTE.Reference,
  CTE.UserName
From
  _bvGLAccountsFinancial Inner Join
  CTE On _bvGLAccountsFinancial.AccountLink = CTE.AccountLink
WHERE CTE.RN > 1
Group By
  CTE.AccountDesc, _bvGLAccountsFinancial.Description,
  CTE.TxDate, CTE.Description,
  CTE.Credit, CTE.Reference,
  CTE.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
  IsNumeric(CTE.Reference), CTE.TrCode
Having
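  CTE.TxDate > '20141101' And  -- assumes 01/11/2014 means 1 November 2014; unquoted, 01 / 11 / 2014 is integer division and evaluates to 0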
  CTE.Reference Like '5_____' And
  CTE.Credit > 0.01 And
  _bvGLAccountsFinancial.Master_Sub_Account = '90210'

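As a side note, I would consider using aliases to shorten the query and make it more readable. Prefixing every column in the join with the full table name is very hard to read.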
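To illustrate the alias suggestion, here is a minimal sketch of the same join with short aliases. The aliases `t` and `a` are my own choice, and the GROUP BY is omitted for brevity. The date literal assumes that 01/11/2014 was meant as 1 November 2014; written unquoted as `01 / 11 / 2014`, T-SQL treats it as integer division, which evaluates to 0 rather than a date.

```sql
-- Sketch only: aliases t and a are illustrative names, not part of the original query.
SELECT
  t.AccountDesc,
  a.Description,
  t.TxDate,
  t.Description,
  t.Credit,
  t.Reference,
  t.UserName
FROM _bvGLAccountsFinancial AS a
INNER JOIN _bvGLTransactionsFull AS t
  ON a.AccountLink = t.AccountLink
WHERE t.TxDate > '20141101'          -- assumption: 1 November 2014; use '20140111' for 11 January
  AND t.Reference LIKE '5_____'
  AND t.Credit > 0.01
  AND a.Master_Sub_Account = '90210';
```

Because none of these conditions uses an aggregate, they can sit in WHERE instead of HAVING, which filters rows before any grouping happens.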

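Answer 1: (score: 0)

I believe your code pulls all the data that matches your criteria. With that in mind, let me take a different approach and use your script "as is". First, let's store all the records in a temp table.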

Select
  _bvGLTransactionsFull.AccountDesc,
  _bvGLAccountsFinancial.Description,
  _bvGLTransactionsFull.TxDate,
  _bvGLTransactionsFull.Description,
  _bvGLTransactionsFull.Credit,
  _bvGLTransactionsFull.Reference,
  _bvGLTransactionsFull.UserName
-- temp table
INTO #tmpTable
From
  _bvGLAccountsFinancial Inner Join
  _bvGLTransactionsFull On _bvGLAccountsFinancial.AccountLink =
    _bvGLTransactionsFull.AccountLink
Where
  _bvGLTransactionsFull.Credit 

IN   
  (SELECT Credit AS NumOccurrences
FROM  _bvGLTransactionsFull
GROUP BY Credit
HAVING (COUNT(Credit) > 1 ) )


Group By
  _bvGLTransactionsFull.AccountDesc, _bvGLAccountsFinancial.Description,
  _bvGLTransactionsFull.TxDate, _bvGLTransactionsFull.Description,
  _bvGLTransactionsFull.Credit, _bvGLTransactionsFull.Reference,
  _bvGLTransactionsFull.UserName, _bvGLAccountsFinancial.Master_Sub_Account,
  IsNumeric(_bvGLTransactionsFull.Reference), _bvGLTransactionsFull.TrCode
Having
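  _bvGLTransactionsFull.TxDate > '20141101' And  -- assumes 01/11/2014 means 1 November 2014; unquoted, 01 / 11 / 2014 is integer division and evaluates to 0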
  _bvGLTransactionsFull.Reference Like '5_____' And
  _bvGLTransactionsFull.Credit > 0.01 And
  _bvGLAccountsFinancial.Master_Sub_Account = '90210'

Then remove the "single occurrence" data by creating a row index and dropping every row whose index is 1.

SELECT * FROM (
SELECT
  ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Credit) AS rowIdx
  , *
FROM #tmpTable) AS innerTmp
WHERE
  rowIdx != 1

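You can change the column in PARTITION BY <column name> to suit your needs. If you have any concerns, please raise them, because this is how I understand your situation so far.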
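As a rough illustration of how the PARTITION BY choice changes the row index, the sketch below numbers the same rows two ways. The table variable `@Tx` and its values are made up for the example and do not come from the question's data.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Tx TABLE (Reference varchar(10), TxDate date, Credit decimal(10,2));
INSERT INTO @Tx (Reference, TxDate, Credit) VALUES
  ('500001', '20141105', 100.00),
  ('500002', '20141105', 100.00),
  ('500003', '20141106', 250.00),
  ('500004', '20141107', 100.00);

-- Partition by Credit only: the three 100.00 rows are numbered 1, 2, 3
-- and the single 250.00 row is numbered 1.
SELECT Reference, TxDate, Credit,
       ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Reference) AS rowIdx
FROM @Tx;

-- Partition by Credit and TxDate: only 500001 and 500002 share a group,
-- so 500002 is the only row with rowIdx = 2; every other row gets 1.
SELECT Reference, TxDate, Credit,
       ROW_NUMBER() OVER (PARTITION BY Credit, TxDate ORDER BY Reference) AS rowIdx
FROM @Tx;
```

In both cases the filter rowIdx != 1 keeps only the second and later rows of each group, so the first occurrence of each duplicate is not returned.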

Edit: to include every row whose Credit is duplicated.

SELECT 
    tmp1.*
FROM #tmpTable tmp1
RIGHT JOIN (
       SELECT 
           Credit 
       FROM (
           SELECT
              ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Credit) AS rowIdx
              , *
           FROM #tmpTable) AS innerTmp
           WHERE
              rowIdx != 1
       ) AS tmp2
 ON tmp1.Credit = tmp2.Credit
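One thing to watch with the join above: when a Credit value occurs three or more times, the derived table contains that value more than once, so each matching row comes back several times. A possible alternative, sketched here on made-up data rather than on #tmpTable, is to count the rows per Credit with COUNT(*) OVER and keep the groups larger than one. The table variable `@Tx` and the column name `occurrences` are assumptions for the example.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Tx TABLE (Reference varchar(10), Credit decimal(10,2));
INSERT INTO @Tx (Reference, Credit) VALUES
  ('500001', 100.00),
  ('500002', 100.00),
  ('500003', 250.00),
  ('500004', 100.00);

-- Keep every row whose Credit value occurs more than once.
-- Each of 500001, 500002 and 500004 is returned exactly once; 500003 is excluded.
SELECT Reference, Credit
FROM (
  SELECT Reference, Credit,
         COUNT(*) OVER (PARTITION BY Credit) AS occurrences
  FROM @Tx
) AS counted
WHERE occurrences > 1;
```

The same pattern should apply to #tmpTable by replacing @Tx and selecting the columns you need.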