Return a duplicate count for records

Date: 2018-10-02 13:19:35

Tags: sql sql-server

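How can I add a count that shows how many duplicates were found according to the rules in my SQL? I am trying to find duplicate records, where a record counts as a duplicate only if both its country of incorporation and its company registration number are repeated. So I only want duplicates where the GtId is unique for each duplicate, and where the count for the combination of country of incorporation and company registration number is greater than 1.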

    select distinct top 10000
        wp1.GtId,
        wp1.CrmPartyId,
        wp1.LegalName,
        wp1.BusinessClass,
        wp1.RmFullName,
        wp1.PbeFullName,
        wp1.OverallClientStatus,
        wp1.OverallRpStatus,
        wp1.FirstName,
        wp1.LastName,
        wp1.CompanyRegNum,
        wp1.CountryInc,
        wp2.GtId,
        wp2.CrmPartyId,
        wp2.LegalName,
        wp2.BusinessClass,
        wp2.RmFullName,
        wp2.PbeFullName,
        wp2.OverallClientStatus,
        wp2.OverallRpStatus,
        wp2.FirstName,
        wp2.LastName,
        wp2.CompanyRegNum,
        wp2.CountryInc
    from CORE.WeccoParty wp1
    join CORE.WeccoParty wp2
        on  wp1.CompanyRegNum = wp2.CompanyRegNum
        and wp1.CountryInc    = wp2.CountryInc
        and wp1.GtId         <> wp2.GtId
    where wp1.CompanyRegNum is not null
      and wp1.OverallClientStatus = 'Onboarded' and wp2.OverallClientStatus = 'Onboarded'
      and wp1.OverallRpStatus = 'Onboarded' and wp2.OverallRpStatus = 'Onboarded'
      and lower(wp1.CompanyRegNum) NOT IN
          ('0','.','n.a','n/a','n.a.','00000','unknown','Unknown','000000','00000000')
      and wp1.CompanyRegNum NOT LIKE('^0*0$')
      and wp1.CountryInc is not null

2 Answers:

Answer 0 (score: 0)

Use a window function with OVER(), and put the rule that defines your duplicates in PARTITION BY:

    duplicate_count = count(*) OVER (PARTITION BY wp2.CompanyRegNum, wp2.CountryInc)
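One thing to watch: a window function is evaluated before DISTINCT, so on the self-joined result `count(*)` counts joined rows rather than parties (a group of n parties produces n(n-1) joined rows). The following is only a sketch of one way around that, not the asker's final query. It assumes GtId is unique per row in CORE.WeccoParty, which the question implies but does not state, and it drops the self-join so each party is counted once. The CTE name `candidates`, the alias `wp`, and the shortened column list are illustrative.

```sql
-- Sketch: count parties per (CompanyRegNum, CountryInc) without the self-join.
-- Assumption: GtId is unique per row in CORE.WeccoParty.
WITH candidates AS (
    SELECT wp.GtId,
           wp.LegalName,
           wp.CompanyRegNum,
           wp.CountryInc,
           -- The window is computed after the WHERE filter, so only
           -- onboarded, non-placeholder rows are counted.
           COUNT(*) OVER (PARTITION BY wp.CompanyRegNum, wp.CountryInc) AS duplicate_count
    FROM CORE.WeccoParty wp
    WHERE wp.CompanyRegNum IS NOT NULL
      AND wp.CountryInc IS NOT NULL
      AND wp.OverallClientStatus = 'Onboarded'
      AND wp.OverallRpStatus = 'Onboarded'
      AND LOWER(wp.CompanyRegNum) NOT IN
          ('0','.','n.a','n/a','n.a.','00000','unknown','000000','00000000')
)
SELECT GtId, LegalName, CompanyRegNum, CountryInc, duplicate_count
FROM candidates
WHERE duplicate_count > 1          -- keep only groups with at least two parties
ORDER BY CompanyRegNum, CountryInc, GtId;
```

A window function cannot be referenced in the WHERE clause of the same query level, which is why the count is computed in a CTE and filtered in the outer SELECT.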

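Answer 1 (score: 0)

Use the RANK function or COUNT.

Below are two examples of how to get what you want, if I have understood you correctly. This is useful for finding combinations/duplicates that exist in a table.

Using RANK: if rn > 1, the combination occurs more than once (with different GtId values).

You can put this in your query, and if you wrap the query as an inner query you can simply select all rows with rn > 1 to find the combinations you want. But keep in mind that a row can still have rn = 1 even though its combination has duplicates.

    RANK() OVER (PARTITION BY wp1.CountryInc, wp2.CompanyRegNum ORDER BY wp2.GtId DESC) as rn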

    select * from
        (select wp1.CountryInc, wp2.CompanyRegNum, wp2.GtId,
                -- number the joined rows within each (CountryInc, CompanyRegNum) group
                ROW_NUMBER() OVER (PARTITION BY wp1.CountryInc, wp2.CompanyRegNum ORDER BY wp2.GtId) as rn
         from CORE.WeccoParty wp1
         join CORE.WeccoParty wp2
             on  wp1.CompanyRegNum = wp2.CompanyRegNum
             and wp1.CountryInc    = wp2.CountryInc
             and wp1.GtId         <> wp2.GtId
         where wp1.CompanyRegNum is not null
           and wp1.OverallClientStatus = 'Onboarded' and wp2.OverallClientStatus = 'Onboarded'
           and wp1.OverallRpStatus = 'Onboarded' and wp2.OverallRpStatus = 'Onboarded'
           and lower(wp1.CompanyRegNum) NOT IN
               ('0','.','n.a','n/a','n.a.','00000','unknown','Unknown','000000','00000000')
           and wp1.CompanyRegNum NOT LIKE('^0*0$')
           and wp1.CountryInc is not null) q
    where q.rn > 1
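To make the caveat about rn = 1 concrete, here is a small self-contained sketch. The rows in `demo_rows` are made up for illustration and have nothing to do with the real CORE.WeccoParty data; the `group_size` column is an addition that is not part of the answer above.

```sql
-- Hypothetical sample rows; names and values are illustrative only.
WITH demo_rows (GtId, CountryInc, CompanyRegNum) AS (
    SELECT * FROM (VALUES
        (1, 'GB', '12345'),
        (2, 'GB', '12345'),
        (3, 'GB', '12345'),
        (4, 'DE', '99999')
    ) AS v (GtId, CountryInc, CompanyRegNum)
)
SELECT GtId, CountryInc, CompanyRegNum,
       -- position of the row inside its (CountryInc, CompanyRegNum) group
       ROW_NUMBER() OVER (PARTITION BY CountryInc, CompanyRegNum ORDER BY GtId) AS rn,
       -- total number of rows in that group
       COUNT(*)     OVER (PARTITION BY CountryInc, CompanyRegNum)               AS group_size
FROM demo_rows
ORDER BY CountryInc, CompanyRegNum, GtId;
```

GtId 1 gets rn = 1 even though its group holds three rows, so filtering on rn > 1 would drop it. Filtering on group_size > 1 in an outer query would keep every member of a duplicated group.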

Using COUNT:

    select wp1.CountryInc, wp2.CompanyRegNum, COUNT(DISTINCT wp2.GtId) as NoOfDuplicates
    from CORE.WeccoParty wp1
    join CORE.WeccoParty wp2
        on  wp1.CompanyRegNum = wp2.CompanyRegNum
        and wp1.CountryInc    = wp2.CountryInc
        and wp1.GtId         <> wp2.GtId
    where wp1.CompanyRegNum is not null
      and wp1.OverallClientStatus = 'Onboarded' and wp2.OverallClientStatus = 'Onboarded'
      and wp1.OverallRpStatus = 'Onboarded' and wp2.OverallRpStatus = 'Onboarded'
      and lower(wp1.CompanyRegNum) NOT IN
          ('0','.','n.a','n/a','n.a.','00000','unknown','Unknown','000000','00000000')
      and wp1.CompanyRegNum NOT LIKE('^0*0$')
      and wp1.CountryInc is not null
    group by wp1.CountryInc, wp2.CompanyRegNum
    having COUNT(DISTINCT wp2.GtId) > 1
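A side note on the filter that all of these queries share: T-SQL LIKE does not understand regular-expression syntax, so `'^0*0$'` is compared as a literal string and will not catch values such as `0000`. Assuming the intent was to exclude registration numbers made up only of zeros (a guess, since the question does not say so), a negated character class can express that. The sample values below are hypothetical.

```sql
-- Sketch: LIKE '%[^0]%' is true when the value contains at least one
-- character other than '0', so all-zero strings fall into the ELSE branch.
SELECT v.CompanyRegNum,
       CASE WHEN v.CompanyRegNum LIKE '%[^0]%' THEN 'keep' ELSE 'exclude' END AS verdict
FROM (VALUES ('0'), ('00000'), ('0012300'), ('AB-17')) AS v (CompanyRegNum);
```

Here `'0'` and `'00000'` come back as exclude, while `'0012300'` and `'AB-17'` come back as keep. In the queries above the predicate would become `and wp1.CompanyRegNum LIKE '%[^0]%'`.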