How to select consecutive duplicate rows in SQL Server

Time: 2017-04-04 20:46:21

Tags: sql sql-server tsql gaps-and-islands

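I want to select duplicate entries from a SQL Server table, but only if the ids are consecutive.

I have been trying to adapt this answer to my needs, but I can't get it to work.

That answer is for Oracle, but I see that SQL Server also has the LEAD and LAG functions.

Also, I think that answer puts a * next to the duplicates, whereas I only want to select the duplicates.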

select 
    id, companyName, 
    case 
       when companyName in (prev, next) 
          then '*' 
    end match, 
    prev, 
    next 
from 
    (select
         id,
         companyName,
         lag(companyName, 1) over (order by id) prev,
         lead(companyName, 1) over (order by id) next
     from
         companies)
order by
    id;

Example

So from this dataset:

id      companyName
-------------------    
1       dogs ltd
2       cats ltd
3       pigs ltd
4       pigs ltd
5       cats ltd
6       cats ltd
7       dogs ltd
8       pigs ltd

I want to select:

id      companyName
-------------------    
3       pigs ltd
4       pigs ltd
5       cats ltd
6       cats ltd

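Update

Time and again I am amazed by the number and quality of the answers I get. This is one of those times. I don't have the expertise to judge one answer as better than another, so I went with SqlZim's, because it was the first working answer I saw. But it is great to see the different approaches, especially since only an hour ago I was wondering "is this even possible?".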

5 answers:

Answer 0 (score: 5)

You are very close to what you want:

select id, companyName
from (select c.*,
             lag(companyName, 1) over (order by id) prev,
             lead(companyName, 1) over (order by id) next
      from companies c
     ) a
where CompanyName in (prev, next)
order by id;
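
To see why that filter works, it can help to look at what the inner query produces on its own. The following is a minimal sketch, not part of the original answer: the table variable @companies is a hypothetical stand-in for the companies table, loaded with the sample data from the question, and the aliases prev_name and next_name are my own names for the prev and next columns.

```sql
-- Hypothetical stand-in for the companies table (sample data from the question)
DECLARE @companies TABLE (id int NOT NULL PRIMARY KEY, companyName varchar(16) NOT NULL);

INSERT INTO @companies (id, companyName) VALUES
(1, 'dogs ltd'), (2, 'cats ltd'), (3, 'pigs ltd'), (4, 'pigs ltd'),
(5, 'cats ltd'), (6, 'cats ltd'), (7, 'dogs ltd'), (8, 'pigs ltd');

-- Inner query only: each row next to the names of its neighbours in id order
SELECT id,
       companyName,
       LAG(companyName, 1)  OVER (ORDER BY id) AS prev_name,
       LEAD(companyName, 1) OVER (ORDER BY id) AS next_name
FROM @companies
ORDER BY id;

-- id  companyName  prev_name  next_name
-- 1   dogs ltd     NULL       cats ltd
-- 2   cats ltd     dogs ltd   pigs ltd
-- 3   pigs ltd     cats ltd   pigs ltd
-- 4   pigs ltd     pigs ltd   cats ltd
-- 5   cats ltd     pigs ltd   cats ltd
-- 6   cats ltd     cats ltd   dogs ltd
-- 7   dogs ltd     cats ltd   pigs ltd
-- 8   pigs ltd     dogs ltd   NULL
```

Only rows 3 to 6 have a companyName equal to one of their neighbours, which is exactly the set the outer WHERE clause keeps. The NULL neighbours of the first and last rows never compare equal, so those rows are kept only if their other neighbour matches.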

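Answer 1 (score: 2)

This is a gaps-and-islands style problem, but instead of using two row_number() calls, we use id and a single row_number() in the innermost subquery to compute grp. count() over() then counts the rows in each grp, and finally we return the rows where cnt > 1.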

select id, companyname 
from (
  select 
      id
    , companyName
    , grp
    , cnt = count(*) over (partition by companyname, grp)
  from (
    select *
      , grp = id - row_number() over (partition by companyname order by id)
    from
      companies
    ) islands
  ) d
where cnt  > 1
order by id

rextester demo: http://rextester.com/ACP73683

Returns:

+----+-------------+
| id | companyname |
+----+-------------+
|  3 | pigs ltd    |
|  4 | pigs ltd    |
|  5 | cats ltd    |
|  6 | cats ltd    |
+----+-------------+
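
The grouping step may be easier to follow with the intermediate values visible. This is a sketch under stated assumptions rather than part of the original answer: @companies is a hypothetical table variable holding the question's sample data, and the rn column is added purely for illustration.

```sql
-- Hypothetical stand-in for the companies table (sample data from the question)
DECLARE @companies TABLE (id int NOT NULL PRIMARY KEY, companyName varchar(16) NOT NULL);

INSERT INTO @companies (id, companyName) VALUES
(1, 'dogs ltd'), (2, 'cats ltd'), (3, 'pigs ltd'), (4, 'pigs ltd'),
(5, 'cats ltd'), (6, 'cats ltd'), (7, 'dogs ltd'), (8, 'pigs ltd');

-- Innermost step only: rn numbers the rows of each company in id order,
-- and grp = id - rn stays constant while ids of the same company are consecutive
SELECT id,
       companyName,
       ROW_NUMBER() OVER (PARTITION BY companyName ORDER BY id)      AS rn,
       id - ROW_NUMBER() OVER (PARTITION BY companyName ORDER BY id) AS grp
FROM @companies
ORDER BY id;

-- id  companyName  rn  grp
-- 1   dogs ltd     1   0
-- 2   cats ltd     1   1
-- 3   pigs ltd     1   2
-- 4   pigs ltd     2   2
-- 5   cats ltd     2   3
-- 6   cats ltd     3   3
-- 7   dogs ltd     2   5
-- 8   pigs ltd     3   5
```

Within a company, id and rn both increase by one along a run of consecutive ids, so their difference does not change. Counting rows per (companyName, grp) therefore gives 2 for pigs ltd/2 and cats ltd/3 and 1 for every other pair, which is why only ids 3 to 6 survive the cnt > 1 filter. Note that rows 7 and 8 share grp = 5 but belong to different companies, so partitioning by companyName as well as grp matters.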

Answer 2 (score: 2)

In the WHERE clause, you only need to keep the rows whose company name is the same as the previous or the next one:

select id, companyName
from (
   select id, companyName,
   lag(companyName, 1) over (order by id) as prev,
   lead(companyName, 1) over (order by id) as next
   from companies
 ) q
 where companyName in (prev, next)
 order by id;

To make sure there really is no gap between the ids, you can do this:

select id, companyName
from (
   select id, companyName,
   lag(concat(id+1,companyName), 1) over (order by id) as prev,
   lead(concat(id-1,companyName), 1) over (order by id) as next
   from companies
 ) q
 where concat(id,companyName) in (prev, next)
 order by id;
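
A variation on the gap check above, offered as a sketch rather than as the answer's own method: compare the neighbouring id and name as separate columns instead of concatenating them into one string. This avoids any ambiguity when a company name happens to start with a digit. The table variable @companies, the extra row with id 10, and the aliases prev_id, prev_name, next_id and next_name are all assumptions made for illustration.

```sql
-- Hypothetical stand-in for the companies table (sample data from the question)
DECLARE @companies TABLE (id int NOT NULL PRIMARY KEY, companyName varchar(16) NOT NULL);

INSERT INTO @companies (id, companyName) VALUES
(1, 'dogs ltd'), (2, 'cats ltd'), (3, 'pigs ltd'), (4, 'pigs ltd'),
(5, 'cats ltd'), (6, 'cats ltd'), (7, 'dogs ltd'), (8, 'pigs ltd'),
(10, 'pigs ltd');  -- hypothetical extra row: same name as id 8, but id 9 is missing

SELECT id, companyName
FROM (
    SELECT id,
           companyName,
           LAG(id, 1)           OVER (ORDER BY id) AS prev_id,
           LAG(companyName, 1)  OVER (ORDER BY id) AS prev_name,
           LEAD(id, 1)          OVER (ORDER BY id) AS next_id,
           LEAD(companyName, 1) OVER (ORDER BY id) AS next_name
    FROM @companies
) q
-- A row qualifies only if a neighbour has the same name AND an adjacent id
WHERE (prev_id = id - 1 AND prev_name = companyName)
   OR (next_id = id + 1 AND next_name = companyName)
ORDER BY id;

-- id  companyName
-- 3   pigs ltd
-- 4   pigs ltd
-- 5   cats ltd
-- 6   cats ltd
```

With this data, a name-only comparison would also return ids 8 and 10, because they are neighbours in id order and share a name. The id condition excludes them because 10 is not 8 + 1.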

Answer 3 (score: 2)

Another alternative form, using LEAD() and LAG() (SQL Server 2012 and later):

SELECT id, CompanyName
FROM (
    SELECT *,
        LEAD(CompanyName, 1) OVER(ORDER BY id) as nc,
        LAG(CompanyName, 1) OVER(ORDER BY id) AS pc
    FROM #t t
    ) x
WHERE nc = companyName
    OR pc = companyName

Here is the test data, so you can check it yourself.

CREATE TABLE #T (id int not null PRIMARY KEY, companyName varchar(16) not null)

INSERT INTO #t Values 
(1,       'dogs ltd'),
(2,       'cats ltd'),
(3,       'pigs ltd'),
(4,       'pigs ltd'),
(5,       'cats ltd'),
(6,       'cats ltd'),
(7,       'dogs ltd'),
(8,       'pigs ltd')

Answer 4 (score: 1)

You can use Row_Number() with PARTITION BY to get the duplicates:

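;with cte as (
SELECT id, companyName,
    -- partition by companyName (not id, which is assumed unique) so repeated names get RowN 2, 3, ...
    RowN = Row_Number() over (partition by companyName order by id) from #yourTable
    )
    -- note: this returns every repeat of a name, whether or not the ids are consecutive
    Select * from cte where RowN > 1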

Could you provide the input and the expected output so this query can be verified?