Find groups with matching rows

Time: 2017-03-09 16:39:32

Tags: sql sql-server

I have a table of people (CarOwners) and the types of cars they own

+-------+-------+
| Name  | Model |
+-------+-------+
| Bob   | Camry |
| Bob   | Civic |
| Bob   | Prius |
| Kevin | Civic |
| Kevin | Focus |
| Mark  | Civic |
| Lisa  | Focus |
| Lisa  | Civic |
+-------+-------+

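Given a name, how can I find other people who own exactly the same set of cars? For example, if I target Mark, nobody else owns only a Civic, so the query returns nothing. If I target Lisa, the query returns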

+-------+-------+
| Name  | Model |
+-------+-------+
| Kevin | Civic |
| Kevin | Focus |
+-------+-------+

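because Kevin owns exactly the same cars as Lisa. If I target Kevin, the query returns Lisa.

I created a cte containing my target person's cars, but I'm not sure how to implement the "exact match" requirement. All of my attempts return subset matches.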

with LisaCars as (
    SELECT Model FROM CarOwners WHERE Name = 'Lisa'
)
SELECT Name, Model
FROM CarOwners
WHERE Model in (SELECT * FROM LisaCars) AND Name != 'Lisa'

This query returns everyone who owns a Civic or a Focus, which is not what I'm looking for.

+-------+-------+
| Name  | Model |
+-------+-------+
| Bob   | Civic |
| Kevin | Civic |
| Kevin | Focus |
| Mark  | Civic |
+-------+-------+

5 answers:

Answer 0: (score: 1)

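Use a common table expression (cte) with count() over() to count the number of rows each name has.

The matches cte then self-joins cte where the names differ, the models match, the per-name model counts match, and one of the names is 'Lisa'. The having clause ensures that the number of matched rows (count(*)) equals the number of models that name has.

matches by itself returns only the name of each matching person, so we join back to the source table t to get the full list of models for each match.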

;with cte as (
  select *
    , cnt = count(*) over (partition by name)
  from t
)
, matches as (
  select x2.name
  from cte as x
    inner join cte as x2
       on x.name <> x2.name
      and x.model = x2.model
      and x.cnt   = x2.cnt
      and x.name  = 'Lisa'
  group by x2.name, x.cnt
  having count(*) = x.cnt
)
select t.*
from t
  inner join matches m
    on t.name = m.name

rextester demo: http://rextester.com/SUKP78304

returns:

+-------+-------+
| name  | model |
+-------+-------+
| Kevin | Civic |
| Kevin | Focus |
+-------+-------+

The same query can be written without ctes, but that makes it harder to follow.
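As a rough illustration of the count-and-self-join idea written without ctes, here is a minimal Python sketch that runs portable SQL against an in-memory SQLite database. This is not the answerer's own CTE-free query and it is not T-SQL: the helper names `make_db` and `find_exact_matches` are hypothetical, the table is called `t` as in the answer, and the query assumes there are no duplicate (name, model) rows.

```python
import sqlite3

# Sample data from the question.
ROWS = [
    ("Bob", "Camry"), ("Bob", "Civic"), ("Bob", "Prius"),
    ("Kevin", "Civic"), ("Kevin", "Focus"),
    ("Mark", "Civic"),
    ("Lisa", "Focus"), ("Lisa", "Civic"),
]

# An owner matches when the number of models shared with the target equals
# both the target's model count and the owner's own model count.
QUERY = """
SELECT o.name, o.model
FROM t AS o
WHERE o.name IN (
    SELECT other.name
    FROM t AS target
    JOIN t AS other
      ON other.model = target.model
     AND other.name <> target.name
    WHERE target.name = :name
    GROUP BY other.name
    HAVING COUNT(*) = (SELECT COUNT(*) FROM t AS a WHERE a.name = :name)
       AND COUNT(*) = (SELECT COUNT(*) FROM t AS b WHERE b.name = other.name)
)
ORDER BY o.name, o.model
"""

def make_db():
    """Create an in-memory database holding the sample rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT, model TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    return conn

def find_exact_matches(conn, name):
    """Return (name, model) rows for owners whose car set equals the target's."""
    return conn.execute(QUERY, {"name": name}).fetchall()
```

Calling `find_exact_matches(make_db(), "Lisa")` should give Kevin's two rows, and targeting Mark should give an empty list, matching the behaviour the question asks for.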

Answer 1: (score: 0)

One way to do this is to compare the ordered, concatenated model values for each name.

with cte as (
select name,model,
     STUFF((
          SELECT ',' + t2.model
          FROM t t2
          WHERE t1.name=t2.name
          ORDER BY model
          FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '') concat_value
from t t1 
) 
select distinct x2.name,x2.model
from cte x1
join cte x2 on x1.concat_value=x2.concat_value and x1.name<>x2.name
where x1.name='Kevin'

If your version of SQL Server supports STRING_AGG (SQL Server 2017 and later), the query can be simplified to

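with cte as (
    -- STRING_AGG is an aggregate, so build one concatenated value per name
    select name,
         STRING_AGG(model,',') WITHIN GROUP(ORDER BY model) as concat_value
    from t
    group by name
    )
select t.name,t.model
from cte x1
join cte x2 on x1.concat_value=x2.concat_value and x1.name<>x2.name
-- join back to the source table to list every model of each matching name
join t on t.name=x2.name
where x1.name='Kevin'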
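The idea of comparing ordered, concatenated model values can also be sketched outside the database. The following is a minimal Python illustration, not part of the original answer; the names `CAR_OWNERS`, `signatures`, and `owners_with_same_cars` are hypothetical, and it assumes model names contain no commas (otherwise two different car sets could produce the same string).

```python
from collections import defaultdict

# Sample data from the question.
CAR_OWNERS = [
    ("Bob", "Camry"), ("Bob", "Civic"), ("Bob", "Prius"),
    ("Kevin", "Civic"), ("Kevin", "Focus"),
    ("Mark", "Civic"),
    ("Lisa", "Focus"), ("Lisa", "Civic"),
]

def signatures(rows):
    """Map each owner to a comma-joined string of their models in sorted order."""
    models = defaultdict(list)
    for name, model in rows:
        models[name].append(model)
    return {name: ",".join(sorted(owned)) for name, owned in models.items()}

def owners_with_same_cars(rows, target):
    """Return (name, model) rows of other owners whose signature equals the target's."""
    sig = signatures(rows)
    if target not in sig:
        return []
    return sorted(
        (name, model)
        for name, model in rows
        if name != target and sig[name] == sig[target]
    )
```

Sorting before joining is what makes the comparison order-independent: Lisa's rows arrive as Focus, Civic and Kevin's as Civic, Focus, yet both produce the signature `Civic,Focus`.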

Answer 2: (score: 0)

Try this

if object_id('tempdb.dbo.#temp') is not null
drop table #temp

create table #temp (name varchar(100),model varchar(100))

insert into #temp values('Bob','Camry')
insert into #temp values('Bob','Civic')
insert into #temp values('Bob','Prius')
insert into #temp values('Kevin','Focus')
insert into #temp values('Kevin','Civic')
insert into #temp values('Mark','Civic')
insert into #temp values('Lisa','Focus')
insert into #temp values('Lisa','Civic')

select * from (
select row_number() over(partition by name order by (select null)) as n,
row_number() over(partition by model order by (select null)) as m,*
from #temp) as a
where  n = m
order by name

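Answer 3: (score: 0)

Since you want the match to be exact, we should add the number of cars each person owns as an additional field. Assuming your table is named '#owners', the following query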

select  *
        , (select COUNT(*)
            from #owners o2
            where o2.name = o1.name) as num
    from #owners o1

gives us the table

+-------+-------+-----+
| Name  | Model | num |
+-------+-------+-----+
| Bob   | Camry | 3   |
| Bob   | Civic | 3   |
| Bob   | Prius | 3   | 
| Kevin | Civic | 2   |
| Kevin | Focus | 2   |
| Mark  | Civic | 1   |
| Lisa  | Focus | 2   |
| Lisa  | Civic | 2   |
+-------+-------+-----+

Next, we join this table to itself on matching model and count. We use a CTE to make it easier to read. The following query

; with
    OwnedCount as (
        select  *
                , (select COUNT(*)
                    from #owners o2
                    where o2.name = o1.name) as num
            from #owners o1
    )
select *
    from OwnedCount o1
    inner join OwnedCount o2
        on o1.model = o2.model 
        and o1.num = o2.num

gives us this table

+-------+-------+-----+-------+-------+-----+
| Name  | Model | num | Name  | Model | num |
+-------+-------+-----+-------+-------+-----+
| Bob   | Camry | 3   | Bob   | Camry | 3   |
| Bob   | Civic | 3   | Bob   | Civic | 3   |
| Bob   | Prius | 3   | Bob   | Prius | 3   |
| Kevin | Civic | 2   | Kevin | Civic | 2   |
| Kevin | Civic | 2   | Lisa  | Civic | 2   |
| Kevin | Focus | 2   | Kevin | Focus | 2   |
| Kevin | Focus | 2   | Lisa  | Focus | 2   |
| Mark  | Civic | 1   | Mark  | Civic | 1   |
| Lisa  | Civic | 2   | Kevin | Civic | 2   |
| Lisa  | Civic | 2   | Lisa  | Civic | 2   |
| Lisa  | Focus | 2   | Kevin | Focus | 2   |
| Lisa  | Focus | 2   | Lisa  | Focus | 2   |
+-------+-------+-----+-------+-------+-----+

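Finally, filter the results by the given name and keep only the owners for whom every model matched, that is, the number of matched rows equals num. Without that check, someone who owns the same number of cars but shares only some of the models would still be returned.

declare @given_name varchar(32) = 'Lisa'
; with
    OwnedCount as (
        select  *
                , (select COUNT(*)
                    from #owners o2
                    where o2.name = o1.name) as num
            from #owners o1
    )
    , Matched as (
        select  o2.name, o2.model, o2.num
                -- number of models this owner shares with the given name
                , COUNT(*) over (partition by o2.name) as matched
            from OwnedCount o1
            inner join OwnedCount o2
                on o1.model = o2.model
                and o1.num = o2.num
            where o1.name = @given_name
                and o2.name <> @given_name
    )
select name, model
    from Matched
    where matched = num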
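To see how joining on model and count alone differs from requiring every model to match, here is a small Python sketch. It is only an illustration under stated assumptions: "Dana" is a hypothetical extra owner who is not in the original data, and the function names are made up for this example.

```python
from collections import defaultdict

# Sample data from the question plus one hypothetical owner, Dana,
# who owns two cars but shares only the Civic with Lisa.
ROWS = [
    ("Bob", "Camry"), ("Bob", "Civic"), ("Bob", "Prius"),
    ("Kevin", "Civic"), ("Kevin", "Focus"),
    ("Mark", "Civic"),
    ("Lisa", "Focus"), ("Lisa", "Civic"),
    ("Dana", "Civic"), ("Dana", "Camry"),  # hypothetical, not in the question
]

def cars_by_owner(rows):
    """Group models into a set per owner."""
    owned = defaultdict(set)
    for name, model in rows:
        owned[name].add(model)
    return dict(owned)

def same_count_shared_model(rows, target):
    """Row-wise match on model and car count only, with no completeness check."""
    owned = cars_by_owner(rows)
    return sorted(
        (name, model)
        for name, model in rows
        if name != target
        and model in owned[target]
        and len(owned[name]) == len(owned[target])
    )

def exact_match(rows, target):
    """Keep only owners whose whole set of models equals the target's."""
    owned = cars_by_owner(rows)
    return sorted(
        (name, model)
        for name, model in rows
        if name != target and owned[name] == owned[target]
    )
```

With this data, `same_count_shared_model(ROWS, "Lisa")` includes Dana's Civic row alongside Kevin's rows, while `exact_match(ROWS, "Lisa")` returns only Kevin's rows. On the question's original eight rows the two functions agree, which is why the difference is easy to miss.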

Answer 4: (score: 0)

Try this. I think it can be done easily with short code, using just a partition function.

    declare @t table(Name varchar(50),Model varchar(50))
    insert into @t values
    ('Bob','Camry')
    ,('Bob','Civic')
    ,('Bob','Prius')
    ,('Kevin','Civic')
    ,('Kevin','Focus')
    ,('Mark','Civic')
    ,('Lisa','Focus')
    ,('Lisa','Civic')

    declare @input varchar(50)='Lisa'

    ;with
    CTE1 as
    (
        select name, model, ROW_NUMBER() over (order by name) rn
        from @t
        where name = @input
    )
    , cte2 as
    (
        select t.name, t.Model
            , ROW_NUMBER() over (partition by t.name order by t.name) rn3
        from @t t
        inner join cte1 c on t.Model = c.model
        where t.Name != @input
    )
    select * from cte2 c
    where exists (select rn3 from cte2 c1
                  where c1.name = c.name and c1.rn3 = (select max(rn) from cte1))