Given the following table:
CREATE TABLE T1 (
A varchar(2),
B varchar(2)
);
INSERT INTO T1 VALUES
('aa', 'm'), ('aa', 'n'),
('bb', 'n'), ('bb', 'o'),
('cc', 'n'), ('cc', 'o'),
('dd', 'c'), ('dd', 'a'), ('dd', 'r'),
('ee', 'a'), ('ee', 'c'), ('ee', 'r');
A | B
----+----
aa | m
aa | n
bb | n
bb | o
cc | n
cc | o
dd | c
dd | a
dd | r
ee | a
ee | c
ee | r
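How can I select and group the values in A that match on all of their corresponding values in B? For example, bb and cc form one group because both contain exactly 'n' and 'o'.
So the result would be: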
Group | A
------+----
1     | bb
1     | cc
2     | dd
2     | ee
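Answer 0 (score: 2)
Here is one way to do it. It first calculates the matching "sets", where a set is a pair of A groups whose B values match. It then calculates the "head", the lowest A within each set. Using dense_rank, you can number the heads, and then join back to the list of sets to produce a list of all set members.
Query at SE Data.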
; with groups as
(
    select distinct A
    from T1
)
, vals as
(
    select distinct B
    from T1
)
, sets as
(
    -- pairs of A values (g1 < g2) that agree on every B value
    select g1.A as g1
    ,      g2.A as g2
    from groups g1
    join groups g2
    on g1.A < g2.A
    cross join
        vals v
    left join
        T1 v1
    on g1.A = v1.A
    and v.B = v1.B
    left join
        T1 v2
    on g2.A = v2.A
    and v.B = v2.B
    group by
        g1.A
    ,   g2.A
    having count(case when isnull(v1.B,'') <> isnull(v2.B,'') then 1 end) = 0
)
, heads as
(
    -- head = lowest A that is paired with the same g2
    select s1.g1
    ,      s1.g2
    ,      head.head
    from sets s1
    cross apply
    (
        select min(g1) as head
        from sets s2
        where s1.g2 = s2.g2
    ) head
)
select distinct dense_rank() over (order by h.head) as [Group]
,      g.g as A
from (
    select distinct head
    from heads
) h
left join
(
    select g1 as g
    ,      head
    from heads
    union all
    select g2
    ,      head
    from heads
) g
on h.head = g.head
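The steps this answer describes (matching pairs, then heads, then dense_rank numbering) can also be traced outside the database. The following is only a minimal Python sketch of the same idea, not the answer's own code; the function name `group_matching` and the in-memory `rows` list are assumptions made for illustration.

```python
from itertools import combinations

# Same rows as the T1 sample data in the question.
rows = [
    ("aa", "m"), ("aa", "n"),
    ("bb", "n"), ("bb", "o"),
    ("cc", "n"), ("cc", "o"),
    ("dd", "c"), ("dd", "a"), ("dd", "r"),
    ("ee", "a"), ("ee", "c"), ("ee", "r"),
]

def group_matching(rows):
    # Collect the set of B values for each A.
    b_values = {}
    for a, b in rows:
        b_values.setdefault(a, set()).add(b)

    # "sets": pairs (g1, g2) with g1 < g2 whose B values are identical.
    pairs = [(g1, g2) for g1, g2 in combinations(sorted(b_values), 2)
             if b_values[g1] == b_values[g2]]

    # "heads": for each g2, the lowest g1 it is paired with.
    head_of = {}
    for g1, g2 in pairs:
        head_of[g2] = min(g1, head_of.get(g2, g1))

    # Members of each set: both sides of every pair, tagged with the head.
    members = set()
    for g1, g2 in pairs:
        head = head_of[g2]
        members.add((head, g1))
        members.add((head, g2))

    # Equivalent of dense_rank() over (order by head): number heads from 1.
    heads = sorted({h for h, _ in members})
    rank = {h: i for i, h in enumerate(heads, start=1)}
    return sorted((rank[h], a) for h, a in members)

result = group_matching(rows)
print(result)
```

Values of A that match no other A (here aa) never appear in a pair, so they are left out of the result, just as in the query.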
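Answer 1 (score: 1)
SQL Server 2008 has the EXCEPT and INTERSECT set operators, which you could use. The query below applies EXCEPT in both directions: two values of A match when neither has a B value that the other lacks. This isn't in the format you wanted, and I can't speak to the performance on large data sets, but maybe it will give you a starting point.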
SELECT DISTINCT
    T1.A,
    T2.A
FROM
    T1 AS T1
    INNER JOIN T1 AS T2 ON T2.A > T1.A
WHERE
    NOT EXISTS
    (
        SELECT
            B
        FROM
            T1 AS T3
        WHERE
            T3.A = T1.A
        EXCEPT
        SELECT
            B
        FROM
            T1 AS T4
        WHERE
            T4.A = T2.A
    ) AND
    NOT EXISTS
    (
        SELECT
            B
        FROM
            T1 AS T3
        WHERE
            T3.A = T2.A
        EXCEPT
        SELECT
            B
        FROM
            T1 AS T4
        WHERE
            T4.A = T1.A
    )
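Depending on your data, you could also build a concatenated string for each A, with a delimiter and the B values in a fixed order, and then compare those strings.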
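As a rough illustration of the concatenated-string idea, here is a small Python sketch. It assumes the delimiter never occurs inside a B value (otherwise two different sets could produce the same string); the function name `group_by_signature` and the `sep` parameter are hypothetical.

```python
from collections import defaultdict

# Same rows as the T1 sample data in the question.
rows = [
    ("aa", "m"), ("aa", "n"),
    ("bb", "n"), ("bb", "o"),
    ("cc", "n"), ("cc", "o"),
    ("dd", "c"), ("dd", "a"), ("dd", "r"),
    ("ee", "a"), ("ee", "c"), ("ee", "r"),
]

def group_by_signature(rows, sep=","):
    # Collect the distinct B values for each A.
    b_values = defaultdict(set)
    for a, b in rows:
        b_values[a].add(b)

    # Signature: the B values in sorted order, joined with the delimiter.
    # Assumption: sep does not appear inside any B value.
    by_signature = defaultdict(list)
    for a in sorted(b_values):
        signature = sep.join(sorted(b_values[a]))
        by_signature[signature].append(a)

    # Keep only signatures shared by two or more A values,
    # and number the groups in order of their lowest member.
    shared = sorted(m for m in by_signature.values() if len(m) > 1)
    return [(n, a) for n, members in enumerate(shared, start=1)
            for a in members]

groups = group_by_signature(rows)
print(groups)
```

In SQL Server itself the string would have to be built per A with an explicit ordering, for instance with the FOR XML PATH('') idiom on 2008, or with STRING_AGG ... WITHIN GROUP (ORDER BY B) on versions that have it.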
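Answer 2 (score: 0)
The relational operator you need is division, popularly known as "the supplier who supplies all parts".
There are in fact around eight flavors of division, and the SQL language does not implement any of them directly. However, they can all be recreated using existing SQL constructs: see this article for the more popular ones. Things to consider include: exact division versus division with remainder, and how to handle an empty divisor.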
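To make the two considerations concrete, here is a small Python sketch of division over sets. It is only one possible reading: the function name `divide`, the `exact` flag, and the treatment of an empty divisor are assumptions for illustration, and other conventions exist.

```python
# B values per A, taken from the T1 sample data in the question.
dividend = {
    "aa": {"m", "n"},
    "bb": {"n", "o"},
    "cc": {"n", "o"},
    "dd": {"a", "c", "r"},
    "ee": {"a", "c", "r"},
}

def divide(dividend, divisor, exact=False):
    """Return the keys whose values cover the divisor.

    exact=False: division with remainder (extra values are allowed).
    exact=True:  exact division (the values must equal the divisor).
    """
    result = []
    for key in sorted(dividend):
        values = dividend[key]
        if exact:
            matches = values == divisor
        else:
            matches = divisor <= values  # divisor is a subset of values
        if matches:
            result.append(key)
    return result

with_remainder = divide(dividend, {"n"})           # extra B values allowed
exact_match = divide(dividend, {"n", "o"}, exact=True)
empty_divisor = divide(dividend, set())            # vacuously matches every key
print(with_remainder, exact_match, empty_divisor)
```

Under this reading, the question asks for exact division of the table by each A's own set of B values: bb and cc both divide exactly by {'n', 'o'}, while aa merely contains 'n'. With an empty divisor, division with remainder returns every key and exact division returns none, which is why the empty case has to be decided explicitly.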