Assigning group numbers to rows

Time: 2018-01-28 15:34:09

Tags: mysql sql

I have a table like the following:

record  similar_record
rec_1   rec_2
rec_3   rec_4
rec_2   rec_3
rec_5   rec_7

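The data above shows which pairs of records are similar. For example, in this data set rec_1 is similar to rec_2, rec_2 is similar to rec_3, and rec_3 is similar to rec_4, so all four must go into one group. rec_5 and rec_7 are similar, so they form another group. We need to generate group identifiers; they do not have to be integers.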

I am trying to write a SQL query on MySQL that produces the following output.

group  record
1      rec_1
1      rec_2
1      rec_3
1      rec_4
2      rec_5
2      rec_7

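The records do not have to be on separate rows; it is also fine if the result is produced with GROUP_CONCAT, with the records of each group joined by some separator.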

Can someone help me with the query?

1 answer:

Answer 0: (score: 1)

Here is a recursive brute-force approach. It works on MySQL 8 and also on MariaDB 10.2:

create table graph (
  node1 varchar(50),
  node2 varchar(50)
);
insert into graph (node1, node2) values
    ('rec_1', 'rec_2'),
    ('rec_3', 'rec_4'),
    ('rec_2', 'rec_3'),
    ('rec_5', 'rec_7');

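with recursive numerated as (
  -- number every edge (row); the number is the initial group candidate
  select g.*, ROW_NUMBER() OVER (PARTITION BY null ORDER BY node1) as grp
  from graph g
), normalized as (
  -- one row per (edge number, node)
  select grp, node1 as node from numerated
  union distinct
  select grp, node2 as node from numerated
), rcte as (
  -- grp1 is the edge number the walk started from
  select n.grp as grp1, n.*
  from normalized n
  -- union distinct discards rows that were already produced, so the
  -- recursion stops even though edges are followed in both directions
  union distinct
  select rcte.grp1 as grp1, n2.grp, n2.node
  from rcte
  -- n1: another edge that contains the current node
  join normalized n1 on n1.node = rcte.node and n1.grp <> rcte.grp
  -- n2: the node at the other end of that edge
  join normalized n2 on n2.node <> n1.node and n2.grp = n1.grp
), cte4 as (
  -- the smallest starting edge number that reaches a node identifies its group
  select node, min(grp1) as grp1
  from rcte
  group by node
)
-- renumber the groups as 1, 2, 3, ...
select DENSE_RANK() OVER (PARTITION BY null ORDER BY grp1) as grp, node
from cte4
order by grp, node;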

Result:

grp | node
----|------
  1 | rec_1
  1 | rec_2
  1 | rec_3
  1 | rec_4
  2 | rec_5
  2 | rec_7

Demo: https://www.db-fiddle.com/f/wqhoqoNGEfZvFpVBHybUVx/0
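
The question also allows a GROUP_CONCAT result and non-integer group identifiers, so here is a rough sketch of that variant. It is not the query from the demo. It assumes the graph(node1, node2) table created above and a server with recursive CTE support (MySQL 8 or MariaDB 10.2+). The CTE names edges, reach and labeled are made up for illustration. The idea is to list, for every record, all records reachable from it, and then use the smallest reachable record name as the group identifier.

```sql
-- Assumes the graph(node1, node2) table from the answer above.
with recursive edges as (
  -- treat every pair as an undirected edge by listing it in both directions
  select node1 as a, node2 as b from graph
  union distinct
  select node2, node1 from graph
), reach as (
  -- every node reaches itself
  select a as root, a as node from edges
  -- union distinct drops pairs that were already found, which ends the recursion
  union distinct
  select r.root, e.b
  from reach r
  join edges e on e.a = r.node
), labeled as (
  -- the smallest node name that reaches a node acts as the group identifier
  select node, min(root) as grp
  from reach
  group by node
)
select grp, group_concat(node order by node separator ',') as records
from labeled
group by grp
order by grp;
```

With the sample data this should give one row per group: rec_1 with rec_1,rec_2,rec_3,rec_4 and rec_5 with rec_5,rec_7. Like the query above it is brute force, since reach holds every (root, node) pair within a group, so it is probably only practical for small groups. Very long lists may also be cut off by group_concat_max_len, which defaults to 1024.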