How can I improve this SQL code that eliminates permutations?

Time: 2012-04-20 00:58:59

Tags: mysql sql

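The problem statement is as follows:

- There is a table INIT with the structure

(number1 INT not null, number2 INT not null, ..., number7 INT not null)

- I want to insert all the rows of table INIT into a table 'tab', but I do not want 'tab' to contain two rows such that one is a permutation of the other. So, for example, if (1,2,3,7,19,21,6) and (19,2,3,7,1,21,6) are rows in INIT, then only one of them must end up in 'tab'. It does not matter which of them ends up in 'tab'.

- My code is shown below. I keep an auxiliary table 'aux' with the same structure as INIT. I iterate over all the rows of INIT, and for each row I sort its components in increasing order, so if (1,2,3,7,19,21,6) is a row in INIT, I sort it into (1,2,3,6,7,19,21) and check whether that is in 'aux'. If it is, I move on to the next row. Otherwise, I insert (1,2,3,7,19,21,6) into 'tab'.

I am running this procedure on an INIT table with 300,000 rows, and I estimate it will take more than 7 hours to run. I would like to know how to improve the running time of this procedure.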

DECLARE done BOOLEAN default 0;
DECLARE n1,n2,n3,n4,n5,n6,n7 INT;
DECLARE o1,o2,o3,o4,o5,o6,o7 INT;
DECLARE my_cursor cursor  FOR select * from INIT;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done=1;       
OPEN my_cursor;

drop table if exists aux;
create table aux(
  number1 INT not null,
  number2 INT not null,
  number3 INT not null,
  number4 INT not null,
  number5 INT not null,
  number6 INT not null,
  number7 INT not null
 );
 create table temp( number INT );

REPEAT
   truncate table temp;

   FETCH my_cursor INTO n1,n2,n3,n4,n5,n6,n7;
        INSERT INTO temp values(n1);
        INSERT INTO temp values(n2);    
        INSERT INTO temp values(n3);
        INSERT INTO temp values(n4);
        INSERT INTO temp values(n5);
        INSERT INTO temp values(n6);
        INSERT INTO temp values(n7);
        BEGIN
           DECLARE done2 BOOLEAN default 0;
           DECLARE my_cursor2 cursor  FOR select * from temp order by number;  
           OPEN my_cursor2;
             FETCH my_cursor2 INTO o1;
             FETCH my_cursor2 INTO o2;
             FETCH my_cursor2 INTO o3;
             FETCH my_cursor2 INTO o4;
             FETCH my_cursor2 INTO o5;
             FETCH my_cursor2 INTO o6;
             FETCH my_cursor2 INTO o7;

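             IF NOT EXISTS (SELECT * FROM aux where number1=o1 AND number2=o2 AND number3=o3 
                            AND number4=o4 AND number5 = o5 AND number6 = o6 AND number7=o7 ) 
             THEN
                 -- record the sorted form so later permutations of this row are skipped
                 INSERT INTO aux VALUES (o1,o2,o3,o4,o5,o6,o7);
                 INSERT INTO tab VALUES (n1,n2,n3,n4,n5,n6,n7);
             END IF;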
           CLOSE my_cursor2;
         END;
UNTIL done END REPEAT;
CLOSE my_cursor;

EDITED:
- In every row of INIT, all the integers are distinct.
- The primary key of INIT is (number1, number2, ..., number7).

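2 answers:

Answer 0: (score: 1)

You are running a heavy query for every row... that is not a good approach.

Instead, you can use some database kung fu to get the job done without a stored procedure: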

insert into tab
select number1, number2, number3, number4, number5, number6, number7 from (
  -- one row per INIT row, together with its sorted signature
  select number1, number2, number3, number4, number5, number6, number7, 
    group_concat(number order by number) as sig from (
      select number1, number2, number3, number4, number5, number6, number7, number1 as number from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number2 from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number3 from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number4 from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number5 from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number6 from INIT
      union all select number1, number2, number3, number4, number5, number6, number7, number7 from INIT) a
  group by number1, number2, number3, number4, number5, number6, number7) b
-- relies on MySQL's non-standard GROUP BY (ONLY_FULL_GROUP_BY must be disabled)
group by sig;
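The key tricks involved here are:

  • the inner select (the unions) feeds group_concat one number per row, so it can arrange the numbers in a standard order and combinations can be compared
  • group_concat with an order by gives you a unique signature for each set of numbers
  • using group by without an aggregate in MySQL returns one row per signature, in practice the first one encountered (MySQL documents the choice as nondeterministic, which is acceptable here because any representative will do)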
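The signature idea in the list above can be sketched outside the database. This is only an illustration in Python, not part of the answer's SQL: the helper names `signature` and `dedupe_by_signature` are hypothetical, and the sample rows are the ones used elsewhere on this page.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, ...]


def signature(row: Row) -> str:
    """Mimic GROUP_CONCAT(number ORDER BY number): sort numerically, join with commas."""
    return ",".join(str(n) for n in sorted(row))


def dedupe_by_signature(rows: Iterable[Row]) -> List[Row]:
    """Keep the first row seen for each signature (hypothetical helper)."""
    first_by_sig: Dict[str, Row] = {}
    for row in rows:
        # setdefault stores the row only if this signature has not been seen yet
        first_by_sig.setdefault(signature(row), row)
    return list(first_by_sig.values())


rows = [
    (1, 2, 3, 7, 19, 21, 6),
    (19, 2, 3, 7, 1, 21, 6),  # same numbers as the first row, different order
    (20, 2, 3, 7, 1, 21, 6),
]
kept = dedupe_by_signature(rows)
```

Each row is visited once and looked up by its signature, which is the same reason the set-based SQL should avoid the per-row query cost of the cursor version.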

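BTW, the correct term is combination, not permutation.

Also, I have not tested this, so there may be a misplaced bracket or similar, but it should "basically" work.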

Answer 1: (score: 0)

A MySQLism: use GROUP_CONCAT for the comparison:

create table p -- data source
(
  grp int auto_increment primary key, n1 int, n2 int, n3 int, n4 int, n5 int, n6 int, n7 int  
  );


insert into p(n1,n2,n3,n4,n5,n6,n7) 
select 1,2,3,7,19,21,6 union
select 19,2,3,7,1,21,6 union
select 20,2,3,7,1,21,6;


create table g -- staging table
(
  grp int,
  n int
);

insert into g(grp, n)
select grp, n
from
(
    select grp, n1 as n from p
    union all
    select grp, n2 from p
    union all
    select grp, n3 from p
    union all
    select grp, n4 from p
    union all
    select grp, n5 from p
    union all
    select grp, n6 from p
    union all
    select grp, n7 from p
) as x;

Distinct extractor:

select grp, n1, n2, n3, n4, n5, n6, n7
from p
where grp in
(
 select min(grp) as first_elem -- select only one among duplicates
 from
 ( 
    select grp, group_concat(n order by n) as comb
    from g
    group by grp
 ) as x
 group by comb
);

Basically, sort the numbers so that they are easier to compare as combinations, then use a MySQLism, namely group_concat, to simplify the comparison logic.

Live test: http://sqlfiddle.com/#!2/7b61f/1
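The staging-table flow above can be traced step by step in a minimal Python sketch. It is an illustration under assumptions, not the answer's code: the dictionaries stand in for tables p and g, and the names `values_by_grp`, `comb_by_grp`, and `first_grp_by_comb` are made up for this example.

```python
from collections import defaultdict

# Source rows keyed by a surrogate id, like table p (grp, n1..n7).
p = {
    1: (1, 2, 3, 7, 19, 21, 6),
    2: (19, 2, 3, 7, 1, 21, 6),
    3: (20, 2, 3, 7, 1, 21, 6),
}

# Step 1: unpivot every row into (grp, n) pairs, like staging table g.
g = [(grp, n) for grp, row in p.items() for n in row]

# Step 2: per grp, sort the values and join them,
# like GROUP_CONCAT(n ORDER BY n) ... GROUP BY grp.
values_by_grp = defaultdict(list)
for grp, n in g:
    values_by_grp[grp].append(n)
comb_by_grp = {
    grp: ",".join(str(n) for n in sorted(ns))
    for grp, ns in values_by_grp.items()
}

# Step 3: per comb, keep the smallest grp, like MIN(grp) ... GROUP BY comb.
first_grp_by_comb = {}
for grp, comb in comb_by_grp.items():
    if comb not in first_grp_by_comb or grp < first_grp_by_comb[comb]:
        first_grp_by_comb[comb] = grp

# Step 4: select the surviving rows, like WHERE grp IN (...).
keep = set(first_grp_by_comb.values())
unique_rows = [(grp, *row) for grp, row in sorted(p.items()) if grp in keep]
```

Rows 1 and 2 share the same comb, so only the one with the smaller grp survives alongside row 3.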