Eliminating permutations from a table

Time: 2012-03-11 00:50:37

Tags: mysql sql

This is an SQL problem I have been trying to solve, but so far I have not been able to:

Suppose I have a table:

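sequence(number1 int, number2 int, number3 int, number4 int, number5 int)

If a row exists in sequence, for example <1,3,4,2,5>, then I want to eliminate every other row that is a permutation of it, for example the row <1,2,5,4,3>.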

Edit: the primary key is (number1, number2, number3, number4, number5)
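For concreteness, the setup described above could look like the sketch below. The NOT NULL constraints and the third sample row are assumptions added for illustration; only the two rows quoted in the question come from the original.

```sql
-- Hypothetical setup; column constraints and the last sample row are illustrative only.
CREATE TABLE `sequence` (
    number1 INT NOT NULL,
    number2 INT NOT NULL,
    number3 INT NOT NULL,
    number4 INT NOT NULL,
    number5 INT NOT NULL,
    PRIMARY KEY (number1, number2, number3, number4, number5)
);

INSERT INTO `sequence` VALUES
    (1, 3, 4, 2, 5),   -- row from the question
    (1, 2, 5, 4, 3),   -- permutation of the row above: should be eliminated
    (6, 7, 8, 9, 10);  -- unrelated row: should stay
```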

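1 Answer:

Answer 0: (score: 1)

This assumes that values cannot repeat within the five columns, and that the table has a single-column primary key named primary_key -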

DELETE t2
FROM `table` t1
INNER JOIN `table` t2
    ON (t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    )
    AND t1.primary_key < t2.primary_key
    -- AND CONCAT(t1.col1, t1.col2, t1.col3, t1.col4, t1.col5) < CONCAT(t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
WHERE t1.col1 NOT IN (t1.col2, t1.col3, t1.col4, t1.col5)
AND t1.col2 NOT IN (t1.col3, t1.col4, t1.col5)
AND t1.col3 NOT IN (t1.col4, t1.col5)
AND t1.col4 <> t1.col5

I have not tested this, so I suggest running it as a SELECT before committing to the DELETE.
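A dry run of the first DELETE might look like the sketch below: the same join and filters, but selecting the rows that would be removed. It reuses the answer's placeholder names (`table`, col1 to col5, primary_key); the DISTINCT is an addition, since one t2 row can match several t1 rows.

```sql
-- Preview only: lists the rows the DELETE would remove, without changing anything.
SELECT DISTINCT t2.*
FROM `table` t1
INNER JOIN `table` t2
    ON (    t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    )
    AND t1.primary_key < t2.primary_key   -- keep the row with the lowest key in each group
WHERE t1.col1 NOT IN (t1.col2, t1.col3, t1.col4, t1.col5)
AND t1.col2 NOT IN (t1.col3, t1.col4, t1.col5)
AND t1.col3 NOT IN (t1.col4, t1.col5)
AND t1.col4 <> t1.col5
ORDER BY t2.primary_key;
```

If the listed rows are exactly the ones you expect to lose, replacing `SELECT DISTINCT t2.*` with `DELETE t2` and dropping the ORDER BY gives back the original statement.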

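UPDATE: The following query handles the case where values repeat within a set (1,1,2,2,2 instead of 1,2,3,4,5), but the join is very expensive, so I would be very cautious about running it against a very large data set.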

DELETE t2
FROM `table` t1
INNER JOIN `table` t2
    ON (    t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    )
    AND (-- compare the number of occurrences of each value in each side
            (IF(t1.col1=t1.col1, 1, 0)+IF(t1.col1=t1.col2, 1, 0)+IF(t1.col1=t1.col3, 1, 0)+IF(t1.col1=t1.col4, 1, 0)+IF(t1.col1=t1.col5, 1, 0)) = (IF(t1.col1=t2.col1, 1, 0)+IF(t1.col1=t2.col2, 1, 0)+IF(t1.col1=t2.col3, 1, 0)+IF(t1.col1=t2.col4, 1, 0)+IF(t1.col1=t2.col5, 1, 0))
        AND (IF(t1.col2=t1.col1, 1, 0)+IF(t1.col2=t1.col2, 1, 0)+IF(t1.col2=t1.col3, 1, 0)+IF(t1.col2=t1.col4, 1, 0)+IF(t1.col2=t1.col5, 1, 0)) = (IF(t1.col2=t2.col1, 1, 0)+IF(t1.col2=t2.col2, 1, 0)+IF(t1.col2=t2.col3, 1, 0)+IF(t1.col2=t2.col4, 1, 0)+IF(t1.col2=t2.col5, 1, 0))
        AND (IF(t1.col3=t1.col1, 1, 0)+IF(t1.col3=t1.col2, 1, 0)+IF(t1.col3=t1.col3, 1, 0)+IF(t1.col3=t1.col4, 1, 0)+IF(t1.col3=t1.col5, 1, 0)) = (IF(t1.col3=t2.col1, 1, 0)+IF(t1.col3=t2.col2, 1, 0)+IF(t1.col3=t2.col3, 1, 0)+IF(t1.col3=t2.col4, 1, 0)+IF(t1.col3=t2.col5, 1, 0))
        AND (IF(t1.col4=t1.col1, 1, 0)+IF(t1.col4=t1.col2, 1, 0)+IF(t1.col4=t1.col3, 1, 0)+IF(t1.col4=t1.col4, 1, 0)+IF(t1.col4=t1.col5, 1, 0)) = (IF(t1.col4=t2.col1, 1, 0)+IF(t1.col4=t2.col2, 1, 0)+IF(t1.col4=t2.col3, 1, 0)+IF(t1.col4=t2.col4, 1, 0)+IF(t1.col4=t2.col5, 1, 0))
        AND (IF(t1.col5=t1.col1, 1, 0)+IF(t1.col5=t1.col2, 1, 0)+IF(t1.col5=t1.col3, 1, 0)+IF(t1.col5=t1.col4, 1, 0)+IF(t1.col5=t1.col5, 1, 0)) = (IF(t1.col5=t2.col1, 1, 0)+IF(t1.col5=t2.col2, 1, 0)+IF(t1.col5=t2.col3, 1, 0)+IF(t1.col5=t2.col4, 1, 0)+IF(t1.col5=t2.col5, 1, 0))
    )
    AND t1.primary_key < t2.primary_key
    -- AND CONCAT(t1.col1, t1.col2, t1.col3, t1.col4, t1.col5) < CONCAT(t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
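To see why the occurrence counts in the query above are needed once values can repeat, consider the rows (1,1,2,2,2) and (1,1,1,2,2). The small sketch below uses literal values rather than a real table: the membership test alone treats the two rows as matching, while counting the value 1 on each side shows they are not permutations of each other.

```sql
-- Row A = (1,1,2,2,2), row B = (1,1,1,2,2): same distinct values, different multiplicities.
SELECT
    -- membership only: every value of A appears somewhere in B
    (1 IN (1,1,1,2,2) AND 2 IN (1,1,1,2,2)) AS membership_match,   -- 1
    -- occurrences of the value 1 in A
    (IF(1=1, 1, 0)+IF(1=1, 1, 0)+IF(1=2, 1, 0)+IF(1=2, 1, 0)+IF(1=2, 1, 0)) AS ones_in_a,   -- 2
    -- occurrences of the value 1 in B
    (IF(1=1, 1, 0)+IF(1=1, 1, 0)+IF(1=1, 1, 0)+IF(1=2, 1, 0)+IF(1=2, 1, 0)) AS ones_in_b;   -- 3
```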

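If your table does not have a single-column primary key, you can use the commented-out comparison instead of the PK, but the PK is definitely preferred.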
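One caveat with the CONCAT comparison is that different rows can concatenate to the same string, in which case neither row compares as smaller and both survive. A possible alternative, offered here as an untested sketch rather than part of the original answer, is MySQL's row-constructor comparison, which orders tuples column by column. The literal values below are made up for illustration:

```sql
-- (1,23,12,3,5) and (12,3,1,23,5) are permutations of each other,
-- yet both concatenate to '1231235', so the CONCAT tie-break cannot order them.
SELECT
    CONCAT(1, 23, 12, 3, 5) = CONCAT(12, 3, 1, 23, 5) AS concat_collides,   -- 1
    (1, 23, 12, 3, 5) < (12, 3, 1, 23, 5)             AS row_compare;       -- 1
```

Applied to the first query (so still assuming no repeated values within a row), the tie-break could then be written as:

```sql
DELETE t2
FROM `table` t1
INNER JOIN `table` t2
    ON (    t1.col1 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col2 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col3 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col4 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
        AND t1.col5 IN (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
    )
    -- tuple ordering replaces the single-column primary key tie-break
    AND (t1.col1, t1.col2, t1.col3, t1.col4, t1.col5)
      < (t2.col1, t2.col2, t2.col3, t2.col4, t2.col5)
WHERE t1.col1 NOT IN (t1.col2, t1.col3, t1.col4, t1.col5)
AND t1.col2 NOT IN (t1.col3, t1.col4, t1.col5)
AND t1.col3 NOT IN (t1.col4, t1.col5)
AND t1.col4 <> t1.col5
```

With this ordering, the row that sorts first column by column is the one kept in each group of permutations. As with the other statements, it is worth previewing the affected rows with a SELECT first.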