Numbering duplicate records in a MySQL table

Date: 2016-01-12 12:33:27

Tags: mysql set-based

I have a table with a schema like this:

id control code amount 
1   200     12  300
2   400     12  300
3   200     12  300
4   100     10  400
5   100     10  400
6   500     13  500

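I am trying to list the duplicate records in the UI.

With the following query I can retrieve the duplicate records and display them in the UI.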

select * from mwt group by control,code,amount having count(id) > 1;  

id control code amount 
1   200     12  300
4   100     10  400

Here, the records with id 1 and 4 are duplicated by the records with id 3 and 5 respectively.
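One caveat on the query above: it selects every column while grouping by only three of them, which MySQL accepts only when the ONLY_FULL_GROUP_BY SQL mode is off (it is on by default from MySQL 5.7.5). A minimal sketch of a variant that names the representative id explicitly is shown below. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained; the function name and the `copies` column are illustrative assumptions, not part of the original schema.

```python
import sqlite3

# Sample rows from the question: (id, control, code, amount)
ROWS = [
    (1, 200, 12, 300),
    (2, 400, 12, 300),
    (3, 200, 12, 300),
    (4, 100, 10, 400),
    (5, 100, 10, 400),
    (6, 500, 13, 500),
]


def list_duplicate_groups(rows):
    """Return one row per duplicated control/code/amount combination."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mwt (id INTEGER PRIMARY KEY, control INTEGER, "
        "code INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO mwt VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT MIN(id) AS id, control, code, amount, COUNT(id) AS copies
        FROM mwt
        GROUP BY control, code, amount
        HAVING COUNT(id) > 1
        ORDER BY MIN(id)
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in list_duplicate_groups(ROWS):
        print(row)
```

Every selected column is either grouped or aggregated, so the statement does not depend on the server's SQL mode.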

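In the UI, the user will tick the checkbox next to a record, and the corresponding duplicate records should then be populated in the UI. It seems easier to populate another column called dup_id. With this dup_id I can filter the result, which is in JSON format, from the UI.

How can I create a result set like the one shown below?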

id control code amount dup_id
1   200     12  300     1
2   400     12  300
3   200     12  300     1
4   100     10  400     4
5   100     10  400     4
6   500     13  500

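3 Answers:

Answer 0 (score: 1)

Depending on how exact the ordering needs to be, you can do something like the following.

The innermost subquery gets every control/code/amount combination that occurs more than once, with its count. Left-joining that back to the table gives each row a flag (id_count is non-NULL) that says whether it is a duplicate row, and the rows are ordered by control/code/amount so that duplicates sit next to each other. A cross join initialises some user variables.

It then calculates a counter, incrementing it only when any of control/code/amount changes and the row is a duplicate. It then sets the user variables to store the previous values of control/code/amount.

The outer query then orders the results back into id order.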

SELECT sub3.id, 
        sub3.control, 
        sub3.code, 
        sub3.amount, 
        sub3.dup_id
FROM
(
    SELECT sub2.id, 
            sub2.control, 
            sub2.code, 
            sub2.amount, 
            @cnt:=IF(@control=control AND @code=code AND @amount=amount AND sub2.id_count IS NOT NULL, @cnt, IF(sub2.id_count IS NULL, @cnt, @cnt + 1)),
            @control:=control,
            @code:=code,
            @amount:=amount,
            IF(sub2.id_count IS NULL, NULL, @cnt) AS dup_id
    FROM
    (
        SELECT mwt.id, mwt.control, mwt.code, mwt.amount, sub1.id_count 
        FROM mwt
        LEFT OUTER JOIN
        (
            SELECT control, code, amount, COUNT(id) AS id_count
            FROM mwt 
            GROUP BY control,code,amount 
            HAVING id_count > 1
        ) sub1
        ON mwt.control = sub1.control
        AND mwt.code = sub1.code
        AND mwt.amount = sub1.amount
        ORDER BY mwt.control, mwt.code, mwt.amount
    ) sub2
    CROSS JOIN
    (
        SELECT @cnt:=0, @control:=0, @code:=0, @amount:=0
    ) sub0
) sub3
ORDER BY id

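Note that this orders by control, code and amount, so it is not an exact match for your desired output (that would need ordering by the id of the first duplicate in each group).

EDIT - a simpler and better way. This gets one row per duplicated combination, carrying the minimum id of those duplicates (ordered by that minimum id), and uses a user variable to add a sequence number to those rows. It then LEFT OUTER JOINs back to the main table to put that sequence number on all the matching rows.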

SELECT mwt.id, mwt.control, mwt.code, mwt.amount, sub2.dup_id 
FROM mwt
LEFT OUTER JOIN
(
    SELECT sub1.id, sub1.control, sub1.code, sub1.amount, @cnt:=@cnt+1 AS dup_id
    FROM 
    (
        SELECT MIN(id) AS id, control, code, amount
        FROM mwt 
        GROUP BY control,code,amount 
        HAVING COUNT(id) > 1
        ORDER BY id
    ) sub1
    CROSS JOIN
    (
        SELECT @cnt:=0
    ) sub0
) sub2
ON mwt.control = sub2.control
AND mwt.code = sub2.code
AND mwt.amount = sub2.amount
ORDER BY mwt.id
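
On servers that support window functions (MySQL 8.0 and later), the same sequence number can be produced without user variables. The sketch below swaps the @cnt counter for ROW_NUMBER(), which is a different technique from the query above, and runs it against an in-memory SQLite database through Python's sqlite3 module so that it is self-contained. It assumes an SQLite library of version 3.25 or newer; the function name and the `firsts`/`grp` aliases are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (id, control, code, amount)
ROWS = [
    (1, 200, 12, 300),
    (2, 400, 12, 300),
    (3, 200, 12, 300),
    (4, 100, 10, 400),
    (5, 100, 10, 400),
    (6, 500, 13, 500),
]

QUERY = """
SELECT mwt.id, mwt.control, mwt.code, mwt.amount, grp.dup_id
FROM mwt
LEFT OUTER JOIN
(
    -- one row per duplicated combination, numbered by its smallest id
    SELECT control, code, amount,
           ROW_NUMBER() OVER (ORDER BY first_id) AS dup_id
    FROM
    (
        SELECT MIN(id) AS first_id, control, code, amount
        FROM mwt
        GROUP BY control, code, amount
        HAVING COUNT(id) > 1
    ) firsts
) grp
ON mwt.control = grp.control
AND mwt.code = grp.code
AND mwt.amount = grp.amount
ORDER BY mwt.id
"""


def number_duplicate_groups(rows):
    """Return every row with a sequential dup_id, or None for unique rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mwt (id INTEGER PRIMARY KEY, control INTEGER, "
        "code INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO mwt VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in number_duplicate_groups(ROWS):
        print(row)
```

As with the user-variable version, the groups are numbered 1, 2, ... in order of their first id, so rows 4 and 5 receive dup_id 2 rather than 4.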

Answer 1 (score: 1)

This seems simpler than the solution proposed by @kickstarter - but perhaps I have misunderstood the requirement...

SELECT x.*
     , y.dup_id 
  FROM my_table x 
  LEFT 
  JOIN
     ( SELECT MIN(id) dup_id
            , control
            , code
            , amount 
         FROM my_table 
        GROUP 
           BY control
            , code
            , amount 
       HAVING COUNT(*) > 1
     ) y
    ON y.control = x.control 
   AND y.code = x.code 
   AND y.amount = x.amount;
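
To check this against the sample data from the question, here is a minimal sketch that runs the same join on an in-memory SQLite database through Python's sqlite3 module. The table name my_table follows the query above; the function name and the added ORDER BY x.id (there only to make the row order deterministic) are assumptions of the sketch.

```python
import sqlite3

# Sample rows from the question: (id, control, code, amount)
ROWS = [
    (1, 200, 12, 300),
    (2, 400, 12, 300),
    (3, 200, 12, 300),
    (4, 100, 10, 400),
    (5, 100, 10, 400),
    (6, 500, 13, 500),
]

QUERY = """
SELECT x.*
     , y.dup_id
  FROM my_table x
  LEFT
  JOIN
     ( SELECT MIN(id) dup_id
            , control
            , code
            , amount
         FROM my_table
        GROUP
           BY control
            , code
            , amount
       HAVING COUNT(*) > 1
     ) y
    ON y.control = x.control
   AND y.code = x.code
   AND y.amount = x.amount
 ORDER BY x.id
"""


def tag_with_min_id(rows):
    """Return every row plus the smallest id of its duplicate group (or None)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, control INTEGER, "
        "code INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in tag_with_min_id(ROWS):
        print(row)
```

Because dup_id is the smallest id in each group, rows 1 and 3 get 1 and rows 4 and 5 get 4, which matches the result set asked for in the question.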

Answer 2 (score: -1)

Do you need the dup_id column at all? I would expect this to be achievable with a simple query like the one below.

select id
     , control
     , code
     , amount 
  from mwt              -- the table from the question
 where control = ?      -- control of the selected record
   and code = ?         -- code of the selected record
   and amount = ?       -- amount of the selected record
   and id <> ?          -- id of the selected record

If the requirement is to list the duplicates including the selected record, you can simply leave out the final not-equals condition.
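
The lookup above, driven by the values of the selected record, can be sketched with bound parameters. This is only an illustration under stated assumptions: it uses Python's sqlite3 module with an in-memory copy of the question's mwt table, and make_sample_db, find_duplicates and the include_selected flag are hypothetical names introduced here.

```python
import sqlite3

# Sample rows from the question: (id, control, code, amount)
ROWS = [
    (1, 200, 12, 300),
    (2, 400, 12, 300),
    (3, 200, 12, 300),
    (4, 100, 10, 400),
    (5, 100, 10, 400),
    (6, 500, 13, 500),
]


def make_sample_db():
    """Build an in-memory copy of the question's table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mwt (id INTEGER PRIMARY KEY, control INTEGER, "
        "code INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO mwt VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_duplicates(conn, selected, include_selected=False):
    """Return rows sharing control/code/amount with the selected record.

    `selected` is the (id, control, code, amount) tuple of the ticked row.
    """
    sel_id, control, code, amount = selected
    sql = (
        "SELECT id, control, code, amount FROM mwt "
        "WHERE control = ? AND code = ? AND amount = ?"
    )
    params = [control, code, amount]
    if not include_selected:
        # Drop this condition to list the selected record as well.
        sql += " AND id <> ?"
        params.append(sel_id)
    sql += " ORDER BY id"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    db = make_sample_db()
    print(find_duplicates(db, (1, 200, 12, 300)))
    print(find_duplicates(db, (1, 200, 12, 300), include_selected=True))
```

Binding the values as parameters rather than concatenating them into the SQL text keeps the UI-supplied values from being interpreted as SQL.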