MySQL table with duplicate records

Time: 2013-12-02 12:50:11

Tags: mysql

I have a table

email(email varchar(30),id integer(10),duplicated varchar(10))

with these records:

    sai@gmail.com      101   null  
    kiran@gmail.com    102   null  
    sai123@gmail.com   103   null  
    sai@gmail.com      101   null  
    kiran@gmail.com    102   null  

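Now my problem is that I need "yes" in the duplicated column for the second occurrence of each repeated record. So the output table should be: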

    sai@gmail.com      101   null  
    kiran@gmail.com    102   null  
    sai123@gmail.com   103   null  
    sai@gmail.com      101   yes  
    kiran@gmail.com    102   yes  

2 answers:

Answer 0 (score: 1)

Try this:

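    -- join against per-email counts (MySQL rejects a subquery on the table being updated)
    update email e
    join (select email, count(*) as cnt from email group by email) x on x.email = e.email
    set e.duplicated = case when x.cnt > 1 then 'yes' else null end;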

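Edit: this will update the table. Note that it flags every row of a repeated email, including the first occurrence.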

Answer 1 (score: 0)

You can try this query to view the result:

select numerated.email, numerated.id, (case when cnt=1 OR numerated.rnum=grouped.min_rnum then null else "yes" end) as duplicated
from 
    (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c order by id) as numerated
left join 
    (select email, id, min(rnum) as min_rnum, count(rnum) as cnt 
        from (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c order by id) as numerated 
        group by email, id
    ) as grouped
on numerated.email=grouped.email and numerated.id=grouped.id
order by id;

Can you explain your situation in more detail? It looks like it needs another solution, not just a SELECT query.

Try this update:

update email u, (select @i:=0) urnum
set
  id = id + (@i:=@i + 1) - @i,
  duplicated = (
    select duplicated from (
        select 
            numerated.email, 
            numerated.id, 
            (case when cnt=1 OR numerated.rnum=grouped.min_rnum then null else "yes" end) as duplicated, 
            rnum
        from
            (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c ) as numerated
        left join
            (select email, id, min(rnum) as min_rnum, count(rnum) as cnt
                from (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c ) as numerated
                group by email, id
            ) as grouped
        on numerated.email=grouped.email and numerated.id=grouped.id
        order by rnum
    ) found_duplicates
    where u.email=found_duplicates.email and u.id=found_duplicates.id and @i=found_duplicates.rnum
    limit 1
  )
;

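It looks like it works, but you should not rely on it: it depends on the order in which MySQL evaluates user variables such as @i, and that order is not guaranteed.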

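If possible, you should do the following:
1. Change the table structure: add a unique field.
2. Change the table population logic: check uniqueness before inserting a new row, and insert it with the correct value in the duplicated field.
3. Repopulate the table through a temporary table, like this: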

CREATE TEMPORARY TABLE tmp_email AS <... 'SELECT' version of my query ...>;  
TRUNCATE TABLE email;  
INSERT INTO email SELECT * FROM tmp_email;
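
As a rough sketch of steps 1 and 2, something like the following might work. The column name row_id is hypothetical and not part of the original table. The sketch assumes that email and id are never NULL. It also assumes that the order in which MySQL numbers the existing rows is acceptable as "first" versus "later" occurrence, since the table itself records no insertion order.

```sql
-- Step 1 (assumption: row_id is an illustrative name, not from the question).
-- Existing rows receive consecutive values when the column is added.
ALTER TABLE email
    ADD COLUMN row_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Flag every row except the one with the lowest row_id in each (email, id) group.
-- The derived table f is materialized first, so reading from email here is allowed.
UPDATE email AS e
JOIN (
    SELECT email, id, MIN(row_id) AS first_row_id
    FROM email
    GROUP BY email, id
) AS f
    ON f.email = e.email AND f.id = e.id
SET e.duplicated = 'yes'
WHERE e.row_id > f.first_row_id;

-- Step 2: insert a new row and compute its flag from the rows already present.
-- COUNT(*) without GROUP BY always returns exactly one row, so exactly one row is inserted.
INSERT INTO email (email, id, duplicated)
SELECT 'sai@gmail.com', 101, IF(COUNT(*) > 0, 'yes', NULL)
FROM email
WHERE email = 'sai@gmail.com' AND id = 101;
```

With a unique key in place, the flag no longer depends on user variables. The insert in step 2 is not protected against two sessions inserting the same pair at the same moment, so treat it as an illustration rather than a complete solution.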