用于从数据库中删除重复项的脚本

时间:2015-01-12 17:22:07

标签: php sql

我有数据库,我有数据,如phonenumber,orgno,companyname,postalcode。

有些重复项也在我的数据库中,如同公司名称,相同的orgno,相同的邮政编码但不同的电话号码。所以我想通过使用脚本删除这些重复项。我该怎么写这种脚本?

CREATE TABLE TEST (
    address_id        bigint,
    name              varchar(255),
    marketing_name    varchar(255),
    co_address        varchar(255),
    box_address1      varchar(255),
    box_address2      varchar(255),
    box_postal_code   varchar(45),
    visit_address1    varchar(255),
    org_nr            varchar(255),
    visit_postal_code varchar(255),
    county_id         varchar(255),
    contact           varchar(255),
    complete          tinyint,
    deleted           tinyint,
    district_id       varchar(255),
    region            varchar(255),
    phone1            varchar(255),
    mobile_phone      varchar(255),
)

1 个答案:

答案 0 :(得分:1)

首先,您需要确定要使用的重复条件是什么。例如,我想检查相同的Org Nro,公司名称和邮政编码中的电话号码是否重复。我可以这样做:

第一个测试表

CREATE TABLE TEST
(
id int,
phonenumber int,
orgno int,
companyname varchar(100),
postalcode varchar(4)
)

然后测试数据

  insert into TEST
 values (1,4533660,1,'COMPANY 1',1234)
  insert into TEST
 values (2,4533660,1,'COMPANY 1',1234)
  insert into TEST
 values (3,954189547,1,'COMPANY 2',4444)
   insert into TEST
 values (4,954189547,1,'COMPANY 2',4444)
   insert into TEST
 values (5,3652591,1,'COMPANY 3',4444)
   insert into TEST
 values (6,4201580,1,'COMPANY 4',4444)
   insert into TEST
 values (7,3337788,1,'COMPANY 5',4444)

最后,查询重复项,并选择标准。

 Select phonenumber,
 orgno,
 companyname,
 postalcode,
 COUNT(*)
 from test 
 group by phonenumber,
 orgno, 
 companyname,
 postalcode 
 HAVING COUNT(*) > 1

使用此查询,您可以轻松找到重复项,最重要的是查看重复的标准。

<强> 修改

如果要删除除1以外的所有重复行,可以执行以下操作:

DELETE A
FROM TEST A 
INNER JOIN 
(
        SELECT ROW_NUMBER()OVER(PARTITION BY a.phonenumber,
        a.orgno,
        a.companyname,
        a.postalcode ORDER BY a.id)AS POS,
        a.phonenumber,
        a.orgno,
        a.companyname,
        a.postalcode,
        a.id
        FROM TEST A
        JOIN 
        (
                SELECT  
                phonenumber,
                orgno,
                companyname,
                postalcode,
                COUNT(*) AS CONTADOR
                FROM test 
                GROUP BY phonenumber,
                orgno, 
                companyname,
                postalcode
                HAVING COUNT(*) > 1
        ) TB
        ON A.companyname = TB.companyname 
        AND A.orgno = TB.orgno 
        AND A.phonenumber = TB.phonenumber 
        AND A.postalcode = TB.postalcode
) TB_2
ON
a.id = tb_2.id 
and tb_2.pos > 1
  

使用提供的数据结构编辑脚本。

DELETE FROM TEST B
JOIN 
(
    SELECT ROW_NUMBER()OVER(PARTITION BY
    a.org_nr,
    a.phone1,
    a.phone2,
    a.name,
    a.marketing_name
    ORDER BY 
    a.org_nr,
    a.phone1,
    a.phone2,
    a.name,
    a.marketing_name
    )AS POS,
    a.address_id,
    a.org_nr,
    a.phone1,
    a.phone2,
    a.name,
    a.marketing_name
    FROM TEST A
    JOIN 
    (
        SELECT 
        org_nr,
        phone1,
        phone2,
        name,
        marketing_name
        FROM TEST 
        GROUP BY 
        org_nr,
        phone1,
        phone2,
        name,
        marketing_name
        HAVING COUNT(*) > 1) TB
        ON a.org_nr = tb.org_nr
        and a.phone1 = tb.phone1
        and a.phone2 = tb.phone2
        and a.name = tb.name
        and a.marketing_name = tb.marketing_name
    ) TB_2
ON
b.address_id = tb_2.address_id
and tb_2.pos > 1