Oracle PL/SQL: Delete Duplicate Rows and Increment Duplicates in Another Column

Time: 2015-07-31 19:36:59

Tags: sql oracle plsql

I am inserting data into another table, and in the process I need to delete the duplicates and sum up the quantity for those duplicates in another column. I am new to PL/SQL, so I am having trouble building this table. I have 16 columns of data, and the table I am pulling from and the table I am inserting into have the same number of columns. I need to get rid of the duplicates according to 2 different columns. So if the data in c1 is "aaa" and the data in c2 is "bbb", I need to get rid of the rest of the rows that have the exact same data in the same places. There is also a quantity column that I need to sum up for specific c1 and c2. So the last remaining row of each set of duplicates should hold the summed quantity of all the duplicates, including the rows that were deleted.

1 answer:

Answer 0: (score: 0)

This should do it. Note -- it sets the "dup_count" column equal to the number of duplicates that were deleted. So, e.g., if there were three identical rows, the dup_count would be 2.

Also, there is a little ambiguity in your question. You say:

"There is also a quantity column that I need to sum up for specific c1 and c2"

... but you will still be left with multiple rows having identical values for c1 and c2, if the other columns were different. I understood your requirements to mean that all the columns needed to be identical, not just the c1 and c2 columns. If I misunderstood, just remove the extra columns from the PARTITION clauses.

MERGE INTO dup_table t
USING (
    SELECT rowid AS row_id,
           c1, c2, data1, data2,
           COUNT(*) OVER (PARTITION BY c1, c2, data1, data2) - 1 AS dup_count,
           ROW_NUMBER() OVER (PARTITION BY c1, c2, data1, data2 ORDER BY rowid) AS rn
    FROM dup_table
) u
ON (t.rowid = u.row_id)
WHEN MATCHED THEN
    UPDATE SET t.dup_count = u.dup_count
    DELETE WHERE u.rn > 1;
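If only c1 and c2 should define a duplicate and the quantities need to be summed, the same MERGE pattern could be adapted roughly as follows. This is only a sketch: it assumes dup_table has a numeric column named quantity, which is a hypothetical name and does not appear in the script in this answer.

```sql
-- Sketch, not tested against the asker's real tables.
-- Assumption: dup_table has a numeric "quantity" column (hypothetical name).
-- Rows count as duplicates when c1 and c2 match, whatever the other columns hold.
MERGE INTO dup_table t
USING (
    SELECT rowid AS row_id,
           -- total quantity across every row sharing this (c1, c2) pair
           SUM(quantity) OVER (PARTITION BY c1, c2) AS total_quantity,
           -- rn = 1 marks the single row to keep in each (c1, c2) group
           ROW_NUMBER() OVER (PARTITION BY c1, c2 ORDER BY rowid) AS rn
    FROM dup_table
) u
ON (t.rowid = u.row_id)
WHEN MATCHED THEN
    UPDATE SET t.quantity = u.total_quantity
    DELETE WHERE u.rn > 1;
```

The surviving row in each group is the one with the lowest rowid, so its other columns are kept and the values from the deleted rows are lost. Changing the ORDER BY inside ROW_NUMBER() picks a different survivor.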

Full script:

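-- Raises ORA-00942 on the first run, when the table does not exist yet; that error can be ignored.
DROP TABLE dup_table;

CREATE TABLE dup_table ( c1 NUMBER, c2 NUMBER, data1 VARCHAR2(10), data2 VARCHAR2(10) );

-- Sample data: some rows are exact duplicates across all four columns.
INSERT INTO dup_table VALUES ( 1, 1, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 1, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 1, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 1, 'B', 'C' );
INSERT INTO dup_table VALUES ( 1, 1, 'B', 'C' );
INSERT INTO dup_table VALUES ( 1, 2, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 2, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 2, 'A', 'A' );
INSERT INTO dup_table VALUES ( 1, 2, 'B', 'C' );
INSERT INTO dup_table VALUES ( 1, 2, 'B', 'C' );
INSERT INTO dup_table VALUES ( 1, 3, 'A', 'A' );

-- Column that will hold the number of deleted duplicates for each surviving row.
ALTER TABLE dup_table ADD ( dup_count NUMBER DEFAULT 0 );

-- Keep the first row of each duplicate group (rn = 1) and delete the rest.
MERGE INTO dup_table t
USING (
    SELECT rowid AS row_id,
           c1, c2, data1, data2,
           COUNT(*) OVER (PARTITION BY c1, c2, data1, data2) - 1 AS dup_count,
           ROW_NUMBER() OVER (PARTITION BY c1, c2, data1, data2 ORDER BY rowid) AS rn
    FROM dup_table
) u
ON (t.rowid = u.row_id)
WHEN MATCHED THEN
    UPDATE SET t.dup_count = u.dup_count
    DELETE WHERE u.rn > 1;

-- Five rows remain, one per distinct (c1, c2, data1, data2) combination:
-- (1,1,A,A) and (1,2,A,A) with dup_count 2, (1,1,B,C) and (1,2,B,C) with dup_count 1, (1,3,A,A) with dup_count 0.
SELECT * FROM dup_table;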
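
Since the question describes inserting into a second table, another option is to collapse the duplicates while inserting, so nothing has to be deleted afterwards. The sketch below assumes two hypothetical tables, source_table and target_table, each with the columns c1, c2, data1, data2 and quantity. These names are placeholders, and the real tables have 16 columns, so the column lists would need to be extended.

```sql
-- Sketch only; source_table, target_table and quantity are hypothetical names.
-- One output row per (c1, c2) pair, with the quantities of all duplicates summed.
INSERT INTO target_table (c1, c2, data1, data2, quantity)
SELECT c1,
       c2,
       -- take the non-key columns from the row with the lowest rowid in each group,
       -- so that all of them come from the same source row
       MIN(data1) KEEP (DENSE_RANK FIRST ORDER BY rowid),
       MIN(data2) KEEP (DENSE_RANK FIRST ORDER BY rowid),
       SUM(quantity)
FROM source_table
GROUP BY c1, c2;
```

If every column has to match for rows to count as duplicates, all non-quantity columns can go into the GROUP BY instead, and the KEEP expressions are no longer needed.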