Question

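I want to clean up a table that contains many duplicate records. Each customer number has many records in this table, and they differ in eff_dt (a column name).

I want to keep only one record per customer number.

To do that, I will use only the record with the minimum eff_dt for each cust_nbr as the reference. So for each cust_nbr in the table, I want to copy only the record with the minimum eff_dt into a cursor, and then compare that cursor value with the remaining records in the table.

I used the following query when creating the cursor: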

select cust_nbr, min(eff_dt), name, address from cust;

But it gives me the following error:

[Error] Execution (1: 8): ORA-00937: not a single-group group function

Please help me.

Answer 1

The error you got means that every non-aggregated column has to be part of the GROUP BY clause, i.e.

select cust_nbr, min(eff_dt), name, address
from cust
group by cust_nbr, name, address;
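
Note that grouping by name and address returns one row per distinct (cust_nbr, name, address) combination, so a customer whose name or address changed over time still appears more than once. If the cursor is meant to hold exactly one row per cust_nbr, something along these lines may be closer. This is only a sketch that assumes the table and column names from the question; rows that tie on eff_dt are picked arbitrarily.

```sql
-- Sketch: one row per cust_nbr, taken from the record with the earliest eff_dt.
-- Assumes the cust table and columns named in the question.
select cust_nbr, eff_dt, name, address
  from (select c.cust_nbr, c.eff_dt, c.name, c.address,
               -- number each customer's rows from the earliest eff_dt onwards
               row_number() over (partition by c.cust_nbr order by c.eff_dt) rn
          from cust c)
 where rn = 1;
```

Because name and address are read from the same row as the minimum eff_dt, no GROUP BY is needed and ORA-00937 does not arise.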

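P.S. Note that deleting duplicates row by row (in a cursor loop) is slow-by-slow. You would be better off switching to some kind of set-based processing. A simple option is: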

delete from cust
      where (cust_nbr,
             eff_dt,
             name,
             address) not in (  select cust_nbr,
                                       min (eff_dt),
                                       name,
                                       address
                                  from cust
                              group by cust_nbr, name, address);

Answer 2

I am not sure what the cursor logic is for. I would simply delete the duplicates:

delete cust
where  rowid in
       ( select lead(rowid) over (partition by cust_nbr order by eff_dt)
         from   cust c );
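
Since the delete is irreversible once committed, it may help to preview the affected rows first and to check the result before committing. The two queries below are a minimal sketch, assuming the same cust table; the first reuses the subquery from the delete above, the second should return no rows after the delete has run.

```sql
-- Preview: the rows that the delete above would remove
-- (everything except the earliest eff_dt per cust_nbr).
select cust_nbr, eff_dt, name, address
from   cust
where  rowid in
       ( select lead(rowid) over (partition by cust_nbr order by eff_dt)
         from   cust )
order  by cust_nbr, eff_dt;

-- Check after the delete: any cust_nbr listed here still has duplicates.
select cust_nbr, count(*) as row_count
from   cust
group  by cust_nbr
having count(*) > 1;
```

If the check shows something unexpected, a ROLLBACK issued before COMMIT undoes the delete.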

Answer 3

The following should work for you:

DELETE cust c
 WHERE EXISTS (SELECT 1 FROM cust
                WHERE cust_nbr = c.cust_nbr
                  AND name     = c.name
                  AND address  = c.address
                  AND eff_dt   < c.eff_dt);

Answer 4

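If I understand correctly, you have two sets of rows to delete:

all rows from before the most recent change to the customer data (name, address, ...);
all rows after that change in which only eff_dt differs and everything else is the same.

In that case, you can use two analytic functions to find the minimum date of the most recent change in the customer data: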

create table test_tab(id number, eff_dt date, name varchar2(20), address varchar2(50));

insert into test_tab values (1, to_date('01-jul-2018', 'dd-mon-yyyy'), 'Name 1', 'Address 1');
insert into test_tab values (1, to_date('15-jul-2018', 'dd-mon-yyyy'), 'Name 1', 'Address 1');
insert into test_tab values (1, to_date('01-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (1, to_date('05-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (1, to_date('10-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (2, to_date('12-jul-2018', 'dd-mon-yyyy'), 'Name 2', 'Address 2');
insert into test_tab values (2, to_date('18-jul-2018', 'dd-mon-yyyy'), 'Name 2', 'Address 2');
insert into test_tab values (3, to_date('15-jul-2018', 'dd-mon-yyyy'), 'Name 3', 'Address 3');
insert into test_tab values (3, to_date('18-jul-2018', 'dd-mon-yyyy'), 'Name 3 changed', 'Address 3 changed');
insert into test_tab values (3, to_date('25-jul-2018', 'dd-mon-yyyy'), 'Name 3 changed again', 'Address 3 changed again');
insert into test_tab values (3, to_date('12-aug-2018', 'dd-mon-yyyy'), 'Name 3 changed again', 'Address 3 changed again');

select id, eff_dt, name, address -- , rn, min_eff_dt
  from (select id, eff_dt, name, address, -- min_eff_dt,
               -- the highest minimum date is the date of the last change in the data (apart from eff_dt);
               -- eff_dt breaks ties so that the earliest row of that latest version is kept
               row_number() over (partition by id order by min_eff_dt desc, eff_dt) rn
          from (select id, eff_dt, name, address,
                       -- earliest date of each version of the customer's data
                       min(eff_dt) over (partition by id, name, address order by eff_dt) min_eff_dt
                  from test_tab))
 where rn = 1;

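To see the results of the analytic functions, you can test the script by removing where rn = 1, adding min_eff_dt to the second select statement, and adding rn, min_eff_dt to the topmost select statement.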
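
Spelled out, that test version of the query might look like the sketch below. It assumes the test_tab table created above; the eff_dt tie-breaker in the ORDER BY of row_number() and the final ORDER BY are additions that only make the numbering and the display order deterministic.

```sql
-- Diagnostic sketch: expose rn and min_eff_dt for every row of test_tab.
select id, eff_dt, name, address, rn, min_eff_dt
  from (select id, eff_dt, name, address, min_eff_dt,
               -- rn = 1 marks the row that would be kept for each id
               row_number() over (partition by id order by min_eff_dt desc, eff_dt) rn
          from (select id, eff_dt, name, address,
                       -- earliest eff_dt of each (id, name, address) version
                       min(eff_dt) over (partition by id, name, address order by eff_dt) min_eff_dt
                  from test_tab))
 order by id, rn;
```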

You can then use a delete, as in William's answer:

delete from test_tab
 where rowid in
         (select rid
            from (select rid,
                         -- highest minimum date first: the date of the last change in the data (apart from eff_dt)
                         row_number() over (partition by id order by min_eff_dt desc, eff_dt) rn
                    from (select rowid rid, id, eff_dt, -- rowid is aliased so the outer queries can reference it
                                 -- earliest date of each version of the customer's data
                                 min(eff_dt) over (partition by id, name, address order by eff_dt) min_eff_dt
                            from test_tab))
           where rn > 1);

How do I delete duplicate rows from a table using a cursor in Oracle?
