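I want to clean up a table that contains many duplicate records. Each customer number has many records in the table, and they differ in their eff_dt (a column name).
I want to keep only one record per customer number.
To do this, I will use only the record with the minimum eff_dt for each cust_nbr as the reference. So for every cust_nbr in the table, I want to copy only the record with the minimum eff_dt into a cursor, and then compare that cursor value against the remaining records in the table.
I used the following query when creating the cursor: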
select cust_nbr, min(eff_dt), name, address from cust;
But it gives me the following error:
[Error] Execution (1:8): ORA-00937: not a single-group group function
Please help.
Answer 0 (score: 2)
The error you got means that the non-aggregated columns must be part of the GROUP BY clause, i.e.
select cust_nbr, min(eff_dt), name, address
from cust
group by cust_nbr, name, address;
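Note that grouping by name and address returns one row per distinct (cust_nbr, name, address) combination, so a customer whose name or address changed over time still produces several rows. If the goal is exactly one row per cust_nbr, a possible sketch uses the row_number analytic function instead. It assumes the cust table has only the four columns named in the question; the alias rn is introduced here for illustration.

```sql
-- Sketch: one row per cust_nbr, the one with the earliest eff_dt.
-- Assumes cust has the columns cust_nbr, eff_dt, name, address.
select cust_nbr, eff_dt, name, address
from (select c.cust_nbr, c.eff_dt, c.name, c.address,
             -- number each customer's rows from earliest to latest eff_dt
             row_number() over (partition by c.cust_nbr order by c.eff_dt) rn
      from cust c)
where rn = 1;
```

If two rows of the same customer share the minimum eff_dt, row_number picks one of them arbitrarily; add further columns to the order by if a specific one is wanted.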
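P.S. Note that deleting duplicates row by row (in a cursor loop) is slow-by-slow. You are better off switching to some kind of set-based processing. A simple option is: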
delete from cust
where (cust_nbr, eff_dt, name, address) not in
      (select cust_nbr, min(eff_dt), name, address
       from cust
       group by cust_nbr, name, address);
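One caveat with the NOT IN delete above: if name or address can be NULL, the tuple comparison evaluates to unknown for those rows, so they are never deleted. Rows that are identical in all four columns also all survive, because each of them matches the minimum. A possible variant, sketched here under the assumption that either case may occur in cust, compares on rowid instead:

```sql
-- Sketch: keep exactly one row per (cust_nbr, name, address) group,
-- namely a row with the earliest eff_dt, identified by its rowid.
delete from cust
where rowid not in
      (select min(rowid) keep (dense_rank first order by eff_dt)
       from cust
       group by cust_nbr, name, address);
```

GROUP BY places NULL values in the same group, and rowid is never NULL, so neither problem arises in this form.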
Answer 1 (score: 1)
I'm not sure what the cursor logic is for. I would simply delete the duplicates:
delete cust
where rowid in
( select lead(rowid) over (partition by cust_nbr order by eff_dt)
from cust c );
Answer 2 (score: 0)
The following should work for you:
DELETE cust c
WHERE EXISTS (SELECT 1 FROM cust
WHERE cust_nbr = c.cust_nbr
AND name = c.name
AND address = c.address
AND eff_dt < c.eff_dt)
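Answer 3 (score: 0)
If I understand correctly, you have two sets of data to delete, one of which consists of records where only eff_dt changed and everything else stayed the same. In that case you can use two analytic functions to find the minimum date of the latest change in the customer's data: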
create table test_tab(id number, eff_dt date, name varchar2(20), address varchar2(50));
insert into test_tab values (1, to_date('01-jul-2018', 'dd-mon-yyyy'), 'Name 1', 'Address 1');
insert into test_tab values (1, to_date('15-jul-2018', 'dd-mon-yyyy'), 'Name 1', 'Address 1');
insert into test_tab values (1, to_date('01-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (1, to_date('05-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (1, to_date('10-aug-2018', 'dd-mon-yyyy'), 'Name 1 changed', 'Address 1 changed');
insert into test_tab values (2, to_date('12-jul-2018', 'dd-mon-yyyy'), 'Name 2', 'Address 2');
insert into test_tab values (2, to_date('18-jul-2018', 'dd-mon-yyyy'), 'Name 2', 'Address 2');
insert into test_tab values (3, to_date('15-jul-2018', 'dd-mon-yyyy'), 'Name 3', 'Address 3');
insert into test_tab values (3, to_date('18-jul-2018', 'dd-mon-yyyy'), 'Name 3 changed', 'Address 3 changed');
insert into test_tab values (3, to_date('25-jul-2018', 'dd-mon-yyyy'), 'Name 3 changed again', 'Address 3 changed again');
insert into test_tab values (3, to_date('12-aug-2018', 'dd-mon-yyyy'), 'Name 3 changed again', 'Address 3 changed again');
select id, eff_dt, name, address -- , rn, min_eff_dt
from (select id, eff_dt, name, address, -- min_eff_dt,
             -- we need the highest minimum date: that is the date when the last change
             -- in the data took place (apart from eff_dt);
             -- eff_dt breaks ties so that the earliest row of that version gets rn = 1
             row_number() over (partition by id order by min_eff_dt desc, eff_dt) rn
      from (select id, eff_dt, name, address,
                   -- minimum dates of the customer's data changes
                   min(eff_dt) over (partition by id, name, address order by eff_dt) min_eff_dt
            from test_tab))
where rn = 1;
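You can test the script by removing where rn = 1, adding min_eff_dt to the second select statement, and adding rn, min_eff_dt to the topmost select statement, so that you can see the results of the analytic functions.
You can then use a delete like the one in William's reply: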
delete from test_tab
where rowid in
      (select rid
       from (select rid,
                    -- highest minimum date first = latest version of the data;
                    -- eff_dt breaks ties so the earliest row of that version is kept
                    row_number() over (partition by id order by min_eff_dt desc, eff_dt) rn
             from (select rowid rid, id, eff_dt, -- rowid must be aliased to pass through the inline views
                          -- minimum dates of the customer's data changes
                          min(eff_dt) over (partition by id, name, address order by eff_dt) min_eff_dt
                   from test_tab))
       where rn > 1);
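A quick way to check the outcome before committing, sketched against the test_tab sample table from this answer (the alias row_count is introduced here for illustration):

```sql
-- Any id returned here still has more than one row after the delete.
select id, count(*) as row_count
from test_tab
group by id
having count(*) > 1;

-- Inspect the surviving rows; issue ROLLBACK instead of COMMIT if they look wrong.
select id, eff_dt, name, address
from test_tab
order by id;
```

The first query should return no rows if the delete left exactly one record per id.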