Question

I have a data set like the following:

Identifier | Revenue | Good Inflow
-----------------------------------
  abc123   |   20    |  15
  abc124   |   10    |   5
  abc124   |    5    |   5

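As you can see, there are two rows with the same identifier, different revenue, but the same (redundant) good inflow. What I would like to achieve is to eliminate the second row with the same identifier, but add its revenue to the revenue of the first row. The result should therefore be: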

abc124 | 15 | 5

Is this possible? If so, which command do I need? I use Oracle SQL Developer.

Thanks in advance!

Felix

Answer 1

You can do this with a merge. Using a dummy table to mimic your data:

create table t42 (identifier varchar2(10), revenue number, good_inflow number);
insert into t42 values ('abc123', 20, 15);
insert into t42 values ('abc124', 10, 5);
insert into t42 values ('abc124', 5, 5);

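You can use analytic functions to get the total as an extra column for each row, along with a pseudo row number and a count of the rows in each group: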

select t42.*,
  sum(revenue) over (partition by identifier, good_inflow) as total_revenue,
  row_number() over (partition by identifier, good_inflow order by rowid) as rn,
  count(*) over (partition by identifier, good_inflow) as cnt
from t42;

IDENTIFIER REVENUE GOOD_INFLOW TOTAL_REVENUE   RN  CNT
---------- ------- ----------- ------------- ---- ----
abc123          20          15            20    1    1
abc124          10           5            15    1    2
abc124           5           5            15    2    2

You can then use that as the using clause of a merge statement, adding the rowid of the base table to the on condition, and using the generated cnt to update only the rows that have duplicates (so any identifier/inflow pair that appears only once isn't pointlessly updated to the same value):

merge into (select t42.*, rowid from t42) t42
using (
  select t42.*, rowid,
    sum(revenue) over (partition by identifier, good_inflow) as total_revenue,
    row_number() over (partition by identifier, good_inflow order by rowid) as rn,
    count(*) over (partition by identifier, good_inflow) as cnt
  from t42
) tmp
on (tmp.rowid = t42.rowid and tmp.cnt > 1)
when matched then
  update set revenue = case when tmp.rn = 1 then tmp.total_revenue else null end
  delete where (revenue is null);

2 rows merged.

The delete clause works against the updated values; I've used a case expression to set the revenue to null for all but the notional first row of each combination. Those null rows are then deleted.
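To confirm the outcome, a simple query against the dummy table should be enough. This is only a check, assuming the merge ran as reported against the three sample rows of t42:

```sql
-- Inspect the table after the merge; order by makes the row order deterministic
select identifier, revenue, good_inflow
from t42
order by identifier;

-- IDENTIFIER REVENUE GOOD_INFLOW
-- ---------- ------- -----------
-- abc123          20          15
-- abc124          15           5
```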

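This does assume, though, that the revenue values cannot be null to begin with, because a null revenue is what marks a row for deletion. If revenue can be null, the statement could be adapted to cope with that.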
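One possible adaptation, offered as a sketch rather than a tested statement: let the delete clause test the row number from the source query instead of a null revenue, so a null never has to act as the marker. The alias rid is a name introduced here for illustration, and the statement assumes the same t42 table as above.

```sql
merge into t42
using (
  select rowid as rid,  -- rid is an illustrative alias for the base table's rowid
    sum(revenue) over (partition by identifier, good_inflow) as total_revenue,
    row_number() over (partition by identifier, good_inflow order by rowid) as rn,
    count(*) over (partition by identifier, good_inflow) as cnt
  from t42
) tmp
on (t42.rowid = tmp.rid and tmp.cnt > 1)
when matched then
  -- every row in a duplicated group gets the group total...
  update set revenue = tmp.total_revenue
  -- ...and all but the notional first row are then removed
  delete where (tmp.rn > 1);
```

Because sum() ignores nulls, a group whose revenues are partly null still gets the total of its non-null values, and the surviving row is chosen by row number rather than by its revenue.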
