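Oracle UPDATE based on a join of 2 other tables

Time: 2016-04-21 15:16:21

Tags: sql oracle

I need to update a huge table, >1 billion records (POS data), with a key taken from another table that is reached through a join with a third table. I can break the update up by date, since the data goes back several years. Basically, I need to replace f.retail_sku_key in table edw.f_pos_daily with dedup.retail_sku_key wherever the two differ. Thanks!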

select  F.POS_KEY, f.retail_sku_key , dedup.retail_sku_key dedup_key 
from edw.f_pos_daily f,edw.d_retail_sku sku, edw.d_retail_sku_new dedup
where f.retail_sku_key = sku.retail_sku_key
and sku.retail_sku = dedup.retail_sku
and sku.mtd_item_number = dedup.mtd_item_number
and sku.retailer = dedup.retailer
and f.retail_sku_key <> dedup.retail_sku_key

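1 answer:

Answer 0: (score: 0)

While an UPDATE equivalent is probably possible, I prefer MERGE when a single SQL statement both drives which rows need to be updated and generates the values to update them with.

So, how about something like this? (I'm assuming f.pos_key is a unique identifier on the f_pos_daily table. If that's not the case, and the query returns multiple rows for the same pos_key value, the statement will fail.)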

MERGE INTO edw.f_pos_daily f_main
USING (
select f.pos_key -- this is for joining back to the rows that need to be updated...
     , dedup.retail_sku_key dedup_key -- ...and this is the value to update them with
  from edw.f_pos_daily f
     , edw.d_retail_sku sku
     , edw.d_retail_sku_new dedup
 where f.retail_sku_key = sku.retail_sku_key
   and sku.retail_sku = dedup.retail_sku
   and sku.mtd_item_number = dedup.mtd_item_number
   and sku.retailer = dedup.retailer
   and f.retail_sku_key <> dedup.retail_sku_key 
    ) qry
ON (f_main.pos_key = qry.pos_key)
WHEN MATCHED THEN
   UPDATE SET f_main.retail_sku_key = qry.dedup_key
;
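Before running the MERGE above, the uniqueness assumption can be checked with a query along these lines. This is only a sketch: it reuses the same join and looks for any pos_key that the source query would return more than once. The alias match_count is made up for illustration. If the query returns no rows, the source is unambiguous for the rows that would be updated. If it returns rows, the MERGE would be expected to fail for those keys, typically with ORA-30926.

```sql
-- Sketch: find pos_key values that the MERGE source query would return
-- more than once. match_count is a hypothetical alias.
select f.pos_key
     , count(*) as match_count
  from edw.f_pos_daily f
     , edw.d_retail_sku sku
     , edw.d_retail_sku_new dedup
 where f.retail_sku_key = sku.retail_sku_key
   and sku.retail_sku = dedup.retail_sku
   and sku.mtd_item_number = dedup.mtd_item_number
   and sku.retailer = dedup.retailer
   and f.retail_sku_key <> dedup.retail_sku_key
 group by f.pos_key
having count(*) > 1
;
```

On a table of this size the check is itself a large query, so it may be worth limiting it to one date range or partition at a time.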

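If you do need to break it up into separate updates, you can split it in one of two ways:

1) Isolate a single partition of f_pos_daily in the inner query (assuming the table is partitioned by something other than retail_sku_key), e.g. FROM edw.f_pos_daily PARTITION (p_some_partition_name), and run the statement above once per partition.

2) Generate ranges of rows to update (again, assuming pos_key is unique), each covering, say, 10% of the rows: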

SELECT MIN(pos_key) p0,
PERCENTILE_DISC(0.1) WITHIN GROUP (ORDER BY pos_key) p1,
PERCENTILE_DISC(0.2) WITHIN GROUP (ORDER BY pos_key) p2,
PERCENTILE_DISC(0.3) WITHIN GROUP (ORDER BY pos_key) p3,
PERCENTILE_DISC(0.4) WITHIN GROUP (ORDER BY pos_key) p4,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY pos_key) p5,
PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY pos_key) p6,
PERCENTILE_DISC(0.7) WITHIN GROUP (ORDER BY pos_key) p7,
PERCENTILE_DISC(0.8) WITHIN GROUP (ORDER BY pos_key) p8,
PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY pos_key) p9,
MAX(pos_key) p10
FROM edw.f_pos_daily;

If the key values range from 0 to 1000 (over some unknown number of rows), this gives you output like the following:

P0  P1  P2  P3  P4  P5  P6  P7  P8  P9  P10
0   104 183 319 402 512 607 723 810 914 1000

From there you just need to add one more condition to the subquery:

AND f.pos_key BETWEEN 0 AND 104

on the first run,

AND f.pos_key BETWEEN 105 AND 183

on the second run, and so on.
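Putting the pieces together, one batch of the range-based approach might look like the sketch below. It is the same MERGE with the extra range condition added to the subquery. The bind variables :lo and :hi are hypothetical names for one pair of boundaries from the percentile output (0 and 104 on the first run, 105 and 183 on the second, and so on). Committing after every batch is an assumption about how you want to manage undo and locking, not something stated above.

```sql
-- Sketch: one batch of the range-based update.
-- :lo and :hi are hypothetical bind variables holding one boundary pair.
MERGE INTO edw.f_pos_daily f_main
USING (
select f.pos_key
     , dedup.retail_sku_key dedup_key
  from edw.f_pos_daily f  -- for option 1, a PARTITION (...) clause would go after the table name instead
     , edw.d_retail_sku sku
     , edw.d_retail_sku_new dedup
 where f.retail_sku_key = sku.retail_sku_key
   and sku.retail_sku = dedup.retail_sku
   and sku.mtd_item_number = dedup.mtd_item_number
   and sku.retailer = dedup.retailer
   and f.retail_sku_key <> dedup.retail_sku_key
   and f.pos_key between :lo and :hi  -- the extra range condition
    ) qry
ON (f_main.pos_key = qry.pos_key)
WHEN MATCHED THEN
   UPDATE SET f_main.retail_sku_key = qry.dedup_key
;

-- Assumption: commit each batch before starting the next one.
COMMIT;
```

BETWEEN is inclusive at both ends, so each run should start one above the previous upper boundary, as in the 0 to 104 and 105 to 183 example. Because rows that have already been fixed no longer satisfy the <> condition, re-running a batch should simply find nothing left to update.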