How do I write this query correctly in Amazon Redshift?

Asked: 2019-04-29 21:17:36

Tags: sql amazon-redshift

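I want to write an update query that updates a count column in a table, but I'm not sure how to go about it. I've narrowed it down to three options, but I keep running into one problem or another with each of them. Which approach is correct, and what is the correct query for it?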

update fact_spv_commissioned_lot
set sn_count = fact_spv_commissioned_lot.sn_count + 
(
  SELECT COUNT(*) FROM staging_serials s
  JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
  JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
  JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
  JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
  WHERE c.sk_company_id = f.sk_company_id
  AND s.lotnumber = f.lot_number
  AND p.sk_product_id = f.sk_product_id
  AND l.sk_packaging_level_id = f.sk_packaging_level_id
)

Or is this the correct way to write it?

update fact_spv_commissioned_lot
set sn_count = fact_spv_commissioned_lot.sn_count + 
(
  SELECT COUNT(*) FROM staging_serials s
  JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
  JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
  JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
  JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
  WHERE c.sk_company_id = f.sk_company_id
  AND s.lotnumber = f.lot_number
  AND p.sk_product_id = f.sk_product_id
  AND l.sk_packaging_level_id = f.sk_packaging_level_id
)
FROM staging_serials s
  JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
  JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
  JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
  JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
  WHERE c.sk_company_id = f.sk_company_id
  AND s.lotnumber = f.lot_number
  AND p.sk_product_id = f.sk_product_id
  AND l.sk_packaging_level_id = f.sk_packaging_level_id

Or is this the correct way to write it?

update fact_spv_commissioned_lot
set sn_count = fact_spv_commissioned_lot.sn_count + 
(
  SELECT COUNT(*) FROM staging_serials s
  JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
  JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
  JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
  JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
)
  WHERE c.sk_company_id = f.sk_company_id
  AND s.lotnumber = f.lot_number
  AND p.sk_product_id = f.sk_product_id
  AND l.sk_packaging_level_id = f.sk_packaging_level_id

1 answer:

Answer 0 (score: 1)

I personally like CTEs a lot, but your first query is almost there.

The CTE version looks like this (replace <pk-col> with your actual primary key column):

WITH
    agg_data (pk, count) AS (
        SELECT f.<pk-col>, COUNT(*)
        FROM staging_serials s
            JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
            JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
            JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
            JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
        WHERE c.sk_company_id = f.sk_company_id
            AND s.lotnumber = f.lot_number
            AND p.sk_product_id = f.sk_product_id
            AND l.sk_packaging_level_id = f.sk_packaging_level_id
        GROUP BY 1
    )
UPDATE fact_spv_commissioned_lot AS to_update
SET sn_count = sn_count + agg_data.count
FROM agg_data WHERE agg_data.pk = to_update.<pk-col>;

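As an alternative, you can match on the original join columns that relate the subselect to the table fact_spv_commissioned_lot, which stands in for the correlation of the removed JOIN (f), for example: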

WITH
    agg_data (sk_company_id, lot_number, sk_product_id, sk_packaging_level_id, count) AS (
        SELECT f.sk_company_id, f.lot_number, f.sk_product_id, f.sk_packaging_level_id, COUNT(*)
        FROM staging_serials s
            JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
            JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
            JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
            JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
        WHERE c.sk_company_id = f.sk_company_id
            AND s.lotnumber = f.lot_number
            AND p.sk_product_id = f.sk_product_id
            AND l.sk_packaging_level_id = f.sk_packaging_level_id
        GROUP BY 1, 2, 3, 4
    )
UPDATE fact_spv_commissioned_lot AS to_update
SET sn_count = sn_count + agg_data.count
FROM agg_data
WHERE agg_data.sk_company_id = to_update.sk_company_id
    AND agg_data.lot_number = to_update.lot_number
    AND agg_data.sk_product_id = to_update.sk_product_id
    AND agg_data.sk_packaging_level_id = to_update.sk_packaging_level_id
;
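
Before running either CTE version for real, it can help to preview what the update would write. The following is only a sketch, assuming the four join columns identify a single row of fact_spv_commissioned_lot (if they do not, the counts are inflated by the duplicates). It reuses the same aggregation as a plain SELECT and shows the current and the prospective sn_count side by side. The names serial_count, current_sn_count and new_sn_count are introduced here for illustration only.

```sql
-- Dry run: same aggregation as the CTE above, but it only reads data.
WITH
    agg_data (sk_company_id, lot_number, sk_product_id, sk_packaging_level_id, serial_count) AS (
        SELECT f.sk_company_id, f.lot_number, f.sk_product_id, f.sk_packaging_level_id, COUNT(*)
        FROM staging_serials s
            JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
            JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
            JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
            JOIN fact_spv_commissioned_lot f ON (f.sk_company_id = s.companyid)
        WHERE c.sk_company_id = f.sk_company_id
            AND s.lotnumber = f.lot_number
            AND p.sk_product_id = f.sk_product_id
            AND l.sk_packaging_level_id = f.sk_packaging_level_id
        GROUP BY 1, 2, 3, 4
    )
SELECT t.sk_company_id, t.lot_number, t.sk_product_id, t.sk_packaging_level_id,
       t.sn_count AS current_sn_count,                -- value stored today
       t.sn_count + a.serial_count AS new_sn_count    -- value the UPDATE would store
FROM fact_spv_commissioned_lot t
    JOIN agg_data a
        ON  a.sk_company_id = t.sk_company_id
        AND a.lot_number = t.lot_number
        AND a.sk_product_id = t.sk_product_id
        AND a.sk_packaging_level_id = t.sk_packaging_level_id
ORDER BY 1, 2, 3, 4;
```

Rows of fact_spv_commissioned_lot that have no matching staging rows do not appear in this preview, and the CTE-based UPDATE leaves them untouched as well.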

...or, shorter still, in plain subselect style:

UPDATE fact_spv_commissioned_lot AS to_update
SET sn_count = sn_count + (
    SELECT COUNT(*)
    FROM staging_serials s
        JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
        JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
        JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
    WHERE s.companyid = to_update.sk_company_id
        AND s.lotnumber = to_update.lot_number
        AND c.sk_company_id = to_update.sk_company_id
        AND p.sk_product_id = to_update.sk_product_id
        AND l.sk_packaging_level_id = to_update.sk_packaging_level_id
);
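
To see why the subselect has to reference the row being updated, rather than join a second copy of the fact table as the queries in the question do, here is a toy sketch. The tables demo_lot and demo_serials are hypothetical stand-ins, not part of the original schema.

```sql
-- Hypothetical toy tables, used only to illustrate the correlation.
CREATE TEMP TABLE demo_lot (lot_number VARCHAR(10), sn_count INT);
CREATE TEMP TABLE demo_serials (lotnumber VARCHAR(10));

INSERT INTO demo_lot VALUES ('A', 0), ('B', 0), ('C', 0);
INSERT INTO demo_serials VALUES ('A'), ('A'), ('B');

-- Uncorrelated: the inner f is a second copy of demo_lot, so the subquery
-- never looks at the outer row d and returns the same total (3) for every lot.
SELECT d.lot_number,
       (SELECT COUNT(*)
        FROM demo_serials s
            JOIN demo_lot f ON (f.lot_number = s.lotnumber)) AS uncorrelated_count
FROM demo_lot d
ORDER BY d.lot_number;

-- Correlated: the subquery references the row being updated (demo_lot),
-- so each lot receives only its own serial count.
UPDATE demo_lot
SET sn_count = sn_count + (
    SELECT COUNT(*)
    FROM demo_serials s
    WHERE s.lotnumber = demo_lot.lot_number
);

-- Result: A = 2, B = 1, C = 0
SELECT lot_number, sn_count FROM demo_lot ORDER BY lot_number;
```

Note that lot C is still rewritten (with + 0), because an UPDATE without a WHERE clause touches every row. The CTE versions with a FROM clause only update rows that have a match.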

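If your tables are medium to large (millions to billions of rows), the CTE versions should also perform better (especially the first variant, which uses the primary key column), although they are more verbose in SQL.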
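
The performance remark is easiest to judge on your own data. As a rough sketch, Redshift's EXPLAIN can be put in front of an UPDATE to show the query plan without executing the statement. Here it is applied to the subselect version; the same prefix can be tried on the CTE versions for comparison. Which plan wins depends on table sizes, distribution keys and sort keys, so treat the output as a hint rather than a verdict.

```sql
-- Shows the plan only; no rows are modified.
EXPLAIN
UPDATE fact_spv_commissioned_lot AS to_update
SET sn_count = sn_count + (
    SELECT COUNT(*)
    FROM staging_serials s
        JOIN dim_md_company c ON (c.lsc_company_id = s.companyid)
        JOIN staging_product p ON (s.compositeproductcode = p.compositeproductcode)
        JOIN dim_packaging_level l ON (l.unit_of_measure = p.packaginguom)
    WHERE s.companyid = to_update.sk_company_id
        AND s.lotnumber = to_update.lot_number
        AND c.sk_company_id = to_update.sk_company_id
        AND p.sk_product_id = to_update.sk_product_id
        AND l.sk_packaging_level_id = to_update.sk_packaging_level_id
);
```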