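I'm rebuilding our AWS bills in a cost-reporting tool we are building, and I need some guidance on how to perform this kind of update in Postgres.
The AWS bill lives in the table 'BillingData', and each row covers one 'ResourceId' for one hour.
For example, we have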
ResourceId|BlendedCost|user:Product|UsageStartDate
i-34r8uefg | 0.8763 |<null>|04-01-01 01:00
i-34r8uefg | 0.8763 |AwesomeProductTag|04-01-01 02:00
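This shows that in hour 01 the instance was not tagged, but in hour 02 it was. We have 1000 rows like this.
What I want to do is: for every row where the "user:Product" column is NULL, fill that column with the value found elsewhere in the table for the same "ResourceId".
In more concrete terms, someone created 'i-34r8uefg' without tagging it properly, but tagged it later. I have the following query, which returns the rows for instances that were untagged in one hour but tagged in a different hour: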
select "ResourceId","user:Product" from billingdata
where "user:Product" NOTNULL
and "ResourceId" in
(select DISTINCT "ResourceId"
from billingdata
where "user:Product" ISNULL);
I want to take the hours (rows) where "user:Product" is null and set it to the value that exists later in the table.
Answer 0 (score: 0)
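Assuming your requirements are:
the target and source rows have the same ResourceId,
the target row's user:Product is NULL,
the source row's user:Product is not NULL,
...then you can use this: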
UPDATE "BillingData" AS "Target" SET
"user:Product" = "Source"."user:Product"
FROM "BillingData" AS "Source"
WHERE "Source"."ResourceId" = "Target"."ResourceId"
AND "Target"."user:Product" IS NULL
AND "Source"."user:Product" IS NOT NULL
AND "Target"."UsageStartDate" < "Source"."UsageStartDate"
;
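Note that if you have two source rows with the same ResourceId but different non-null user:Product values, it is not predictable which of them will be used as the update source. You should check the source rows for uniqueness beforehand with a query like this: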
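SELECT
"ResourceId"
FROM "BillingData"
WHERE "user:Product" IS NOT NULL
GROUP BY "ResourceId"
-- count distinct tag values: hourly rows that repeat one tag are not a conflict
HAVING COUNT(DISTINCT "user:Product") > 1;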
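If the check does return rows, it can help to see which tag values are in conflict before deciding how to resolve them. The following is a small variation on the check, not part of the original answer; the output column names distinct_tags and tags are made up for illustration.

```sql
-- Sketch: list each ambiguous resource together with its competing tag values.
-- "distinct_tags" and "tags" are illustrative column aliases.
SELECT
    "ResourceId",
    COUNT(DISTINCT "user:Product")     AS distinct_tags,
    array_agg(DISTINCT "user:Product") AS tags
FROM "BillingData"
WHERE "user:Product" IS NOT NULL
GROUP BY "ResourceId"
HAVING COUNT(DISTINCT "user:Product") > 1;
```

An empty result means every tagged resource carries exactly one tag value, so the update has only one possible source value per resource.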
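...or use that uniqueness check as a filter predicate in the update itself to avoid the problem (although this does not completely solve the original problem), like this: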
UPDATE "BillingData" AS "Target" SET
"user:Product" = "Source"."user:Product"
FROM "BillingData" AS "Source"
WHERE "Source"."ResourceId" = "Target"."ResourceId"
AND "Target"."user:Product" IS NULL
AND "Source"."user:Product" IS NOT NULL
AND "Target"."UsageStartDate" < "Source"."UsageStartDate"
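AND NOT EXISTS (
-- skip resources that carry more than one distinct non-null tag;
-- SELECT 1 because SELECT * is rejected when HAVING is used without GROUP BY
SELECT 1
FROM "BillingData" AS "Dup"
WHERE "Dup"."user:Product" IS NOT NULL
AND "Dup"."ResourceId" = "Source"."ResourceId"
HAVING COUNT(DISTINCT "Dup"."user:Product") > 1
)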
;
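The filter above leaves ambiguous resources untouched. Another option, which is not part of the original answer, is to make the choice deterministic: for each untagged row, take the tag from the earliest later tagged row of the same resource. This is only a sketch, and it assumes "UsageStartDate" is stored in a type that sorts chronologically (for example timestamp); if it is stored as text, the ordering may not be what you expect.

```sql
-- Sketch: deterministic backfill using a correlated subquery.
-- Assumes "UsageStartDate" sorts chronologically (e.g. a timestamp column).
UPDATE "BillingData" AS "Target" SET
    "user:Product" = (
        SELECT "Source"."user:Product"
        FROM "BillingData" AS "Source"
        WHERE "Source"."ResourceId" = "Target"."ResourceId"
          AND "Source"."user:Product" IS NOT NULL
          AND "Source"."UsageStartDate" > "Target"."UsageStartDate"
        -- earliest later tagged hour wins; the tag itself breaks ties
        ORDER BY "Source"."UsageStartDate", "Source"."user:Product"
        LIMIT 1
    )
WHERE "Target"."user:Product" IS NULL
  -- only touch rows that actually have a later tagged row to copy from
  AND EXISTS (
        SELECT 1
        FROM "BillingData" AS "Source"
        WHERE "Source"."ResourceId" = "Target"."ResourceId"
          AND "Source"."user:Product" IS NOT NULL
          AND "Source"."UsageStartDate" > "Target"."UsageStartDate"
  );
```

Running it inside a transaction (BEGIN, then inspecting the affected rows before COMMIT or ROLLBACK) is a cheap way to confirm the result on real billing data first.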