How do I insert several columns from one table into another table with only one column unique/distinct?

Time: 2019-06-03 16:39:22

Tags: sql db2 derby

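I'm trying to build a star schema and am currently working on the dimension tables. I want to copy several columns from one table to another, but I also want the resulting rows to be unique on one of those columns.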

These are the tables I'm working with: DWH_PRICE_PAID_RECORDS

CREATE TABLE "DWH_PRICE_PAID_RECORDS" ("TRANSACTION_ID" VARCHAR(50) NOT NULL, "PRICE" INTEGER, "DATE_OF_TRANSFER" DATE NOT NULL, "PROPERTY_TYPE" CHAR(1), "OLD_NEW" CHAR(1), "DURATION" CHAR(1), "TOWN_CITY" VARCHAR(50), "DISTRICT" VARCHAR(50), "COUNTY" VARCHAR(50), "PPDCATEGORY_TYPE" CHAR(1), "RECORD_TYPE" CHAR(1));

ALTER TABLE "DWH_PRICE_PAID_RECORDS" ADD CONSTRAINT "PK3" PRIMARY KEY ("TRANSACTION_ID");

and DIM_REGION

CREATE TABLE "DIM_REGION" ("REGION_ID" INTEGER generated always as identity (start with 1 increment by 1), "TRANSACTION_ID" VARCHAR(50), "TOWN" VARCHAR(50), "COUNTY" VARCHAR(50), "DISTRICT" VARCHAR(50), "LATITUDE" VARCHAR(50), "LONGITUDE" VARCHAR(50), "COUNTRY_STRING" VARCHAR(50));

ALTER TABLE "DIM_REGION" ADD CONSTRAINT "PK8" PRIMARY KEY ("REGION_ID");

My first attempt was to use SELECT DISTINCT, but that only removes rows that are duplicates across all selected columns combined. I want a region dimension in which the town is the identifier, so that DIM_REGION matches the fact table (called DM_PRICE_PAID_RECORDS) that will later be created in the data mart.

The DWH_PRICE_PAID_RECORDS table has about 10,000 records but only 938 unique towns. I want those 938 towns in DIM_REGION, with the town as the identifier, together with the other columns (county, district and so on).

This works, but of course everything except the town is NULL:

INSERT INTO DIM_REGION (TOWN) SELECT (town_city) from DWH_PRICE_PAID_RECORDS GROUP BY town_city;

So I thought I just had to add the other columns:

INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city;

But when I do that, I get an error message (it was in German, sorry, I had to translate it).

Can you help me, or do you have another idea for how I could get the result I want?

Thank you very much!

2 answers:

Answer 0: (score: 1)

You were so close!

INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city, county, district;

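That should do it. When you use GROUP BY, everything in the SELECT list that is not an aggregate must appear in the GROUP BY clause.

By the way, does TRANSACTION_ID really belong in a dimension table?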
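One thing worth checking before relying on the three-column GROUP BY: it returns one row per distinct combination of town_city, county and district, so a town that occurs with more than one county or district still produces several rows. A minimal diagnostic sketch, assuming the table and column names from the question (the alias t and the column name combinations are only illustrative):

```sql
-- Diagnostic sketch: list towns that occur with more than one
-- (county, district) combination in the source table.
SELECT t.town_city, COUNT(*) AS combinations
FROM (
    -- one row per distinct (town, county, district) triple
    SELECT DISTINCT town_city, county, district
    FROM DWH_PRICE_PAID_RECORDS
) AS t
GROUP BY t.town_city
HAVING COUNT(*) > 1;
```

If this returns no rows, grouping by all three columns yields exactly one row per town; otherwise the towns listed need a rule for choosing which county and district to keep.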

Answer 1: (score: 1)

If it doesn't matter which values end up in the other two columns, you can do the following:

INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, MAX(county), MAX(district)
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city;

That gives you only one row per town.
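Note that MAX(county) and MAX(district) are evaluated independently, so the two values in a result row may come from different source rows. If county and district should come from one and the same record, one option is to pick a single representative row per town, for example the one with the smallest TRANSACTION_ID. A sketch under that assumption (the aliases r and f and the name first_id are illustrative, and choosing the smallest TRANSACTION_ID is an arbitrary rule, not something the question specifies):

```sql
-- Sketch: one row per town, with county and district taken from the
-- same source record (the one with the smallest TRANSACTION_ID).
INSERT INTO DIM_REGION (TOWN, COUNTY, DISTRICT)
SELECT r.town_city, r.county, r.district
FROM DWH_PRICE_PAID_RECORDS r
JOIN (
    -- representative record per town
    SELECT town_city, MIN(transaction_id) AS first_id
    FROM DWH_PRICE_PAID_RECORDS
    GROUP BY town_city
) f
  ON r.transaction_id = f.first_id;
```

Because TRANSACTION_ID is the primary key of DWH_PRICE_PAID_RECORDS, each first_id matches exactly one row, so the insert produces one row per distinct town.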