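I am trying to build a star schema and am currently working on the dimension tables. I want to copy several columns from one table into another, but the resulting rows should be unique by just one of those columns.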
These are the tables I am using: DWH_PRICE_PAID_RECORDS
CREATE TABLE "DWH_PRICE_PAID_RECORDS" ("TRANSACTION_ID" VARCHAR(50) NOT NULL, "PRICE" INTEGER, "DATE_OF_TRANSFER" DATE NOT NULL, "PROPERTY_TYPE" CHAR(1), "OLD_NEW" CHAR(1), "DURATION" CHAR(1), "TOWN_CITY" VARCHAR(50), "DISTRICT" VARCHAR(50), "COUNTY" VARCHAR(50), "PPDCATEGORY_TYPE" CHAR(1), "RECORD_TYPE" CHAR(1));
ALTER TABLE "DWH_PRICE_PAID_RECORDS" ADD CONSTRAINT "PK3" PRIMARY KEY ("TRANSACTION_ID");
and DIM_REGION
CREATE TABLE "DIM_REGION" ("REGION_ID" INTEGER generated always as identity (start with 1 increment by 1), "TRANSACTION_ID" VARCHAR(50), "TOWN" VARCHAR(50), "COUNTY" VARCHAR(50), "DISTRICT" VARCHAR(50), "LATITUDE" VARCHAR(50), "LONGITUDE" VARCHAR(50), "COUNTRY_STRING" VARCHAR(50));
ALTER TABLE "DIM_REGION" ADD CONSTRAINT "PK8" PRIMARY KEY ("REGION_ID");
My first attempt was to use SELECT DISTINCT, but that only removes rows that are duplicates across all of the selected columns combined. I want a region dimension in which the town is the identifier, so that DIM_REGION can be matched to the fact table (called DM_PRICE_PAID_RECORDS) that I will create later in the data mart.
The DWH_PRICE_PAID_RECORDS table has about 10,000 records, but only 938 unique towns. I want those 938 towns in DIM_REGION as the identifier, together with the other columns (county, district, and so on).
This works, but of course everything except the town is NULL:
INSERT INTO DIM_REGION (TOWN) SELECT (town_city) from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
So I thought I just needed to add the other columns:
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city;
But when I do that, I get an error message (the message is in German, sorry, I had to translate it).
Can you help me, or do you have another idea for how I can get the result I want?
Thank you very much!
Answer 0 (score: 1)
You are so close!
INSERT INTO DIM_REGION (TOWN, County, District) SELECT town_city, county, district from DWH_PRICE_PAID_RECORDS GROUP BY town_city, county, district;
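That should do it. When you use GROUP BY, everything in the SELECT list that is not an aggregate must also appear in the GROUP BY clause.
By the way, does TRANSACTION_ID really belong in a dimension table?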
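One consequence of grouping by all three columns is worth checking against the "one row per town" goal. The sketch below uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the question appears to use a different DBMS), with a cut-down table and invented sample rows. It suggests that the result has one row per distinct (town, county, district) combination, so a town that spans several districts shows up more than once.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DWH_PRICE_PAID_RECORDS (town_city TEXT, county TEXT, district TEXT)"
)

# Invented sample rows: LONDON appears with two different districts.
conn.executemany(
    "INSERT INTO DWH_PRICE_PAID_RECORDS VALUES (?, ?, ?)",
    [
        ("LONDON", "GREATER LONDON", "CAMDEN"),
        ("LONDON", "GREATER LONDON", "CAMDEN"),
        ("LONDON", "GREATER LONDON", "HACKNEY"),
        ("LEEDS", "WEST YORKSHIRE", "LEEDS"),
    ],
)

# Every non-aggregated column in the SELECT list is also in GROUP BY.
grouped_rows = conn.execute(
    "SELECT town_city, county, district FROM DWH_PRICE_PAID_RECORDS "
    "GROUP BY town_city, county, district "
    "ORDER BY town_city, district"
).fetchall()

distinct_towns = conn.execute(
    "SELECT COUNT(DISTINCT town_city) FROM DWH_PRICE_PAID_RECORDS"
).fetchone()[0]

print(len(grouped_rows), "grouped rows for", distinct_towns, "distinct towns")
```

If the number of grouped rows in the real table is larger than 938, the town alone will not be unique in DIM_REGION with this query.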
Answer 1 (score: 1)
If the exact values of the other two columns do not matter, you can do this:
INSERT INTO DIM_REGION (TOWN, County, District)
SELECT town_city, MAX(county), MAX(district)
FROM DWH_PRICE_PAID_RECORDS
GROUP BY town_city
That gives you exactly one row per town.
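A minimal sketch of this approach, again using Python's sqlite3 module as a stand-in for the real database (an assumption), with simplified tables and invented sample rows. It also illustrates a caveat: each MAX is computed independently, so the county and district stored for a town may come from different source rows.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DWH_PRICE_PAID_RECORDS (town_city TEXT, county TEXT, district TEXT);
    -- AUTOINCREMENT key plays the role of the identity column REGION_ID.
    CREATE TABLE DIM_REGION (
        REGION_ID INTEGER PRIMARY KEY AUTOINCREMENT,
        TOWN TEXT, COUNTY TEXT, DISTRICT TEXT
    );
""")

# Invented sample rows: the same town name occurs with two counties.
conn.executemany(
    "INSERT INTO DWH_PRICE_PAID_RECORDS VALUES (?, ?, ?)",
    [
        ("NEWPORT", "GWENT", "NEWPORT"),
        ("NEWPORT", "ISLE OF WIGHT", "ISLE OF WIGHT"),
        ("LEEDS", "WEST YORKSHIRE", "LEEDS"),
        ("LEEDS", "WEST YORKSHIRE", "LEEDS"),
    ],
)

# One output row per town; the other columns are collapsed with MAX.
conn.execute("""
    INSERT INTO DIM_REGION (TOWN, COUNTY, DISTRICT)
    SELECT town_city, MAX(county), MAX(district)
    FROM DWH_PRICE_PAID_RECORDS
    GROUP BY town_city
""")

dim_rows = conn.execute(
    "SELECT TOWN, COUNTY, DISTRICT FROM DIM_REGION ORDER BY TOWN"
).fetchall()
print(dim_rows)
```

In this sample the NEWPORT row ends up with county ISLE OF WIGHT and district NEWPORT, a pairing that exists in no source row. That is fine when the extra columns are only descriptive, but it is a reason to prefer the full GROUP BY when the combination has to be accurate.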