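I have address and zip code columns that are inconsistent for the same premise ID, and I want to replace the incorrect addresses with the most common address for each shared premise ID.
For example, the original table might look like the one below, and I would like the Street and Zip columns to be consistent for each premise.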
Date | Premise | House_No | Street | Zip
-----------------------------------------------------------
Jan | 43219 | 123 | E Haywood Dr | 31214
Feb | 43219 | 123 | Haywood Dr E | 31214-3291
Mar | 43219 | 123 | E Haywood Dr | 31214
Apr | 43219 | 123 | Haywood Dr E | 31214-3291
May | 43219 | 123 | E Haywood Dr | 31214
Jan | 43111 | 456 | W Simpson Wy | 31202
Feb | 43111 | 456 | W Simpson Wy | 31202
Mar | 43111 | 456 | W Simpson Wy | 31202
Apr | 43111 | 456 | Simpson Wy W | 31202-1022
May | 43111 | 456 | W Simpson Wy | 31202
Answer 0 (score: 0)
Try it with an updatable CTE:
DECLARE @tbl TABLE (Mnth VARCHAR(100),Premise INT, House_No INT,Street VARCHAR(100),Zip VARCHAR(100));
INSERT INTO @tbl VALUES
('Jan',43219,123,'E Haywood Dr','31214')
,('Feb',43219,123,'Haywood Dr E','31214-3291')
,('Mar',43219,123,'E Haywood Dr','31214')
,('Apr',43219,123,'Haywood Dr E','31214-3291')
,('May',43219,123,'E Haywood Dr','31214')
,('Jan',43111,456,'W Simpson Wy','31202')
,('Feb',43111,456,'W Simpson Wy','31202')
,('Mar',43111,456,'W Simpson Wy','31202')
,('Apr',43111,456,'Simpson Wy W','31202-1022')
,('May',43111,456,'W Simpson Wy','31202');
-- The first CTE only does the grouped count:
WITH Counted AS
(
SELECT COUNT(Premise) AS [Counter]
,Premise
,House_No
,Street
,Zip
FROM @tbl
GROUP BY Premise,House_No,Street,Zip
)
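-- The second CTE finds the row with the highest count per premise
-- Note: if several variants share the same count, which one gets picked is arbitrary...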
,MostCommon AS
(
SELECT *
,ROW_NUMBER() OVER(PARTITION BY Premise ORDER BY [Counter] DESC) AS MaxCounter
FROM Counted
)
-- This CTE is updatable: it pairs the actual table rows with the new values
,UpdateableCTE AS
(
SELECT tbl.*
,mc.House_No AS NewHouse_No
,mc.Street AS NewStreet
,mc.Zip AS NewZip
FROM @tbl AS tbl
INNER JOIN MostCommon AS mc ON mc.MaxCounter=1 AND mc.Premise=tbl.Premise
)
-- Finally, set the new values
UPDATE UpdateableCTE SET House_No=NewHouse_No
,Street=NewStreet
,Zip=NewZip;
-- Show the result
SELECT * FROM @tbl;
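
The tie caveat noted in the comments can be removed by adding tie-breaker columns to the ORDER BY. The following is only a sketch of that idea, not part of the original answer: it is meant as a drop-in replacement for the WITH ... UPDATE statement above, and it assumes it runs in the same batch as the DECLARE and INSERT, since @tbl is a table variable. Breaking ties by House_No, Street and Zip is an assumed rule chosen for illustration (any deterministic rule would work), and the column name Rnk is made up here.

```sql
-- Sketch: same approach as above, but ties on the count are broken
-- by House_No, Street, Zip (an assumed rule), so repeated runs pick the same row.
WITH Counted AS
(
    SELECT COUNT(*) AS [Counter]
          ,Premise
          ,House_No
          ,Street
          ,Zip
    FROM @tbl
    GROUP BY Premise, House_No, Street, Zip
)
,MostCommon AS
(
    SELECT *
          ,ROW_NUMBER() OVER(PARTITION BY Premise
                             ORDER BY [Counter] DESC, House_No, Street, Zip) AS Rnk
    FROM Counted
)
,UpdateableCTE AS
(
    SELECT tbl.*
          ,mc.House_No AS NewHouse_No
          ,mc.Street   AS NewStreet
          ,mc.Zip      AS NewZip
    FROM @tbl AS tbl
    INNER JOIN MostCommon AS mc ON mc.Rnk = 1 AND mc.Premise = tbl.Premise
)
UPDATE UpdateableCTE SET House_No = NewHouse_No
                        ,Street   = NewStreet
                        ,Zip      = NewZip;

-- Check: lists every premise that still has more than one address variant.
SELECT Premise
FROM @tbl
GROUP BY Premise
HAVING COUNT(DISTINCT Street) > 1 OR COUNT(DISTINCT Zip) > 1;
```

With the sample data above there are no ties, so this picks the same rows as the original statement, and the final check query returns no rows once the update has run.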