How do I link duplicate records in PostgreSQL?
I have found them with:
SELECT * FROM (
    SELECT id, import_id, name,
           ROW_NUMBER() OVER (PARTITION BY address ORDER BY name ASC) AS Row
    FROM companies
) dups
WHERE dups.Row > 1
ORDER BY dups.name;
See the sample code and demo at http://sqlfiddle.com/#!15/af016/7/1
I want to add a column named linked_id to the companies table, set to the import_id of the first record in each group of duplicates.
Answer 0 (score: 1)
Try:
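UPDATE companies c
SET import_id = q.import_id  -- overwrites import_id on the duplicate rows; target linked_id instead once that column exists
FROM (
    SELECT id,
           -- ORDER BY name is constant within a (name, address) partition,
           -- so which row counts as "first" is arbitrary
           FIRST_VALUE(import_id)
               OVER (PARTITION BY name, address ORDER BY name ASC) AS import_id,
           ROW_NUMBER()
               OVER (PARTITION BY name, address ORDER BY name ASC) AS rn
    FROM companies
) q
WHERE c.id = q.id AND q.rn > 1  -- leave the first row of each group untouched
;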
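A minimal sketch of the same update writing to linked_id, as the question asks, rather than overwriting import_id. The integer type for linked_id and the use of id as a tiebreaker are assumptions; the real schema is only visible in the fiddle, so adjust both to match it.

```sql
-- Assumption: import_id is an integer; change the type to match the real column.
ALTER TABLE companies ADD COLUMN linked_id integer;

UPDATE companies c
SET linked_id = q.first_import_id
FROM (
    SELECT id,
           FIRST_VALUE(import_id) OVER w AS first_import_id,
           ROW_NUMBER()           OVER w AS rn
    FROM companies
    -- Assumption: id is unique, so ordering by it makes the "first" row deterministic.
    WINDOW w AS (PARTITION BY name, address ORDER BY id)
) q
WHERE c.id = q.id
  AND q.rn > 1;  -- only the later duplicates get a link; the first row keeps linked_id NULL
```

Naming the window once with WINDOW keeps FIRST_VALUE and ROW_NUMBER on the same ordering, so the row numbered 1 is the same row whose import_id is propagated.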
Answer 1 (score: 1)
This sets parent_id to the import_id of the first company in each group of matching rows.
UPDATE companies
SET parent_id = rs.parent_id
FROM (
    SELECT id,
           first_value(import_id) OVER (PARTITION BY address ORDER BY name) AS parent_id
    FROM companies
) AS rs
WHERE rs.id = companies.id;
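To see what this update does on a small case, here is a self-contained sketch. The table companies_demo, its column types, and the three rows are hypothetical, made up for illustration only; the extra id in the ORDER BY is an added tiebreaker, not part of the answer above.

```sql
-- Hypothetical demo table and data; not the schema from the fiddle.
CREATE TEMP TABLE companies_demo (
    id        integer PRIMARY KEY,
    import_id integer,
    name      text,
    address   text,
    parent_id integer
);

INSERT INTO companies_demo (id, import_id, name, address) VALUES
    (1, 101, 'Acme',      '1 Main St'),
    (2, 102, 'Acme Inc.', '1 Main St'),
    (3, 103, 'Zeta',      '9 Side Rd');

UPDATE companies_demo
SET parent_id = rs.parent_id
FROM (
    SELECT id,
           first_value(import_id) OVER (PARTITION BY address ORDER BY name, id) AS parent_id
    FROM companies_demo
) AS rs
WHERE rs.id = companies_demo.id;

-- Inspect the result.
SELECT id, import_id, name, address, parent_id
FROM companies_demo
ORDER BY id;
```

Because there is no row-number filter, every row is updated: the two rows sharing '1 Main St' both end up pointing at the first of them, and the lone 'Zeta' row points at its own import_id. If unique companies should stay NULL, a ROW_NUMBER() filter like the one in the other answer would be needed.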