Update a postgres table to squash duplicate values in a second table

Time: 2017-01-19 22:17:25

Tags: database postgresql

I have a PostgreSQL schema with two tables:

tableA:                      tableB:

| id | username |            | fk_id | resource |
| 1  | user1    |            | 2     | item1    |
| 2  | user1    |            | 1     | item3    | 
| 3  | user1    |            | 1     | item2    |
| 4  | user2    |            | 4     | item5    |
| 5  | user2    |            | 5     | item8    |
| 6  | user3    |            | 3     | item9    |

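The foreign key fk_id in tableB references id in tableA.

How can I update all foreign key ids in tableB so that they point to the lowest-id entry in tableA for each unique username?

2 answers:

Answer 0: (score: 1)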

update table_b b
set fk_id = d.id
from table_a a
join (
    select distinct on (username) username, id
    from table_a
    order by 1, 2
    ) d using(username)
where a.id = b.fk_id;


The query used inside the update yields actual_id, username, desired_id:

select a.id actual_id, username, d.id desired_id
from table_a a
join (
    select distinct on (username) username, id
    from table_a
    order by 1, 2
    ) d using(username)

 actual_id | username | desired_id 
-----------+----------+------------
         1 | user1    |          1
         2 | user1    |          1
         3 | user1    |          1
         4 | user2    |          4
         5 | user2    |          4
         6 | user3    |          6
(6 rows)    
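
To confirm that the update did what was intended, a check along the following lines may help. It is only a sketch, and it assumes the same table_a and table_b names used in this answer: it lists any row in table_b that still references an id other than the lowest one for its username.

```sql
-- Rows in table_b that still point at a non-minimal id for their username.
-- After the update above, this is expected to return no rows.
select b.fk_id, b.resource, a.username
from table_b b
join table_a a on a.id = b.fk_id
where a.id <> (
    -- lowest id among all table_a rows sharing this username
    select min(a2.id)
    from table_a a2
    where a2.username = a.username
    );
```

Running it before the update as well gives a quick list of the rows that are about to change.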

Answer 1: (score: 1)

Let's define your tables:

CREATE TABLE tableA (id, username) AS
SELECT * FROM
(
  VALUES
  (1, 'user1'),
  (2, 'user1'),
  (3, 'user1'),
  (4, 'user2'),
  (5, 'user2'),
  (6, 'user3')
) AS x ;

CREATE TABLE tableB (fk_id, resource) AS
SELECT * FROM 
(
  VALUES
  (2, 'item1'),
  (1, 'item3'),
  (1, 'item2'),
  (4, 'item5'),
  (5, 'item8'),
  (3, 'item9')
) AS x ;

With that information, you can build (virtual) conversion tables and use them to update your data:

-- Using tableA, make a new table with the 
-- minimum id for every username
WITH username_to_min_id AS
(
SELECT 
  min(id) AS min_id, username
FROM 
  tableA
GROUP BY 
  username
)

-- Convert the previous table to a id -> min_id 
-- conversion table
, id_to_min_id AS
(
SELECT
  id, min_id
FROM
  tableA
  JOIN username_to_min_id USING(username)
)

-- Use this conversion table to update tableB
UPDATE
  tableB
SET
  fk_id = min_id
FROM
  id_to_min_id
WHERE
  -- JOIN condition with table to update
  id_to_min_id.id = tableB.fk_id 
  -- Take out the ones that won't change
  AND (fk_id <> min_id)
RETURNING
  * ;

The result you get is:

+-------+----------+----+--------+
| fk_id | resource | id | min_id |
+-------+----------+----+--------+
|     1 | item1    |  2 |      1 |
|     1 | item9    |  3 |      1 |
|     4 | item8    |  5 |      4 |
+-------+----------+----+--------+

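This shows that three rows were updated: the ones whose fk_id was (2, 3, 5) now have (1, 1, 4). (The id column holds the "old" fk_id value.)

You can check it at http://rextester.com/EQPH47434

You can "squeeze everything" [replace each virtual table name with its definition and apply a couple of SELECT simplifications] and get this equivalent query (possibly less clear, but fully equivalent):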

UPDATE
  tableB
SET
  fk_id = min_id
FROM
  tableA
  JOIN 
  (
    SELECT 
      min(id) AS min_id, username
    FROM 
      tableA
    GROUP BY 
      username
  ) AS username_to_min_id 
  USING (username)
WHERE
  tableA.id = tableB.fk_id 
  AND (fk_id <> min_id)
RETURNING
  * ;
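
If you would rather preview the effect before committing to it, one option is to run the statement inside a transaction and roll it back. The following is a minimal sketch that assumes the tableA and tableB definitions above; the aliases a and m and the output column name old_fk_id are only illustrative.

```sql
BEGIN;

UPDATE tableB
SET fk_id = m.min_id
FROM tableA AS a
  JOIN (
    -- lowest id per username
    SELECT username, min(id) AS min_id
    FROM tableA
    GROUP BY username
  ) AS m USING (username)
WHERE a.id = tableB.fk_id
  AND tableB.fk_id <> m.min_id
RETURNING
  tableB.fk_id,       -- the new value
  tableB.resource,
  a.id AS old_fk_id;  -- the value it replaced

-- Inspect the returned rows, then discard the change.
-- Replace ROLLBACK with COMMIT once the output looks right.
ROLLBACK;
```

Because the UPDATE and the ROLLBACK happen in the same transaction, other sessions never see the intermediate state.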