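I want to update the column col_123 in table TT from a set of values, for the rows that satisfy certain conditions.
Below is my code, shown with two value rows. In my real code there are thousands of value rows.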
UPDATE TT
SET col_123 = T2.score
FROM
(values ('1007163',2016,3,80.09), ('1034758',2013,4,68.85)) T2(person_id_t2, id_yr_t2, id_qtr_t2, score)
WHERE person_id = T2.person_id_t2 AND id_yr = T2.id_yr_t2 AND id_qtr = T2.id_qtr_t2;
But even with only these two rows, the update takes forever. What am I doing wrong?
Here is the output of EXPLAIN ANALYZE:
Update (slice0; segments: 56) (rows=1 width=3903)
-> Hash Join (cost=0.06..750889.50 rows=1 width=3903)
Hash Cond: TT.person_id::text = "*VALUES*".column1 AND TT.id_yr = "*VALUES*".column2::numeric AND TT.id_qtr = "*VALUES*".column3
Rows out: Avg 1.0 rows x 2 workers. Max 1 rows (seg29) with 236406 ms to first row, 236407 ms to end, start offset by 370 ms.
Executor memory: 1K bytes avg, 1K bytes max (seg0).
Work_mem used: 1K bytes avg, 1K bytes max (seg0). Workfile: (0 spilling, 0 reused)
(seg29) Hash chain length 1.0 avg, 1 max, using 2 of 262151 buckets.
-> Seq Scan on seamless_health_index (cost=0.00..466843.92 rows=676299 width=3871)
Rows out: Avg 676405.3 rows x 56 workers. Max 678281 rows (seg27) with 0.524 ms to first row, 243299 ms to end, start offset by 369 ms.
-> Hash (cost=0.03..0.03 rows=1 width=72)
Rows in: Avg 2.0 rows x 56 workers. Max 2 rows (seg0) with 0.080 ms to end, start offset by 375 ms.
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=1 width=72)
Rows out: Avg 2.0 rows x 56 workers. Max 2 rows (seg0) with 0.017 ms to first row, 0.020 ms to end, start offset by 375 ms.
Slice statistics:
(slice0) Executor memory: 5769K bytes avg x 56 workers, 5769K bytes max (seg0). Work_mem: 1K bytes max.
Statement statistics:
Memory used: 128000K bytes
Settings: from_collapse_limit=16; join_collapse_limit=16
Total runtime: 308388.391 ms
Thanks!
Note: table TT has about 40,000,000 rows and 1000 columns, but only col_123 in two rows should be updated.
Answer 0 (score: -1)
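Create an index on TT (person_id::text, id_yr, id_qtr).
The planner can then use a nested loop join with an index lookup, which should find the few matching rows much faster than scanning the whole table.
You don't have to include all three columns in the index, only those whose join conditions are selective.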
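As a rough sketch of that suggestion, assuming the database accepts PostgreSQL-style expression indexes: the index name tt_person_yr_qtr_idx is made up for illustration, and the cast is wrapped in its own parentheses because PostgreSQL-family syntax requires that for expression keys. Whether the planner actually picks the index depends on the system and its statistics, so the plain EXPLAIN at the end is only there to check the plan without running the update.

```sql
-- Hypothetical index name; the key columns follow the answer's suggestion.
-- An expression key such as a cast needs its own pair of parentheses.
CREATE INDEX tt_person_yr_qtr_idx
    ON TT ((person_id::text), id_yr, id_qtr);

-- Refresh planner statistics for TT.
ANALYZE TT;

-- Plain EXPLAIN shows the chosen plan without executing the UPDATE.
EXPLAIN
UPDATE TT
SET col_123 = T2.score
FROM (VALUES ('1007163', 2016, 3, 80.09),
             ('1034758', 2013, 4, 68.85))
     AS T2(person_id_t2, id_yr_t2, id_qtr_t2, score)
WHERE person_id = T2.person_id_t2
  AND id_yr     = T2.id_yr_t2
  AND id_qtr    = T2.id_qtr_t2;
```

If the plan still shows a Seq Scan on the large table, the index is not being used for the join and the runtime is unlikely to change.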