Update a column from values

Time: 2018-02-15 21:57:25

Tags: sql greenplum

I want to update column col_123 in table TT from a list of values, for the rows that match certain conditions.

Below is my code with two value rows. My actual code has thousands of value rows.

    UPDATE TT
    SET col_123 = T2.score
    FROM 
        (values ('1007163',2016,3,80.09), ('1034758',2013,4,68.85)) T2(person_id_t2, id_yr_t2, id_qtr_t2, score)
    WHERE person_id = T2.person_id_t2 AND id_yr = T2.id_yr_t2 AND id_qtr = T2.id_qtr_t2;

But even with just these two rows, the update takes forever. What am I doing wrong?

Here is the output of EXPLAIN ANALYZE:

Update (slice0; segments: 56)  (rows=1 width=3903)
  ->  Hash Join  (cost=0.06..750889.50 rows=1 width=3903)
        Hash Cond: TT.person_id::text = "*VALUES*".column1 AND TT.id_yr = "*VALUES*".column2::numeric AND TT.id_qtr = "*VALUES*".column3
        Rows out:  Avg 1.0 rows x 2 workers.  Max 1 rows (seg29) with 236406 ms to first row, 236407 ms to end, start offset by 370 ms.
        Executor memory:  1K bytes avg, 1K bytes max (seg0).
        Work_mem used:  1K bytes avg, 1K bytes max (seg0). Workfile: (0 spilling, 0 reused)
        (seg29)  Hash chain length 1.0 avg, 1 max, using 2 of 262151 buckets.
        ->  Seq Scan on seamless_health_index  (cost=0.00..466843.92 rows=676299 width=3871)
              Rows out:  Avg 676405.3 rows x 56 workers.  Max 678281 rows (seg27) with 0.524 ms to first row, 243299 ms to end, start offset by 369 ms.
        ->  Hash  (cost=0.03..0.03 rows=1 width=72)
              Rows in:  Avg 2.0 rows x 56 workers.  Max 2 rows (seg0) with 0.080 ms to end, start offset by 375 ms.
              ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=1 width=72)
                    Rows out:  Avg 2.0 rows x 56 workers.  Max 2 rows (seg0) with 0.017 ms to first row, 0.020 ms to end, start offset by 375 ms.
Slice statistics:
  (slice0)    Executor memory: 5769K bytes avg x 56 workers, 5769K bytes max (seg0).  Work_mem: 1K bytes max.
Statement statistics:
  Memory used: 128000K bytes
Settings:  from_collapse_limit=16; join_collapse_limit=16
Total runtime: 308388.391 ms

Thanks!

Note: table TT has about 40,000,000 rows and 1,000 columns, but only col_123 in two rows should be updated.

1 answer:

Answer 0: (score: -1)

Create an index on TT (person_id::text, id_yr, id_qtr).

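The planner can then use a nested loop join with an index lookup instead of the sequential scan over all of TT, which should find the matching rows much faster.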
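As a rough sketch of that suggestion: the index name `tt_person_yr_qtr_idx` is made up for illustration, and the cast on person_id is taken from the `TT.person_id::text` comparison shown in the Hash Cond line of the plan, so it may not be needed if the column can be compared without a cast. In PostgreSQL-family syntax an index expression that is not a plain function call has to be wrapped in its own parentheses. Whether the planner actually switches to an index-driven plan depends on its settings and statistics, so it is worth checking with EXPLAIN before running the update.

```sql
-- Hypothetical index name. The extra parentheses around the cast are
-- required because it is an expression, not a plain column reference.
CREATE INDEX tt_person_yr_qtr_idx
    ON TT ((person_id::text), id_yr, id_qtr);

-- Inspect the plan without executing the update; look for an index scan
-- on TT in place of the Seq Scan seen in the original plan.
EXPLAIN
UPDATE TT
SET col_123 = T2.score
FROM (VALUES ('1007163', 2016, 3, 80.09),
             ('1034758', 2013, 4, 68.85))
     AS T2(person_id_t2, id_yr_t2, id_qtr_t2, score)
WHERE person_id = T2.person_id_t2
  AND id_yr = T2.id_yr_t2
  AND id_qtr = T2.id_qtr_t2;
```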

You don't have to include all three columns in the index, only those whose join conditions are selective.
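For example, if person_id alone narrows each lookup to a handful of rows, a single-column index may be enough. The sketch below assumes exactly that; the index name `tt_person_idx` is hypothetical, and the first query is only one simple way to gauge selectivity (it scans the whole table, so it is not cheap on a table this size).

```sql
-- Gauge selectivity: the closer distinct_person_ids is to total_rows,
-- the fewer rows each person_id lookup has to visit.
SELECT count(*)                  AS total_rows,
       count(DISTINCT person_id) AS distinct_person_ids
FROM TT;

-- Assumption: person_id is selective enough on its own.
-- Hypothetical index name; the cast mirrors the comparison in the plan.
CREATE INDEX tt_person_idx
    ON TT ((person_id::text));
```

A narrower index is smaller and cheaper to build and maintain, and the remaining conditions on id_yr and id_qtr are then applied as filters on the few rows the index returns.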