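I am writing a function with a for loop that updates several columns (time interval, distance, and speed) computed from the values of the current and previous rows, and then deletes rows whose updated speed exceeds a cutoff. The sample table has about 500,000 records, and the function has been running for hours without finishing. Indexes, work_mem, fillfactor, and VACUUM FULL make no significant difference. Below is the function I came up with.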
create or replace function speed_cal_cutoff()
returns void as
$body$
declare
    t_curs cursor for
        select "id", loggerid, datecon, timecon,
               time_interval, gcs_distance, gcs_geom,
               interval_seconds, calculated_speed
        from mytable;
begin
    for t_row in t_curs
    loop
        update mytable
        set time_interval = (concat(datecon || ' ' || timecon)::timestamp) - prev_datetime
        from (select "id",
                     lag(loggerid) over (partition by loggerid order by datecon, timecon) as prev_loggerid,
                     lag(concat(datecon || ' ' || timecon)::timestamp) over (partition by loggerid order by loggerid, datecon, timecon) as prev_datetime
              from mytable) as subquery
        where mytable."id" = subquery.id
          and mytable.loggerid = subquery.prev_loggerid
          and mytable."id" = t_row.id;

        update mytable
        set gcs_distance = subquery.gcs_distance
        from (select "id",
                     ST_Distance(gcs_geom::geography, lag(gcs_geom::geography) over (partition by loggerid order by loggerid, datecon, timecon asc)) as gcs_distance,
                     lag(loggerid) over (partition by loggerid order by datecon, timecon) as prev_loggerid
              from mytable) as subquery
        where mytable."id" = subquery.id
          and mytable.loggerid = subquery.prev_loggerid
          and mytable."id" = t_row.id;

        update mytable
        set interval_seconds = (extract(EPOCH from time_interval))
        where mytable."id" = t_row.id;

        update mytable
        set calculated_speed = gcs_distance / interval_seconds
        where mytable."id" = t_row.id;

        delete from mytable
        where calculated_speed > 41.6667
          and mytable."id" = t_row.id;
    end loop;
end
$body$
language plpgsql;
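How can I optimize this code for better performance?
Answer 0 (score: 0)
Postgres does not cope well with a large number of repeated updates inside a single transaction, especially when the same rows are updated many times. The reason is how Postgres implements MVCC: every update writes a new row version, and the old versions cannot be cleaned up while the transaction that produced them is still open. What you can do:
a) Try to reduce the number of repeated updates by using arrays. Arrays are purely in-memory structures; on Postgres 10 and later, array updates are cheap.
b) Try to reduce the size of the transaction. If you can split it into a set of smaller transactions, the table heap can be cleaned up effectively, and there is a chance of a significant speedup.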
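Point (a) might look like the following minimal sketch. It assumes a hypothetical table tab(pk, v) with 1000 rows, created here as a temporary table purely for illustration; the array name vals is also an assumption. The intermediate values are computed in a PL/pgSQL array, and each row is written only once at the end, so no dead row versions pile up during the loop.

```sql
-- Hypothetical setup for illustration only: tab(pk, v) with 1000 rows.
create temp table tab (pk int primary key, v int);
insert into tab select g, 0 from generate_series(1, 1000) as g;

do $$
declare
    -- In-memory working copy, one slot per row (assumed name).
    vals int[] := array_fill(0, array[1000]);
begin
    for i in 1 .. 100 loop
        for j in 1 .. 1000 loop
            -- Touches memory only; the table is not modified here.
            vals[j] := j + i;
        end loop;
    end loop;

    -- Write every row exactly once with its final value.
    update tab
    set v = u.v
    from unnest(vals) with ordinality as u(v, pk)
    where tab.pk = u.pk;
end
$$;
```

Whether this pays off depends on how much of the per-row work can really be done in memory before the final write.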
-- bad technique: all updates happen in one long transaction
begin;
do $$
begin
    for i in 1 .. 100 loop
        for j in 1 .. 1000 loop -- every row is updated 100 times before any cleanup can happen
            update tab set v = j + i where pk = j;
        end loop;
    end loop;
end
$$;
commit;
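-- better: commit after each pass so dead row versions can be cleaned up
-- (transaction control inside a DO block requires PostgreSQL 11 or later,
-- and the block must not be run inside an outer transaction)
do $$
begin
    for i in 1 .. 100 loop
        for j in 1 .. 1000 loop -- every row is updated once per transaction
            update tab set v = j + i where pk = j;
        end loop;
        commit;
    end loop;
end
$$;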
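Applied to the function in the question, reducing repeated updates could also mean dropping the row-by-row loop altogether. The loop runs a window query over the whole table for every single row and touches each row up to four times. The sketch below is a set-based alternative, not the array approach from point (a): one UPDATE computes all derived columns and one DELETE applies the cutoff. The function name speed_cal_cutoff_setbased and the nullif guard against zero-length intervals are assumptions added for illustration. The column names, the 41.6667 cutoff, and the use of PostGIS ST_Distance are taken from the question.

```sql
create or replace function speed_cal_cutoff_setbased()  -- hypothetical name
returns void as
$body$
begin
    -- One pass over the table: compute interval and distance to the
    -- previous row of the same logger, then derive seconds and speed.
    with calc as (
        select "id",
               lag("id") over w as prev_id,
               concat(datecon || ' ' || timecon)::timestamp
                 - lag(concat(datecon || ' ' || timecon)::timestamp) over w
                 as time_interval,
               ST_Distance(gcs_geom::geography,
                           lag(gcs_geom::geography) over w) as gcs_distance
        from mytable
        window w as (partition by loggerid order by datecon, timecon)
    )
    update mytable m
    set time_interval    = c.time_interval,
        gcs_distance     = c.gcs_distance,
        interval_seconds = extract(epoch from c.time_interval),
        -- nullif is an added assumption: it yields NULL instead of a
        -- division-by-zero error when two rows share a timestamp.
        calculated_speed = c.gcs_distance
                           / nullif(extract(epoch from c.time_interval), 0)
    from calc c
    where m."id" = c."id"
      and c.prev_id is not null;  -- skip the first row of each logger

    -- One pass to remove rows above the cutoff.
    delete from mytable where calculated_speed > 41.6667;
end
$body$
language plpgsql;
```

One difference in behavior is worth checking against your requirements: the loop version recomputes the previous row after every delete, while this sketch measures every row against its original neighbor. Rows that directly follow a deleted row can therefore end up with different values, and calling the function a second time may be needed if the speeds should be relative to the surviving rows.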