I'm trying to optimize a query on Postgres. The table dgt_ip_minute_stats is updated regularly by a batch job. Table structure:
\d dgt_ip_minute_stats;
Materialized view "public.dgt_ip_minute_stats"
Column | Type | Modifiers
----------------+-----------------------------+-----------
techid | integer |
timestamp | timestamp without time zone |
t_interval | integer |
network_id | integer |
unique_id | bigint |
modem_sn | integer |
rx_tcp_kbyte | double precision |
tx_tcp_kbyte | double precision |
rx_udp_kbyte | double precision |
tx_udp_kbyte | double precision |
rx_icmp_kbyte | double precision |
tx_icmp_kbyte | double precision |
rx_igmp_kbyte | double precision |
tx_igmp_kbyte | double precision |
rx_http_kbyte | double precision |
tx_http_kbyte | double precision |
rx_other_kbyte | double precision |
tx_other_kbyte | double precision |
Indexes:
"idx_dgt_min_stat_time_ntw_uq" UNIQUE, btree ("timestamp", network_id, unique_id)
"dgt_ip_minute_stats_network_id_idx" btree (network_id)
"dgt_ip_minute_stats_timestamp_idx" btree ("timestamp")
"dgt_ip_minute_stats_unique_id_idx" btree (unique_id)
"dgt_ip_minute_stats_unique_id_network_id_idx" btree (unique_id, network_id)
The table is large: 4,975,883 rows. Here is the query I'm trying to optimize:
EXPLAIN(ANALYZE)
SELECT (EXTRACT(epoch from date_trunc('day',dgt_ip_minute_stats.timestamp)
) * 1000)::BIGINT as dgt_ip_minute_stats_timestamp,
SUM((tx_udp_kbyte)/(8 * 1024))::FLOAT AS up_udp_kbyte,
SUM((rx_udp_kbyte)/(8 * 1024))::FLOAT AS down_udp_kbyte,
SUM((tx_http_kbyte)/(8 * 1024))::FLOAT AS up_http_kbyte,
SUM((rx_http_kbyte)/(8 * 1024))::FLOAT AS down_http_kbyte,
SUM((tx_tcp_kbyte)/(8 * 1024))::FLOAT AS up_tcp_kbyte,
SUM((rx_tcp_kbyte)/(8 * 1024))::FLOAT AS down_tcp_kbyte
FROM dgt_remote_terminal
INNER JOIN dgt_ip_minute_stats
ON dgt_ip_minute_stats.unique_id = dgt_remote_terminal.netmodemid
AND dgt_ip_minute_stats.network_id = dgt_remote_terminal.networkid
WHERE (terminalid = 117529178)
GROUP BY dgt_ip_minute_stats_timestamp
ORDER BY dgt_ip_minute_stats_timestamp ASC;
And the EXPLAIN output:
Sort (cost=51353.37..51417.41 rows=25615 width=56) (actual time=28875.682..28875.685 rows=31 loops=1)
Sort Key: (((date_part('epoch'::text, date_trunc('day'::text, dgt_ip_minute_stats."timestamp")) * 1000::double precision))::bigint)
Sort Method: quicksort Memory: 29kB
-> HashAggregate (cost=48965.45..49477.75 rows=25615 width=56) (actual time=28875.572..28875.648 rows=31 loops=1)
-> Nested Loop (cost=198.90..47026.44 rows=59662 width=56) (actual time=160.024..28586.375 rows=86383 loops=1)
-> Seq Scan on dgt_remote_terminal (cost=0.00..6.74 rows=2 width=6) (actual time=8.306..101.832 rows=2 loops=1)
Filter: (terminalid = 117529178)
Rows Removed by Filter: 137
-> Bitmap Heap Scan on dgt_ip_minute_stats (cost=198.90..23134.99 rows=7655 width=68) (actual time=131.149..14014.867 rows=43192 loops=2)
Recheck Cond: ((unique_id = dgt_remote_terminal.netmodemid) AND (network_id = dgt_remote_terminal.networkid))
-> Bitmap Index Scan on dgt_ip_minute_stats_unique_id_network_id_idx (cost=0.00..196.98 rows=7655 width=0) (actual time=56.721..56.721 rows=43192 loops=2)
Index Cond: ((unique_id = dgt_remote_terminal.netmodemid) AND (network_id = dgt_remote_terminal.networkid))
Total runtime: 28878.998 ms
(13 rows)
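For reference, one thing I could check (I haven't run this variant yet) is EXPLAIN's BUFFERS option, which breaks the time down into shared-buffer hits versus blocks read from disk or the OS cache, and should show whether the slow first run really is cold-cache I/O:

```sql
-- Sketch: same query as above, with buffer statistics added.
-- "shared hit"  = blocks already in shared_buffers,
-- "shared read" = blocks fetched from disk / OS page cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the full SELECT from above goes here
```

If the slow runs show a large "shared read" count on the Bitmap Heap Scan node and the fast runs show mostly "shared hit", that would confirm the caching theory.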
What puzzles me is that the index scan is slow! At least on the first run; after that the query performs much better, probably because the index pages are cached. But after a few more runs it becomes slow again. I can't find any other way to optimize it. What causes the slow index scan on the first run, and why does performance degrade again after several runs? Can updates to the table invalidate the cached index pages? autovacuum is ON on my database, so I assume there shouldn't be too many dead tuples.
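One idea I'm considering (untested): most of the time in the plan is in the Bitmap Heap Scan, i.e. in random heap fetches for the ~43k matching rows, so physically reordering the table on the join key might help by making those rows adjacent on disk:

```sql
-- Sketch: reorder the heap along the (unique_id, network_id) index,
-- so the bitmap heap scan reads mostly sequential pages.
-- Caveats: CLUSTER takes an ACCESS EXCLUSIVE lock for the duration,
-- and the ordering is one-time only. Since this is a materialized
-- view, a REFRESH MATERIALIZED VIEW rewrites it and would discard
-- the ordering, so this would need to be repeated after refreshes.
CLUSTER dgt_ip_minute_stats
    USING dgt_ip_minute_stats_unique_id_network_id_idx;
ANALYZE dgt_ip_minute_stats;
```

I'm not sure whether the batch job's update pattern makes this practical, so I'd welcome opinions on it as well.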