Question

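I have an OpenVAS instance (backed by PostgreSQL) that is slow when opening the "Tasks" tab.

The following query takes 22 seconds to run in PostgreSQL. Any suggestions for optimizing it?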

SELECT id, host,
       iso_time (start_time), iso_time (end_time),
       current_port, max_port, report,
       (SELECT uuid FROM reports WHERE id = report),
       (SELECT uuid FROM hosts
        WHERE id = (SELECT host FROM host_identifiers
                    WHERE source_type = 'Report Host'
                      AND name = 'ip'
                      AND source_id = (SELECT uuid FROM reports
                                       WHERE id = report)
                      AND value = report_hosts.host
                    LIMIT 1)
       )
FROM report_hosts
WHERE report = 702;

The execution plan is:

 Index Scan using report_hosts_by_report on report_hosts  (cost=0.42..1975570.99 rows=447 width=38) (actual time=50.042..22979.257 rows=1206 loops=1)
   Index Cond: (report = 702)
   SubPlan 1
     ->  Index Scan using reports_pkey on reports  (cost=0.28..2.49 rows=1 width=37) (actual time=0.004..0.004 rows=1 loops=1206)
           Index Cond: (id = report_hosts.report)
   SubPlan 4
     ->  Index Scan using hosts_pkey on hosts  (cost=4414.37..4416.59 rows=1 width=37) (actual time=0.001..0.001 rows=0 loops=1206)
           Index Cond: (id = $4)
           InitPlan 3 (returns $4)
             ->  Limit  (cost=2.49..4414.09 rows=1 width=4) (actual time=18.998..18.998 rows=0 loops=1206)
                   InitPlan 2 (returns $2)
                     ->  Index Scan using reports_pkey on reports reports_1  (cost=0.28..2.49 rows=1 width=37) (actual time=0.001..0.001 rows=1 loops=1206)
                           Index Cond: (id = report_hosts.report)
                   ->  Seq Scan on host_identifiers  (cost=0.00..4411.60 rows=1 width=4) (actual time=18.997..18.997 rows=0 loops=1206)
                         Filter: ((source_type = 'Report Host'::text) AND (name = 'ip'::text) AND (source_id = $2) AND (value = report_hosts.host))
                         Rows Removed by Filter: 99459
 Planning time: 0.531 ms
 Execution time: 22979.575 ms

Answer 1

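Practically all of the time is spent in the 1206 sequential scans of host_identifiers: the subquery runs once for every row returned from report_hosts, and each run reads the whole table.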
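As a rough check, the per-loop time and loop count of the Seq Scan node can be multiplied out. The figures below are copied from the plan in the question; the column aliases are only illustrative.

```sql
-- Figures taken from the "Seq Scan on host_identifiers" node:
--   actual time per loop = 18.997 ms, loops = 1206
-- Total execution time reported by the plan = 22979.575 ms
SELECT 1206 * 18.997                                  AS seq_scan_total_ms,
       round(100.0 * 1206 * 18.997 / 22979.575, 1)    AS percent_of_runtime;
```

That comes to roughly 22.9 s, or about 99.7% of the total execution time, so the other nodes in the plan are not worth tuning.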

Try replacing the subqueries with joins:

SELECT rh.id, rh.host,
       iso_time(rh.start_time), iso_time(rh.end_time),
       rh.current_port, rh.max_port, rh.report,
       r.uuid,
       h.uuid
FROM report_hosts AS rh
   LEFT JOIN reports AS r
      ON rh.report = r.id
   LEFT JOIN host_identifiers AS hi
      ON hi.source_id = r.uuid
         AND hi.value = rh.host
         AND hi.source_type = 'Report Host'
         AND hi.name = 'ip'
   LEFT JOIN hosts AS h
      ON h.id = hi.host
WHERE rh.report = 702;

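This is not exactly equivalent, because it does not account for the LIMIT 1 (which does not make much sense without an ORDER BY anyway): if several host_identifiers rows match, the join returns one output row per match. Still, it should be close to what you want.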
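If the one-row-per-report-host behaviour of the original LIMIT 1 matters, a lateral join is one way to keep it. This is only a sketch: it assumes a PostgreSQL version with LATERAL support (9.3 or later), the alias first_hi is made up for illustration, and the ORDER BY hi.host is my own addition to make the chosen row deterministic, not something the original query did.

```sql
SELECT rh.id, rh.host,
       iso_time(rh.start_time), iso_time(rh.end_time),
       rh.current_port, rh.max_port, rh.report,
       r.uuid,
       h.uuid
FROM report_hosts AS rh
   LEFT JOIN reports AS r
      ON r.id = rh.report
   -- At most one matching identifier per report host, like the original LIMIT 1
   LEFT JOIN LATERAL (
      SELECT hi.host
      FROM host_identifiers AS hi
      WHERE hi.source_type = 'Report Host'
        AND hi.name = 'ip'
        AND hi.source_id = r.uuid
        AND hi.value = rh.host
      ORDER BY hi.host   -- assumption: picking the lowest host id is acceptable
      LIMIT 1
   ) AS first_hi ON true
   LEFT JOIN hosts AS h
      ON h.id = first_hi.host
WHERE rh.report = 702;
```

The lateral subquery is still evaluated once per report_hosts row, so it depends on an index on host_identifiers just as much as the plain join does.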

The proper indexes will make it fast (create them if they do not already exist):

one on reports(id)
one on host_identifiers(source_id, value)
one on hosts(id)
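Judging from the plan in the question, reports(id) and hosts(id) already appear to be covered by reports_pkey and hosts_pkey, so the index on host_identifiers is probably the only one missing. A quick way to see what is already there, as a sketch using the standard pg_indexes view:

```sql
-- List the existing indexes on the three tables involved in the query
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('reports', 'host_identifiers', 'hosts')
ORDER BY tablename, indexname;
```

If no index on host_identifiers starts with source_id (or value), that would be consistent with the sequential scan shown in the plan.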

Your query is hard to read because you did not qualify the columns with table names.

Answer 2

Wow! Adding an index on host_identifiers(source_id, value) was exactly what I needed:

CREATE INDEX host_identifiers_source_id_value ON host_identifiers (source_id, value);

The page load time of the "Tasks" tab dropped from 70 s to 13 s.
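For anyone who wants to confirm that the index is actually being picked up, something along these lines might help. It is only a sketch: the IP address is a placeholder (substitute a host that really occurs in report 702), and I am assuming the value column holds the address as text.

```sql
-- Refresh planner statistics for the table after creating the index
ANALYZE host_identifiers;

-- Probe the inner lookup for a single host of report 702
EXPLAIN (ANALYZE, BUFFERS)
SELECT hi.host
FROM host_identifiers AS hi
WHERE hi.source_type = 'Report Host'
  AND hi.name = 'ip'
  AND hi.source_id = (SELECT uuid FROM reports WHERE id = 702)
  AND hi.value = '192.0.2.10'   -- placeholder address, not taken from the real data
LIMIT 1;
```

If the plan still shows a Seq Scan on host_identifiers instead of a scan using host_identifiers_source_id_value, the index is not being used for this lookup.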

Thanks!
