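I'm trying to figure out how to reduce the running time of this query. I was told to use EXPLAIN ANALYZE, but I don't know how to interpret the results or what changes to make. Any suggestions? Note that I'm using a third-party database (cartoDB), so I don't think I have the option of creating indexes.
Here is the query. The two tables involved have about 40 rows and about 32,000 rows, respectively.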
EXPLAIN ANALYZE SELECT
id, identifier,
CASE
WHEN dist < 8046. THEN 1
WHEN dist < 16093. THEN 2
WHEN dist < 40233. THEN 3
WHEN dist < 80467. THEN 4
WHEN dist < 160934. THEN 5
ELSE 6
END AS grp,
count(*)
FROM (
SELECT s.id, s.identifier, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist
FROM full_data_for_testing_deid_2 c, demo_locations_table s) AS loc_dist
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
Here is the output of EXPLAIN ANALYZE:
{
"fields" : {
"QUERY PLAN" : {
"type" : "string"
}
},
"rows" : [
{
"QUERY PLAN" : "GroupAggregate (cost=373146.40..651612.12 rows=1058805 width=128) (actual time=34120.054..37536.893 rows=197 loops=1)"
},
{
"QUERY PLAN" : " -> Sort (cost=373146.40..373675.81 rows=1058805 width=128) (actual time=34120.000..36504.439 rows=1058805 loops=1)"
},
{
"QUERY PLAN" : " Sort Key: s.id, s.identifier, (CASE WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 8046::double precision) THEN 1 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 16093::double precision) THEN 2 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 40233::double precision) THEN 3 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 80467::double precision) THEN 4 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 160934::double precision) THEN 5 ELSE 6 END)"
},
{
"QUERY PLAN" : " Sort Method: external merge Disk: 35200kB"
},
{
"QUERY PLAN" : " -> Nested Loop (cost=0.00..283194.48 rows=1058805 width=128) (actual time=0.688..13487.097 rows=1058805 loops=1)"
},
{
"QUERY PLAN" : " -> Seq Scan on full_data_for_testing_deid_2 c (cost=0.00..6845.26 rows=32085 width=32) (actual time=0.006..130.054 rows=32085 loops=1)"
},
{
"QUERY PLAN" : " -> Materialize (cost=0.00..1.13 rows=33 width=96) (actual time=0.001..0.028 rows=33 loops=32085)"
},
{
"QUERY PLAN" : " -> Seq Scan on demo_locations_table s (cost=0.00..1.10 rows=33 width=96) (actual time=0.003..0.034 rows=33 loops=1)"
},
{
"QUERY PLAN" : "Total runtime: 37569.205 ms"
}
],
"time" : 37.574,
"total_rows" : 9
}
Answer 0 (score: 0)
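The problem is the Cartesian product:
SELECT s.id, s.identifier, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist FROM full_data_for_testing_deid_2 c, demo_locations_table s
This is the Nested Loop in the plan. I don't think you want a Cartesian product here. You can easily cut out some of the unnecessary iterations with a more specific JOIN ... ON condition. The distance between two points is symmetric (the distance from A to B equals the distance from B to A), so just add a condition such as c.pk > s.pk, depending on your needs (there is no information about the schema design).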
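As a rough sketch of that suggestion, the query could be rewritten with an explicit join condition. In the plan above, the Nested Loop emits 32,085 × 33 = 1,058,805 rows, and all of them are then sorted on disk (external merge, 35200kB), so any predicate that removes pairs before the sort should reduce the work. The column name pk is hypothetical, since the question does not show the schema; substitute whatever key the two tables actually have.

```sql
-- Sketch only. "pk" is a hypothetical key column: the question does not
-- show the schema, so replace it with a real comparable key.
SELECT
    id, identifier,
    CASE
        WHEN dist < 8046.   THEN 1
        WHEN dist < 16093.  THEN 2
        WHEN dist < 40233.  THEN 3
        WHEN dist < 80467.  THEN 4
        WHEN dist < 160934. THEN 5
        ELSE 6
    END AS grp,
    count(*)
FROM (
    SELECT s.id, s.identifier,
           ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist
    FROM full_data_for_testing_deid_2 c
    JOIN demo_locations_table s
      ON c.pk > s.pk   -- assumed predicate: keeps each unordered pair once
) AS loc_dist
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
```

The c.pk > s.pk predicate only removes redundant work if both tables describe the same set of points and share a key space. If the two tables hold different points, it would drop pairs that the counts need, so check it against your data before relying on it.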