I have a Lenovo laptop with a Core i5 4210U, 4GB of 1600MHz RAM, and a 500GB HDD, running Ubuntu 18.04.
On it I am running a Postgres v10 container in Docker.
I am an amateur at SQL and completely new to PostgreSQL.
My business logic is fairly simple, and below is how I approached it.
I created the following tables:
Users
create table users (
id bigserial primary key,
name text not null,
email text not null,
password text not null,
created_at timestamp with time zone default current_timestamp,
updated_at timestamp with time zone default current_timestamp);
Stoppages
create table stoppages (
id bigserial primary key,
name text not null,
lat double precision not null,
lon double precision not null,
created_at timestamp with time zone default current_timestamp,
updated_at timestamp with time zone default current_timestamp);
Routes
create table routes (
id bigserial primary key,
name text not null,
user_id bigint references users(id),
seats smallint not null,
start_at timestamp with time zone not null,
created_at timestamp with time zone default current_timestamp,
updated_at timestamp with time zone default current_timestamp);
Route-stoppage map
create table route_stoppage_map (
route_id bigint references routes(id),
stoppage_id bigint references stoppages(id),
sl_no smallint not null);
FYI: the sl_no field is the serial index of a stoppage within that route.
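For illustration only (made-up data), a route that visits three stoppages would be stored like this, with sl_no giving the visiting order:

```sql
-- Hypothetical rows: route 1 visits stoppages 4, 7 and 2, in that order.
insert into route_stoppage_map (route_id, stoppage_id, sl_no) values
  (1, 4, 1),
  (1, 7, 2),
  (1, 2, 3);
```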
I have installed the cube and earthdistance extensions in this database using create extension earthdistance cascade;.
I have also written a utility function in PL/pgSQL:
create function stp_within(double precision, double precision, double precision)
returns table (id bigint, name text, lat double precision,
lon double precision, created_at timestamp with time zone,
updated_at timestamp with time zone)
as $$
begin
return query select * from stoppages where
earth_distance(ll_to_earth(stoppages.lat, stoppages.lon), ll_to_earth($1, $2)) <= $3;
end;
$$ language plpgsql;
This function returns the stoppages within a given radius (in meters) of a given geographic location.
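For example, to list the stoppages within 2 km of the starting location:

```sql
-- All stoppages within 2000 m of the starting point.
select id, name from stp_within(22.449227, 88.302977, 2000);
```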
I want to query for the routes that go from location 22.449227, 88.302977 to 22.599199, 88.423370, using a default radius of 2000 meters.
The query I managed to write is as follows:
with start_location as (select * from stp_within(22.449227, 88.302977, 2000)),
end_location as (select * from stp_within(22.599199, 88.423370, 2000)),
starting_routes as (select route_id, sl_no from route_stoppage_map where stoppage_id in (select id from start_location)),
ending_routes as (select route_id, sl_no from route_stoppage_map where stoppage_id in (select id from end_location)),
matches as (select distinct starting_routes.route_id from starting_routes inner join ending_routes on
starting_routes.route_id = ending_routes.route_id and starting_routes.sl_no < ending_routes.sl_no),
selected_routes as (select name, user_id from routes where id in (select route_id from matches))
select selected_routes.name as route_name, users.name as user_name from users inner join selected_routes on users.id = selected_routes.user_id;
This gets me minimally acceptable results, but it does not feel complete, and I cannot seem to work out how to improve it. When I ran explain analyze, I saw a planning time of 0.931 ms and an execution time of 20.728 ms with only 2 users, 5 routes, and 7 stoppages, each route having only 3-5 stoppages. Please forgive me if I have left out any information, and please help me with the problem described above.
Edit: output of explain (analyze, buffers):
Result {
command: 'EXPLAIN',
rowCount: null,
oid: null,
rows:
[ { 'QUERY PLAN': 'Hash Join (cost=410.84..415.36 rows=200 width=64) (actual time=2.494..2.499 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Hash Cond: (selected_routes.user_id = users.id)' },
{ 'QUERY PLAN': ' Buffers: shared hit=487' },
{ 'QUERY PLAN': ' CTE start_location' },
{ 'QUERY PLAN': ' -> Function Scan on stp_within (cost=0.25..10.25 rows=1000 width=72) (actual time=1.812..1.813 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=482' },
{ 'QUERY PLAN': ' CTE end_location' },
{ 'QUERY PLAN': ' -> Function Scan on stp_within stp_within_1 (cost=0.25..10.25 rows=1000 width=72) (actual time=0.567..0.568 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' CTE starting_routes' },
{ 'QUERY PLAN': ' -> Hash Join (cost=27.00..69.19 rows=885 width=10) (actual time=1.835..1.842 rows=9 loops=1)' },
{ 'QUERY PLAN': ' Hash Cond: (route_stoppage_map.stoppage_id = start_location.id)' },
{ 'QUERY PLAN': ' Buffers: shared hit=483' },
{ 'QUERY PLAN': ' -> Seq Scan on route_stoppage_map (cost=0.00..27.70 rows=1770 width=18) (actual time=0.002..0.004 rows=23 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> Hash (cost=24.50..24.50 rows=200 width=8) (actual time=1.825..1.825 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buckets: 1024 Batches: 1 Memory Usage: 9kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=482' },
{ 'QUERY PLAN': ' -> HashAggregate (cost=22.50..24.50 rows=200 width=8) (actual time=1.822..1.823 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Group Key: start_location.id' },
{ 'QUERY PLAN': ' Buffers: shared hit=482' },
{ 'QUERY PLAN': ' -> CTE Scan on start_location (cost=0.00..20.00 rows=1000 width=8) (actual time=1.813..1.816 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=482' },
{ 'QUERY PLAN': ' CTE ending_routes' },
{ 'QUERY PLAN': ' -> Hash Join (cost=27.00..69.19 rows=885 width=10) (actual time=0.585..0.590 rows=7 loops=1)' },
{ 'QUERY PLAN': ' Hash Cond: (route_stoppage_map_1.stoppage_id = end_location.id)' },
{ 'QUERY PLAN': ' Buffers: shared hit=2' },
{ 'QUERY PLAN': ' -> Seq Scan on route_stoppage_map route_stoppage_map_1 (cost=0.00..27.70 rows=1770 width=18) (actual time=0.003..0.005 rows=23 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> Hash (cost=24.50..24.50 rows=200 width=8) (actual time=0.577..0.577 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Buckets: 1024 Batches: 1 Memory Usage: 9kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> HashAggregate (cost=22.50..24.50 rows=200 width=8) (actual time=0.574..0.575 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Group Key: end_location.id' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> CTE Scan on end_location (cost=0.00..20.00 rows=1000 width=8) (actual time=0.568..0.569 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' CTE matches' },
{ 'QUERY PLAN': ' -> Unique (cost=122.04..198.25 rows=200 width=8) (actual time=2.451..2.458 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=485' },
{ 'QUERY PLAN': ' -> Merge Join (cost=122.04..194.99 rows=1305 width=8) (actual time=2.450..2.456 rows=5 loops=1)' },
{ 'QUERY PLAN': ' Merge Cond: (starting_routes.route_id = ending_routes.route_id)' },
{ 'QUERY PLAN': ' Join Filter: (starting_routes.sl_no < ending_routes.sl_no)' },
{ 'QUERY PLAN': ' Rows Removed by Join Filter: 7' },
{ 'QUERY PLAN': ' Buffers: shared hit=485' },
{ 'QUERY PLAN': ' -> Sort (cost=61.02..63.23 rows=885 width=10) (actual time=1.852..1.852 rows=9 loops=1)' },
{ 'QUERY PLAN': ' Sort Key: starting_routes.route_id' },
{ 'QUERY PLAN': ' Sort Method: quicksort Memory: 25kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=483' },
{ 'QUERY PLAN': ' -> CTE Scan on starting_routes (cost=0.00..17.70 rows=885 width=10) (actual time=1.836..1.844 rows=9 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=483' },
{ 'QUERY PLAN': ' -> Sort (cost=61.02..63.23 rows=885 width=10) (actual time=0.596..0.597 rows=10 loops=1)' },
{ 'QUERY PLAN': ' Sort Key: ending_routes.route_id' },
{ 'QUERY PLAN': ' Sort Method: quicksort Memory: 25kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=2' },
{ 'QUERY PLAN': ' -> CTE Scan on ending_routes (cost=0.00..17.70 rows=885 width=10) (actual time=0.586..0.592 rows=7 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=2' },
{ 'QUERY PLAN': ' CTE selected_routes' },
{ 'QUERY PLAN': ' -> Hash Join (cost=9.00..31.32 rows=200 width=40) (actual time=2.483..2.485 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Hash Cond: (routes.id = matches.route_id)' },
{ 'QUERY PLAN': ' Buffers: shared hit=486' },
{ 'QUERY PLAN': ' -> Seq Scan on routes (cost=0.00..18.00 rows=800 width=48) (actual time=0.004..0.004 rows=5 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> Hash (cost=6.50..6.50 rows=200 width=8) (actual time=2.466..2.466 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buckets: 1024 Batches: 1 Memory Usage: 9kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=485' },
{ 'QUERY PLAN': ' -> HashAggregate (cost=4.50..6.50 rows=200 width=8) (actual time=2.464..2.465 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Group Key: matches.route_id' },
{ 'QUERY PLAN': ' Buffers: shared hit=485' },
{ 'QUERY PLAN': ' -> CTE Scan on matches (cost=0.00..4.00 rows=200 width=8) (actual time=2.453..2.461 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=485' },
{ 'QUERY PLAN': ' -> CTE Scan on selected_routes (cost=0.00..4.00 rows=200 width=40) (actual time=2.484..2.488 rows=3 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=486' },
{ 'QUERY PLAN': ' -> Hash (cost=15.50..15.50 rows=550 width=40) (actual time=0.005..0.005 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Buckets: 1024 Batches: 1 Memory Usage: 9kB' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': ' -> Seq Scan on users (cost=0.00..15.50 rows=550 width=40) (actual time=0.004..0.004 rows=2 loops=1)' },
{ 'QUERY PLAN': ' Buffers: shared hit=1' },
{ 'QUERY PLAN': 'Planning time: 0.642 ms' },
{ 'QUERY PLAN': 'Execution time: 3.471 ms' } ],
fields:
[ Field {
name: 'QUERY PLAN',
tableID: 0,
columnID: 0,
dataTypeID: 25,
dataTypeSize: -1,
dataTypeModifier: -1,
format: 'text' } ],
_parsers: [ [Function: noParse] ],
RowCtor: null,
rowAsArray: false,
_getTypeParser: [Function: bound ] }
Answer (score: 1)
You should inline the common table expressions. You can also inline the function.
Those DISTINCTs will hurt you once the data grows, too. I would also avoid SELECT * and instead itemize the columns you need, since that makes it easier to write indexes later.
This should be closer to what you need:
select
selected_routes.name as route_name,
users.name as user_name
from users
join (
select
name,
user_id
from routes
where id in (
select starting_routes.route_id
from (
select
route_id,
sl_no
from route_stoppage_map
where stoppage_id in (
select id
from stoppages
where earth_distance(
ll_to_earth(stoppages.lat, stoppages.lon),
ll_to_earth(22.449227, 88.302977)) <= 2000
)
) starting_routes
join (
select route_id, sl_no
from route_stoppage_map
where stoppage_id in (
select id
from stoppages
where earth_distance(
ll_to_earth(stoppages.lat, stoppages.lon),
ll_to_earth(22.599199, 88.423370)) <= 2000
)
) ending_routes
on starting_routes.route_id = ending_routes.route_id
and starting_routes.sl_no < ending_routes.sl_no
)
) selected_routes on users.id = selected_routes.user_id
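If the earth_distance filter itself becomes the bottleneck, the earthdistance module also provides earth_box, whose containment check can use a GiST index over ll_to_earth; the exact earth_distance test then only re-checks the few candidates the index returns. A sketch (untested against your schema):

```sql
-- GiST index over the computed earth position of each stoppage.
create index stoppages_earth_idx on stoppages using gist (ll_to_earth(lat, lon));

-- earth_box(point, radius) is an indexable bounding-box pre-filter;
-- earth_distance re-checks the exact great-circle distance.
select id, name
from stoppages
where earth_box(ll_to_earth(22.449227, 88.302977), 2000) @> ll_to_earth(lat, lon)
  and earth_distance(ll_to_earth(lat, lon), ll_to_earth(22.449227, 88.302977)) <= 2000;
```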
If you want to test the query's performance, you will also get more accurate results by increasing the data volume. If you grow the data set until the query takes 10-60 seconds, your attempts at tuning the query will be much more productive, because any one-off operations (time spent retrieving/rendering results, opening/closing connections, and so on) become rounding errors.
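If you would rather keep the function, rewriting it in language sql instead of plpgsql should let the planner inline its body into the calling query, provided it remains a single stable SELECT. A sketch; note that the old function must be dropped first, because the return type changes:

```sql
drop function if exists stp_within(double precision, double precision, double precision);

-- A plain SQL function returning whole stoppages rows; simple, stable
-- SQL functions like this are candidates for inlining by the planner.
create function stp_within(double precision, double precision, double precision)
returns setof stoppages
as $$
  select *
  from stoppages
  where earth_distance(ll_to_earth(stoppages.lat, stoppages.lon),
                       ll_to_earth($1, $2)) <= $3;
$$ language sql stable;
```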