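My question is essentially the same as the one below, but I could not find an answer there. It only said the issue "will be solved in the next version" and that this would be "easy for min/max scans".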
PostgreSQL+table partitioning: inefficient max() and min()
CREATE TABLE mc_handstats
(
id integer NOT NULL DEFAULT nextval('mc_handst_id_seq'::regclass),
playerid integer NOT NULL,
CONSTRAINT mc_handst_pkey PRIMARY KEY (id)
);
The table is partitioned by playerid.
CREATE TABLE mc_handst_0000 ( CHECK ( playerid >= 0 AND playerid < 10000) ) INHERITS (mc_handst) TABLESPACE ssd01;
CREATE TABLE mc_handst_0010 ( CHECK ( playerid >= 10000 AND playerid < 30000) ) INHERITS (mc_handst) TABLESPACE ssd02;
CREATE TABLE mc_handst_0030 ( CHECK ( playerid >= 30000 AND playerid < 50000) ) INHERITS (mc_handst) TABLESPACE ssd03;
...
CREATE INDEX mc_handst_0000_PlayerID ON mc_handst_0000 (playerid);
CREATE INDEX mc_handst_0010_PlayerID ON mc_handst_0010 (playerid);
CREATE INDEX mc_handst_0030_PlayerID ON mc_handst_0030 (playerid);
...
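Plus an insert trigger that routes rows on playerid.
I want to get the last id (I could also read the value of the sequence, but I am used to working with tables/columns), but PostgreSQL seems rather dumb about how it scans the tables: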
EXPLAIN ANALYZE SELECT max(id) FROM mc_handstats; (the real query runs forever)
"Aggregate (cost=9080859.04..9080859.05 rows=1 width=4) (actual time=181867.626..181867.626 rows=1 loops=1)"
" -> Append (cost=0.00..8704322.43 rows=150614644 width=4) (actual time=2.460..163638.343 rows=151134891 loops=1)"
" -> Seq Scan on mc_handstats (cost=0.00..0.00 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=1)"
" -> Seq Scan on mc_handst_0000 mc_handstats (cost=0.00..728523.69 rows=12580969 width=4) (actual time=2.457..10800.539 rows=12656647 loops=1)"
...
ALL TABLES
...
"Total runtime: 181867.819 ms"
EXPLAIN ANALYZE SELECT max(id) FROM mc_handst_1000;
"Aggregate (cost=83999.50..83999.51 rows=1 width=4) (actual time=1917.933..1917.933 rows=1 loops=1)"
" -> Seq Scan on mc_handst_1000 (cost=0.00..80507.40 rows=1396840 width=4) (actual time=0.007..1728.268 rows=1396717 loops=1)"
"Total runtime: 1918.494 ms"
The query on a single partition returns in a snap, completely out of proportion to the runtime on the master table. (PostgreSQL 9.2)
\d mc_handstats (indexes only)
Indexes:
"mc_handst_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"mc_handst_playerid_fkey" FOREIGN KEY (playerid) REFERENCES mc_players(id)
Triggers:
mc_handst_insert_trigger BEFORE INSERT ON mc_handstats FOR EACH ROW EXECUTE PROCEDURE mc_handst_insert_function()
Number of child tables: 20 (Use \d+ to list them.)
\d mc_handst_1000
Indexes:
"mc_handst_1000_playerid" btree (playerid)
Check constraints:
"mc_handst_1000_playerid_check" CHECK (playerid >= 1000000 AND playerid < 1100000)
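Hm, there is no PK index on the child tables. I do not understand why max(id) is fairly fast on a child table (which has no index on id) yet slow on the master table, but it seems I need to add an index on the PK column to every child table. Maybe that solves it.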
CREATE INDEX mc_handst_0010_ID ON mc_handst_0010 (id);
... plus many more ...
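Everything is fine now. It is still strange that the query was fast on the child tables before, which made me think they were already indexed, but I don't really mind.
Thank you!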
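The per-partition indexes can also be created in one pass rather than one statement at a time. This is a minimal sketch, assuming the parent table is named mc_handstats as in the \d output above and that every partition is a direct child recorded in pg_inherits. The index names are left for PostgreSQL to generate.

```sql
-- Assumption: the parent is mc_handstats and all partitions inherit from it directly.
DO $$
DECLARE
    child regclass;
BEGIN
    FOR child IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'mc_handstats'::regclass
    LOOP
        -- regclass output is already quoted and schema-qualified where needed.
        EXECUTE format('CREATE INDEX ON %s (id)', child);
    END LOOP;
END
$$;

-- Check whether the planner now avoids the sequential scans.
EXPLAIN SELECT max(id) FROM mc_handstats;
```

If the planner picks up the new indexes, the plan should show an index scan per partition in place of the Seq Scan nodes seen earlier.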
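Answer 0 (score: 0)
The first thing you need to do is index every child table on (id) and see whether max(id) is smart enough to do an index scan on each table. I think it should be, but I am not entirely sure.
If not, here is what I would do: I would start from currval([sequence_name]) and work backwards until a record is found. You could check blocks of 10 ids at a time, which is essentially a sparse scan. This can be done with a CTE like this (again relying on the indexes):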
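WITH RECURSIVE ids AS (
    -- Newest block: the 10 ids up to the sequence's current value.
    SELECT (SELECT max(id) FROM mc_handst
             WHERE id BETWEEN currval('mc_handst_id_seq') - 10
                          AND currval('mc_handst_id_seq')) AS max_id,
           currval('mc_handst_id_seq') - 10 AS min_block
  UNION ALL
    -- Step back one block while nothing has been found yet. The aggregate
    -- sits in a scalar subquery because PostgreSQL rejects aggregates
    -- placed directly in the recursive term.
    SELECT (SELECT max(id) FROM mc_handst
             WHERE id BETWEEN i.min_block - 10 AND i.min_block),
           i.min_block - 10
    FROM ids i
    WHERE i.max_id IS NULL
)
SELECT max(max_id) FROM ids;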
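If the planner will not use the indexes even after the partitions are indexed, this performs a sparse scan instead. In most cases it should need only a single scan, but it repeats as often as needed to find an id. Note that it may run forever on an empty table.
Answer 1 (score: 0)
Suppose the parent table looks like this: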
CREATE TABLE parent (
id integer NOT NULL DEFAULT nextval('parent_id_seq'::regclass),
... other columns ...
);
Whether you use a rule or a trigger to redirect the INSERT to a child table, immediately after the INSERT you can use:
SELECT currval('parent_id_seq'::regclass);
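to get the last id inserted by your session. This is unaffected by concurrent INSERTs, because each session keeps its own copy of the last sequence value it obtained.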
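As a rough end-to-end illustration of that sequence lookup, here is a minimal sketch. The note column is a hypothetical stand-in for the elided columns, and no partitions or routing trigger are set up, so the row lands in the parent itself.

```sql
CREATE SEQUENCE parent_id_seq;

CREATE TABLE parent (
    id   integer NOT NULL DEFAULT nextval('parent_id_seq'::regclass),
    note text  -- hypothetical payload column, not part of the original schema
);

-- The column default calls nextval() in this session...
INSERT INTO parent (note) VALUES ('first row');

-- ...so currval() now reports the id that this session just generated.
SELECT currval('parent_id_seq'::regclass);
```

Note that currval() raises an error if nextval() has not yet been called for that sequence in the current session, so it only works after the session's own INSERT.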
https://dba.stackexchange.com/questions/58497/return-id-from-partitioned-table-in-postgres