Question

I have a PostgreSQL table that stores spreadsheet-like data.

id SERIAL,
item_id INTEGER ,
date BIGINT,
column_id INTEGER,
row_id INTEGER,
value TEXT,
some_flags INTEGER,

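The problem is that we get more than 5,000 entries per day, and the information has to be kept for years. So I end up with a huge table in which the newest 1,000-5,000 rows are busy, with lots of SELECT, UPDATE and DELETE queries, while the old content is rarely used (only in statistics) and almost never changes.

The question is how to improve performance for the daily work (the newest 5,000 entries out of 50 million). Almost every column has a simple index, but nothing fancy. Splitting the table is not possible right now, so I am looking for index optimizations.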

Answer 1

The suggestions in the comments from dezso and Jack are good. If you want the simplest option, this is how you would implement a partial index:

create table t ("date" bigint, archive boolean default false);

insert into t ("date")
select generate_series(
    extract(epoch from current_timestamp - interval '5 year')::bigint,
    extract(epoch from current_timestamp)::bigint,
    5)
;

create index the_date_partial_index on t ("date")
where not archive
;
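The planner only considers a partial index when the query's WHERE clause implies the index predicate. A minimal sketch of the difference, assuming the table t created above (the one-day cutoff is an arbitrary value chosen for illustration):

```sql
-- Can use the_date_partial_index: the WHERE clause implies "not archive".
explain
select *
from t
where "date" > extract(epoch from current_timestamp - interval '1 day')::bigint
  and not archive
;

-- Cannot use the_date_partial_index: nothing here rules out archived rows.
explain
select *
from t
where "date" > extract(epoch from current_timestamp - interval '1 day')::bigint
;
```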

To avoid having to change every query to add the index condition, rename the table:

alter table t rename to t_table;

Then create a view with the old name that includes the index condition:

create view t as
select *
from t_table
where not archive
;

explain
select *
from t
;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Index Scan using the_date_partial_index on t_table  (cost=0.00..385514.41 rows=86559 width=9)
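Writes can usually keep using the old name as well. This is a sketch under the assumption that the server runs PostgreSQL 9.3 or later, where a simple single-table view like this one is automatically updatable; on older versions you would need rules or INSTEAD OF triggers on the view.

```sql
-- The new row lands in t_table; archive is omitted, so it takes the
-- base table's column default (false) and the row is visible through t.
insert into t ("date")
values (extract(epoch from current_timestamp)::bigint)
;

-- Only rows visible through the view (the non-archived ones) can be affected.
delete from t
where "date" < extract(epoch from current_timestamp - interval '10 year')::bigint
;
```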

Then archive old rows once a day:

update t_table
set archive = true
where
    "date" < extract(epoch from current_timestamp - interval '1 week')
    and
    not archive
;

The not archive condition is there to avoid updating the millions of rows that are already archived.
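Applied to the table from the question, the same idea might look like the sketch below. The table name entries and the choice of indexed columns are assumptions made for illustration; the question does not name the table or show its queries.

```sql
-- Hypothetical table name; the columns are those listed in the question.
-- Note: before PostgreSQL 11, adding a column with a default rewrites the table.
alter table entries add column archive boolean not null default false;

-- Partial indexes that cover only the active rows. Which columns to index
-- depends on the real queries; item_id and (row_id, column_id) are guesses.
-- CREATE INDEX CONCURRENTLY avoids blocking writes while the index builds,
-- but it cannot run inside a transaction block.
create index concurrently entries_active_item_idx
    on entries (item_id)
    where not archive
;

create index concurrently entries_active_cell_idx
    on entries (row_id, column_id)
    where not archive
;
```

Because every existing row starts out with archive = false, the first archiving update would touch nearly the whole table, so it may be worth running that first pass in smaller date ranges.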
