Question

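I would like to partition a table in Postgres by values that are not known in advance. In my scenario, the value would be device_id, which is a string.

This is the current situation:

Table 'device_data' stores sensor data sent from devices and is defined by the following DDL: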

CREATE TABLE warehouse.device_data (
  id INTEGER PRIMARY KEY NOT NULL DEFAULT nextval('device_data_id_seq'::regclass),
  device_id TEXT NOT NULL,
  device_data BYTEA NOT NULL,
--   contains additional fields which are omitted for brevity
  received_at TIMESTAMP WITHOUT TIME ZONE DEFAULT now()
);

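The table currently holds millions of records, and queries take a huge amount of time. Most queries contain a WHERE device_id='something' clause.

The solution I have in mind is to create a table partition for each device_id.

Is it possible in Postgres to create a table partition for each device_id?

I went through the Postgres documentation and a couple of examples I found, but all of them use fixed boundaries to create the partitions. My solution would require:

- dynamically creating a new table partition when a new device_id is first encountered
- storing into the existing partition when the device_id is already known and its partition already exists

I would like this to be done with table partitions, because that allows querying across multiple device_ids.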

Answer 1

I like the idea of dynamic partitioning. I do not know how it will affect performance, as I have never used it.

Change the type of id to int default 0 and create the sequence manually, to avoid multiple nextval() calls on a single insert:

create table device_data (
    id int primary key default 0,
    device_id text not null,
    device_data text not null, -- changed for tests
    received_at timestamp without time zone default now()
);
create sequence device_data_seq owned by device_data.id;

Use dynamic SQL in the trigger function:

create or replace function before_insert_on_device_data()
returns trigger language plpgsql as $$
begin
    execute format(
        $f$
            create table if not exists %I (
            check (device_id = %L)
            ) inherits (device_data)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id);
    execute format(
        $f$
            insert into %I
            values (nextval('device_data_seq'), %L, %L, default)
        $f$, 
        concat('device_data_', new.device_id), 
        new.device_id, 
        new.device_data);
    return null;
end $$;

create trigger before_insert_on_device_data
    before insert on device_data
    for each row execute procedure before_insert_on_device_data();
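
Test: the trigger above can be exercised with a few inserts. What follows is only a minimal sketch, assuming the table, sequence, and trigger were created as shown in the current schema; the device ids 'first' and 'second' and the payload strings are made-up sample values, not data from the question.

```sql
-- Assumes device_data, device_data_seq and the before-insert trigger
-- defined above already exist. Sample values are hypothetical.
insert into device_data (device_id, device_data) values
    ('first', 'data 1'),
    ('second', 'data 1'),
    ('first', 'data 2');

-- Selecting from the parent also scans every child table created
-- by the trigger, so all inserted rows are visible here.
select id, device_id, device_data from device_data order by id;

-- Selecting from one child returns only that device's rows.
-- The name is 'device_data_' followed by the device_id.
select id, device_id, device_data from device_data_first order by id;

-- With constraint_exclusion left at its default ('partition'), the
-- plan for this query should skip children whose CHECK constraint
-- rules out device_id = 'first'.
explain select * from device_data where device_id = 'first';
```

Because the trigger returns null, the parent table itself stays empty and every row lives in a child table named after its device_id.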
