Question

I'm currently trying to create a function that will create an index on a table in each schema of my data warehouse. Here is the script I have so far:

create or replace function dwh.loan_type_id_indexing()
returns void language plpgsql AS
$PROC$
Declare
       myschema varchar;
               sql text;        
Begin 
    for myschema in 
        SELECT nspname
          FROM pg_catalog.pg_namespace 
         where nspname not in ('information_schema', 'pg_catalog', 'pg_temp_1',
                               'pg_temp_7', 'pg_toast', 'pg_toast_temp_1',
                               'pg_toast_temp_7','public', 'c1', 'dwh',
                               'users', 'c2'
                              )
         order by nspname
    loop        
        sql = 'CREATE INDEX '|| myschema || '_' ||'type_id ON '|| 
        myschema || '.' ||'.fact_tbl USING btree (loan_type_id)';

        execute sql;

    end loop;
END
$PROC$
volatile;

I know this isn't right, but it should give you an idea of what I'm trying to do.

Answer 1

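Rather than filtering out schemas and assuming every remaining one contains the table you want, query information_schema for the schemas that actually have it and loop over the resulting list: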

select t.table_schema
from information_schema.tables t inner join information_schema.columns c 
  on (t.table_schema = c.table_schema and t.table_name = c.table_name) 
where t.table_name = 'fact_loan' and c.column_name = 'loan_type_id'
  and t.table_schema NOT LIKE 'pg_%'
  and t.table_schema NOT IN ('information_schema', 'ad_delivery', 'dwh', 'users', 'wand');

By looping over the records this query returns, you have everything you need to create the indexes with EXECUTE.

You may also want to add RAISE NOTICE 'Creating index on %.fact_loan.loan_type_id', table_schema; so that you can track progress, since index builds can take a while.
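Putting the pieces above together, the function might look roughly like the sketch below. It reuses the function name from the question and the table and column names from the query above. The variable name target_schema, the table_type = 'BASE TABLE' filter (to skip views), and the IF NOT EXISTS clause (which needs PostgreSQL 9.5 or later) are my own assumptions, so adjust them to your setup. format() with %I quotes the schema and index names safely.

```sql
CREATE OR REPLACE FUNCTION dwh.loan_type_id_indexing()
RETURNS void
LANGUAGE plpgsql
VOLATILE
AS $PROC$
DECLARE
    target_schema text;  -- hypothetical variable name
BEGIN
    -- Loop only over schemas that really contain fact_loan.loan_type_id.
    FOR target_schema IN
        SELECT t.table_schema
          FROM information_schema.tables t
          JOIN information_schema.columns c
            ON t.table_schema = c.table_schema
           AND t.table_name = c.table_name
         WHERE t.table_name = 'fact_loan'
           AND c.column_name = 'loan_type_id'
           AND t.table_type = 'BASE TABLE'  -- assumption: ignore views
           AND t.table_schema NOT LIKE 'pg_%'
           AND t.table_schema NOT IN ('information_schema', 'ad_delivery',
                                      'dwh', 'users', 'wand')
         ORDER BY t.table_schema
    LOOP
        -- RAISE uses a bare % as its placeholder.
        RAISE NOTICE 'Creating index on %.fact_loan.loan_type_id', target_schema;

        -- %I quotes each identifier; the index is named <schema>_type_id,
        -- following the naming in the question.
        EXECUTE format(
            'CREATE INDEX IF NOT EXISTS %I ON %I.fact_loan USING btree (loan_type_id)',
            target_schema || '_type_id',
            target_schema
        );
    END LOOP;
END
$PROC$;
```

Run it with SELECT dwh.loan_type_id_indexing();. All the indexes are built inside that one transaction, so an error in any schema rolls back the indexes already created.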

If you do filter schemas by name, it is better to use schemaname NOT LIKE 'pg_%' AND lower(schemaname) <> 'information_schema', similar to the query above.
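Applied to the pg_namespace query from the question, that filter could look like the sketch below. The column there is nspname; schemaname is the name used by views such as pg_tables. One detail I have added: in LIKE an underscore matches any single character, so 'pg_%' would also exclude a schema named, say, pgfoo. Escaping it as 'pg\_%' restricts the match to names that start with a literal pg_, assuming the default backslash escape character and standard_conforming_strings = on.

```sql
-- List candidate schemas, skipping system schemas by pattern
-- instead of naming pg_temp_1, pg_toast_temp_7, etc. one by one.
SELECT nspname
  FROM pg_catalog.pg_namespace
 WHERE nspname NOT LIKE 'pg\_%'                 -- literal "pg_" prefix
   AND lower(nspname) <> 'information_schema'
 ORDER BY nspname;
```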

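As an aside, I usually find this kind of work more convenient to do from a script outside the database, where I have access to multiple connections, threads/multiprocessing, and so on. A quick Python script using the psycopg2 driver for Pg would let you put something like this together so that, say, four indexes build in parallel; the right number depends on your disk configuration.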
