Postgres PL / pgSQL整合各种表中存在的列

时间:2017-11-26 05:42:18

标签: sql postgresql plpgsql

我正在实施一个工具来清理名为stage的模式中各个表的所有客户名称。客户名称可能来自列billing_acc_namecust_acc_names。我事先并不知道有多少个表有这些列,但只要它们这样做,它们就会成为清理的一部分。

但是,在清理之前,我需要在架构中的表中选择所有唯一的客户名称。

为了更好地分离关注点,我正在考虑在PL / pgSQL中实现它。目前,这就是我在Python / pandas / SQLAlchemy等中实现它的方式。

table_name = 'information_schema.columns'
table_schema_src = 'stage'
cols = ['billing_acc_name', 'cust_acc_name']

# get list of all table names and column names to query in stage schema
sql = text(f""" 
    SELECT table_name, column_name FROM {table_name} WHERE table_schema ='{table_schema_src}'
    AND column_name = ANY(ARRAY{cols})
""")
src = pd.read_sql(sql, con=engine)

# explore implementation in pgsql
# establish query string
cnames = []
for i, row in src.iterrows():
    s = text(f"""
        SELECT DISTINCT upper({row['column_name']}) AS cname FROM stage.{row['table_name']}
    """)
    cnames.append(str(s).strip())

sql = ' UNION '.join(cnames)

df = pd.read_sql(sql, con=engine)

自动生成的SQL查询字符串如下:

SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyA UNION 
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyA UNION 
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyB UNION 
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyB UNION 
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyC UNION 
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyC UNION 
SELECT DISTINCT upper(cust_acc_name) AS cname FROM stage.journal_2017_companyD UNION 
SELECT DISTINCT upper(billing_acc_name) AS cname FROM stage.journal_2017_companyD

1 个答案:

答案 0 :(得分:1)

plpgsql函数可能如下所示:

create or replace function select_acc_names(_schema text)
returns setof text language plpgsql as $$
declare
    rec record;
begin
    for rec in
        select table_name, column_name
        from information_schema.columns
        where table_schema = _schema
        and column_name = any(array['cust_acc_name', 'billing_acc_name'])
    loop
        return query
        execute format ($fmt$
            select upper(%I) as cname
            from %I.%I
            $fmt$, rec.column_name, _schema, rec.table_name);
    end loop;
end $$;

使用:

select *
from select_acc_names('stage');