Question

My PostgreSQL 9.5 database has a table with two columns, start_time (timestamp without time zone) and values, holding records like this:

Start_time              Values
2003-06-07 00:00:00     12
2004-02-03 00:00:00     16
2005-07-09 00:00:00     14
2003-07-07 00:00:00     17
2004-01-31 00:00:00     11
2005-05-02 00:00:00     10

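Based on start_time, I need to export the records of my_table so that one CSV file is produced per year (the records of each year go into a separate CSV file).

Expected output: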

results_2003.csv
results_2004.csv
results_2005.csv
and so on...

How can I do this?

Answer 1

Use a plpgsql DO block that runs the copy command through a dynamic execute format, for example:

do $$
declare
    y int;
begin
    for y in
        select distinct extract(year from start_time)
        from my_table
    loop
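        -- Build and run one COPY per year; %1$s is replaced by the year.
        -- Without the csv format option, COPY writes its default text format.
        execute format($ex$
            copy (
                select *
                from my_table
                where extract(year from start_time) = %1$s
                )
            to '\data\%1$s.csv'
            with (format csv, header)
            $ex$, y);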
    end loop;
end $$;
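
Note that copy ... to a file writes on the database server and, in 9.5, requires superuser rights. If that is not an option, the same per-year split can be done on the client. The following is only a minimal sketch in Python, assuming the rows have already been fetched from my_table as (start_time, value) tuples through some database driver; the function name export_by_year, the out_dir parameter and the header names are hypothetical and not part of the answer above.

```python
import csv
import os
from collections import defaultdict
from datetime import datetime


def export_by_year(rows, out_dir="."):
    """Write one results_<year>.csv per distinct year of start_time.

    rows: iterable of (start_time, value) tuples, start_time being a
    datetime (assumed to have been fetched from my_table beforehand).
    Returns the paths of the files written, ordered by year.
    """
    # Group the rows by the year of their timestamp.
    by_year = defaultdict(list)
    for start_time, value in rows:
        by_year[start_time.year].append((start_time, value))

    paths = []
    for year in sorted(by_year):
        path = os.path.join(out_dir, f"results_{year}.csv")
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["start_time", "values"])  # header row
            for start_time, value in by_year[year]:
                writer.writerow([start_time.isoformat(sep=" "), value])
        paths.append(path)
    return paths
```

Called with the sample data from the question, this would write results_2003.csv, results_2004.csv and results_2005.csv into out_dir, keeping the rows of each year in their original order.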

Answer 2

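Among several possible alternatives, I would use execsql.py (https://pypi.python.org/pypi/execsql/ - disclaimer: I wrote it) with the following script, which refers to the table as interval_table: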

select distinct 
    extract(year from start_time) as start_year,
    False as exported
into temporary table tt_years
from interval_table;

create temporary view unexported as
select * from tt_years
where exported = False
limit 1;

-- !x! begin script export_year
-- !x! select_sub unexported
-- !x! if(sub_defined(@start_year))
    create temporary view export_data as
    select * from interval_table
    where extract(year from start_time) = !!@start_year!!;
    -- !x! export export_data to results_!!@start_year!!.csv as csv
    update tt_years
    set exported = True
    where start_year = !!@start_year!!;
    -- !x! execute script export_year
-- !x! endif
-- !x! end script

-- !x! execute script export_year

The !x! tokens mark execsql metacommands, which are what make the looping (through tail recursion) and the export to CSV possible.

PostgreSQL: How to split/export table records into yearly slices (CSV)?