Importing a large number of rows into Postgres with duplicate key violations

Time: 2017-05-01 19:31:11

Tags: sql postgresql import key postgresql-copy

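Is it possible in Postgres 9.6 to get COPY UPSERT functionality, with something like an ON DUPLICATE KEY clause? I have a CSV file that I am importing into Postgres, but it may contain rows that violate a unique key, so the COPY command raises an error and terminates as soon as it hits one.

The file is very large, so pre-processing it in application code (to deal with the rows that would cause duplicate key violations) may not be possible, because all the keys might not fit in memory.

What is the best way to import a very large number of rows into Postgres when some of them may cause duplicate key violations?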

2 answers:

Answer 0 (score: 1)

Sample:

t=# create table s90(i int primary key, t text);
CREATE TABLE
t=# insert into s90 select 1,'a';
INSERT 0 1
t=# copy s90 from stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1,'b'
>> 2,'c'
>> \.
ERROR:  duplicate key value violates unique constraint "s90_pkey"
DETAIL:  Key (i)=(1) already exists.
CONTEXT:  COPY s90, line 1

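Workaround for COPY: load into a staging table without the primary key, then insert only the rows whose key is not already present:

t=# create table s91 as select * from s90 where false;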
SELECT 0
t=# copy s91 from stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1,'b'
>> 2,'c'
>> \.
COPY 2
t=# with p as (select s91.* from s91 left outer join s90 on s90.i=s91.i where s90.i is null)
insert into s90 select * from p;
INSERT 0 1
t=# select * from s90;
 i |  t
---+-----
 1 | a
 2 | 'c'
(2 rows)
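
Since Postgres 9.5 the anti-join above can also be written with ON CONFLICT DO NOTHING. The left join only filters out keys that already exist in s90, so it would still fail if the file repeated a key; DO NOTHING skips those rows as well. A minimal sketch, assuming the same s90 table; the staging table name s90_staging and the file path are made up for illustration:

```sql
-- Staging table with the same columns as s90 but no primary key,
-- so COPY cannot fail on duplicate keys.
create temp table s90_staging (like s90);

-- Hypothetical server-side path; from psql, \copy reads a client-side file instead.
copy s90_staging from '/path/to/file.csv' with (format csv);

-- Rows whose key already exists in s90, or was already inserted earlier
-- in this same statement, are skipped instead of raising an error.
insert into s90
select * from s90_staging
on conflict (i) do nothing;

drop table s90_staging;
```

Which of several rows sharing a key survives depends on the order in which the staging table is scanned, so add an explicit rule (for example distinct on with an order by) if that matters.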

Answer 1 (score: 0)

With the file_fdw extension you can open the file and query it as if it were a table.

Read more in the documentation.

Example:

create extension if not exists file_fdw;

create server csv_server foreign data wrapper file_fdw;

create foreign table my_csv_file (
    id integer,
    should_be_unique_id integer,
    some_other_columns text
) server csv_server
options (filename '/data/my_large_file.csv', format 'csv');

insert into my_new_table
select distinct on (should_be_unique_id) *
from my_csv_file
order by should_be_unique_id, id desc;

Or, if my_new_table is not empty, you can use

insert into my_new_table
select * 
from my_csv_file
on conflict ... do update ...
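
For reference, a filled-in version of that upsert might look like the sketch below. It assumes that my_new_table has the same three columns as the foreign table and a unique constraint or index on should_be_unique_id; neither is stated in the answer. The distinct on is kept because ON CONFLICT DO UPDATE raises an error if a single statement tries to update the same target row twice, which would happen if the file repeated a key.

```sql
-- Assumption: my_new_table(id, should_be_unique_id, some_other_columns)
-- with a unique constraint on should_be_unique_id.
insert into my_new_table (id, should_be_unique_id, some_other_columns)
select distinct on (should_be_unique_id)
       id, should_be_unique_id, some_other_columns
from my_csv_file
order by should_be_unique_id, id desc   -- keep the row with the highest id per key
on conflict (should_be_unique_id) do update
set id = excluded.id,
    some_other_columns = excluded.some_other_columns;
```

Use do nothing in place of the do update clause if existing rows should win over the rows in the file.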