Can I use copy_expert / copy_from / pandas to handle nulls in a quote-enclosed CSV?

Asked: 2019-05-06 03:34:44

Tags: python postgresql csv

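I am building a tool that loads CSV files into a Postgres database, but I cannot get null values handled. The error occurs because an int-typed field is empty in the source CSV file. If possible, I would like to handle this in Python and avoid any changes to the CSV extract. The simplified CSV format is as follows: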

My fields and formats: Field1 (Int), Field2 (Varchar)

Sample CSV snapshot:

"1","abc,,sdas""ds,dsd,a"sdasdasda"

"","asdasd,"",,""<"<"//"

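I have looked at the copy_from and copy_expert options. However, copy_from does not allow quote-enclosed fields, and copy_expert does not seem to offer null handling. I also tried replacing the nulls with pandas, but pandas does not parse the fields that contain multiple quotes either.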

# pandas attempt 1: regex separator with QUOTE_ALL (fails)
import csv
import pandas as pd
flights = pd.read_csv('sample.csv', sep=r',\s*', skipinitialspace=True, quoting=csv.QUOTE_ALL, engine='python')
flights.shape

ParserError: Expected 8 fields in line 6, saw 9. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

# pandas attempt 2: plain comma separator (fails)
import pandas as pd
flights = pd.read_csv('sample.csv', sep=',')
flights.shape

ParserError: Error tokenizing data. C error: Expected 8 fields in line 6, saw 9
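One possible workaround, sketched here under assumptions rather than as a confirmed fix: parse the file with Python's built-in csv module, which understands doubled quotes inside quoted fields, and rewrite it so that empty values in the int column are emitted unquoted. In COPY's CSV format an unquoted empty value is read as NULL by default, while a quoted empty string stays an empty string. The helper names `quote_field` and `blank_to_null` are hypothetical, and the sketch assumes the file is well-formed CSV; if the real extract has unbalanced quotes (which the field-count errors above may hint at), the csv module would misparse it as well.

```python
import csv
import io


def quote_field(value):
    """Wrap a value in double quotes, doubling any embedded quotes."""
    return '"' + value.replace('"', '""') + '"'


def blank_to_null(src, null_columns):
    """Return a StringIO copy of `src` where empty values in the given
    column indexes are written unquoted (read as NULL by COPY ... CSV)
    and every other value is written quoted."""
    out = io.StringIO()
    for row in csv.reader(src):
        fields = []
        for index, value in enumerate(row):
            if index in null_columns and value.strip() == "":
                fields.append("")  # unquoted empty value
            else:
                fields.append(quote_field(value))
        out.write(",".join(fields) + "\n")
    out.seek(0)  # rewind so the buffer can be read from the start
    return out


# Small illustrative input (made up for this sketch, not the real extract).
sample = io.StringIO('"1","abc,,sdas""ds"\n"","x,""y"""\n')
cleaned = blank_to_null(sample, {0})
print(cleaned.getvalue())
```

The returned buffer is file-like, so it could be passed in place of the open file, for example `cur.copy_expert("COPY abcd FROM STDIN WITH (FORMAT CSV)", cleaned)`.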



# copy_expert attempt (fails)
import psycopg2
conn = psycopg2.connect(user="user",
                        password="password",
                        host="1.1.1.1",
                        port="1111",
                        database="Test_1")
cur = conn.cursor()
with open('sample.csv', 'r') as f:
    cur.copy_expert("""COPY abcd FROM STDIN WITH (FORMAT CSV)""", f)

conn.commit()

DataError: invalid input syntax for integer: ""
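For reference, PostgreSQL's COPY has a FORCE_NULL option for CSV input (available from version 9.4) that matches the listed columns against the null string even when the value is quoted, so a quoted empty string `""` is loaded as NULL. The following is a minimal sketch of how that could be combined with copy_expert. The function names `force_null_copy_sql` and `load_csv` are hypothetical, and the column name `field1` is an assumption, since the actual column names of table abcd are not shown here.

```python
def force_null_copy_sql(table, null_columns):
    """Build a COPY statement that loads quoted empty strings in
    `null_columns` as NULL via the FORCE_NULL option."""
    # Rough safety check only: the names are interpolated into SQL text.
    for name in (table, *null_columns):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(null_columns)
    return f"COPY {table} FROM STDIN WITH (FORMAT CSV, FORCE_NULL ({columns}))"


def load_csv(cur, path, table, null_columns):
    """Stream a CSV file into `table`; `cur` is assumed to be a psycopg2
    cursor, whose copy_expert(sql, file) method is used."""
    with open(path, "r", newline="") as f:
        cur.copy_expert(force_null_copy_sql(table, null_columns), f)


# 'field1' is an assumed column name for the int column.
print(force_null_copy_sql("abcd", ["field1"]))
```

With an open connection this might be called as `load_csv(conn.cursor(), 'sample.csv', 'abcd', ['field1'])` followed by `conn.commit()`. Because FORCE_NULL is applied per column, quoted empty strings in the varchar column would still load as empty strings.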




# copy_from attempt (fails)
import psycopg2
conn = psycopg2.connect(user="user",
                        password="password",
                        host="1.1.1.1",
                        port="1111",
                        database="Test_1")
cur = conn.cursor()
with open('sample.csv', 'r') as f:
    cur.copy_from(f, 'abcd', sep=',', null='None')

conn.commit()

DataError: invalid input syntax for integer: ""


I expect Postgres to accept the file and load the rows as shown below:


Field1 (Int),Field2 (Varchar)


1,abc,,sdas""ds,dsd,a"sdasdasda
,asdasd,"",,,"""<"<"//

0 answers:

No answers