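I am trying to load rows of CSV-like data into Postgres using copy_from (the psycopg2 function that wraps the Postgres COPY command). My data is comma-delimited (and unfortunately, since I am not the data owner, I can't just change the delimiter). I run into a problem when I try to load a row that contains a quoted value with a comma in it (i.e. that comma should not be treated as a delimiter).
For example, this row of data is fine: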
",Madrid,SN,,SEN,,,SN,173,157"
This row of data is not fine:
","Dominican, Republic of",MC,,YUO,,,MC,65,162",
Some code:
conn = get_psycopg_conn()
cur = conn.cursor()
_io_buffer.seek(0) #This buffer is holding the csv-like data
cur.copy_from(_io_buffer, str(table_name), sep=',', null='', columns=column_names)
conn.commit()
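To make the failure concrete, here is a minimal sketch that uses only Python's standard csv module (no database involved). It assumes the problem row is the one shown above, without the outer quotes added for display. Splitting on every comma, which is roughly what a text-format COPY with sep=',' does, yields one field too many, while a CSV-aware parser keeps the quoted value together.

```python
import csv
import io

# The problematic row from the example above (assumed to be the raw line in the buffer).
row = ',"Dominican, Republic of",MC,,YUO,,,MC,65,162\n'

# Naive split: every comma is a delimiter, quotes mean nothing.
naive_fields = row.rstrip("\n").split(",")

# CSV-aware parse: a comma inside double quotes is part of the value.
csv_fields = next(csv.reader(io.StringIO(row)))

print(len(naive_fields))  # one more field than the table has columns
print(len(csv_fields))    # matches the 10 fields of the "good" row
print(csv_fields[1])
```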
Answer 0: (score: 16)
It looks like copy_from doesn't expose the csv mode or quote options that are available from the underlying PostgreSQL COPY command. So you'll need to either patch psycopg2 to add them, or use copy_expert.
I haven't tried it, but something like
curs.copy_expert("""COPY mytable FROM STDIN WITH (FORMAT CSV)""", _io_buffer)
will likely be sufficient.
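If you also need the columns= and null='' behaviour of the original copy_from call, a rough sketch along these lines might work. The helper names (quote_ident, build_copy_sql, copy_csv) are made up for illustration and are not part of psycopg2; only cursor.copy_expert(sql, file) is the real API. Note the assumptions: identifiers are double-quoted, so they become case-sensitive, and a schema-qualified table name would need each part quoted separately.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes.

    Hypothetical helper; quoting makes the identifier case-sensitive.
    """
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table_name, column_names):
    """Build a COPY ... FROM STDIN statement in CSV mode.

    NULL '' treats an unquoted empty field as NULL, mirroring null=''
    in the copy_from call from the question.
    """
    cols = ", ".join(quote_ident(c) for c in column_names)
    return "COPY {} ({}) FROM STDIN WITH (FORMAT CSV, NULL '')".format(
        quote_ident(table_name), cols
    )


def copy_csv(cur, buffer, table_name, column_names):
    """Rewind the buffer and stream it to Postgres through copy_expert."""
    buffer.seek(0)
    cur.copy_expert(build_copy_sql(table_name, column_names), buffer)
```

With the names from the question, the call would be copy_csv(cur, _io_buffer, str(table_name), column_names), followed by conn.commit().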
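Answer 1: (score: 0)
I had the same error and was able to get close to a fix based on the one-liner listed by craig-ringer. The other thing I needed was to include quotes for the initial object, using df.to_csv(index=False,header=False, quoting=csv.QUOTE_NONNUMERIC,sep=',') and specifically quoting=csv.QUOTE_NONNUMERIC.
A full example, pulling one data source from MySQL and storing it in Postgres, is below: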
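To see what that quoting flag changes, here is a small sketch using the standard csv module directly (the same csv.QUOTE_* constants are what get passed to to_csv). The sample row is made up for illustration. As far as I can tell, the default QUOTE_MINIMAL already quotes a field that contains the delimiter, and QUOTE_NONNUMERIC additionally quotes every non-numeric field.

```python
import csv
import io

# Made-up sample row: one text value with a comma, one plain text value, two numbers.
rows = [["Dominican, Republic of", "MC", 65, 162.5]]


def to_csv_text(rows, quoting):
    """Serialize rows to CSV text with the given csv.QUOTE_* mode."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


minimal = to_csv_text(rows, csv.QUOTE_MINIMAL)
nonnumeric = to_csv_text(rows, csv.QUOTE_NONNUMERIC)

print(minimal, end="")     # only the field containing a comma is quoted
print(nonnumeric, end="")  # every non-numeric field is quoted
```

Either output should be readable by COPY ... WITH (FORMAT CSV), since both quote the value that contains the comma.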
# run in python 3.6
import csv
from io import StringIO

import MySQLdb
import pandas as pd
import psycopg2

mysql_db = MySQLdb.connect(host="host_address",  # your host, usually localhost
                           user="user_name",     # your username
                           passwd="source_pw",   # your password
                           db="source_db")       # name of the database
postgres_db = psycopg2.connect("host=dest_address dbname=dest_db_name user=dest_user password=dest_pw")

my_list = ['1', '2', '3', '4']

for item in my_list:
    # Pull cbi data for each state and write it to postgres
    print(item)
    # NOTE: plain string concatenation; only safe because my_list is hard-coded
    mysql_sql = ("select * from my_table t "
                 "where t.important_feature = '" + item + "';")

    # Do something to create your dataframe here...
    df = pd.read_sql_query(mysql_sql, mysql_db)

    # Initialize a string buffer
    sio = StringIO()
    # Write the Pandas DataFrame as a csv to the buffer
    sio.write(df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','))
    sio.seek(0)  # Be sure to reset the position to the start of the stream

    # Copy the string buffer to the database, as if it were an actual file
    with postgres_db.cursor() as c:
        print(c)
        # schema-qualified table names are written schema.table
        c.copy_expert("""COPY schema.new_table FROM STDIN WITH (FORMAT CSV)""", sio)
        postgres_db.commit()

mysql_db.close()
postgres_db.close()