I want to use the read_csv function in Pandas to read data from a CSV file (UTF-8) into Pandas and then load it into PostgreSQL (UTF-8).
Code snippet:
# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
import pandas as pd
from datetime import datetime
from datetime import date
import locale
locale.getpreferredencoding(True)
df = pd.read_csv(full_path, encoding='utf-8-sig', sep=';')
for col in date_cols:
    if col in df.columns:
        for index, row in df.iterrows():
            date_con = df.ix[index, col]
            if isinstance(date_con, unicode):
                py_date = datetime.strptime(str(date_con), '%d.%m.%Y')
                if py_date > pd.Timestamp.max:
                    df.ix[index, col] = pd.Timestamp.max
                else:
                    x = py_date
                    df.ix[index, col] = x.date()
dict_items = df.to_dict(orient='records')
connection.execute(table.insert(), dict_items)
The error message is:
(<class 'sqlalchemy.exc.DataError'>, DataError('(psycopg2.DataError) invalid input syntax for type date: ""\nLINE 1: ...hsing@redline.de\', \'1983-03-12\'::date, \'D\', \'\', \'\', \'\'...\n ^\n',), <traceback object at 0x1100ef050>)
A sample row from dict_items:
{u'fname': u'Henry', u'sback':'', u'birthdt': datetime.date(1983, 3, 12), 'input': datetime.date(2017, 12, 27), u'email': u'hsing@redline.de', u'lname': u'Sting', u'country': u'DE', u'date_end': datetime.date(2019, 11, 12)}
I can't figure out what is wrong with the date syntax here. Any ideas on what I could check or change?
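
One thing that might be worth checking: the error complains about an empty string ("") being used as a date, and the sample row does contain an empty-string value (sback). If one of the date columns in the table receives '' instead of NULL, PostgreSQL rejects it. Below is a minimal sketch, not a confirmed fix, that replaces empty strings with None before the insert; psycopg2 sends None as NULL. The helper name blank_to_none is made up for illustration, the sample row is trimmed from the one above, and the code assumes Python 3 (on Python 2 the isinstance check would need basestring instead of str). It also assumes the target columns are nullable.

```python
import datetime


def blank_to_none(records):
    """Return a copy of `records` with empty or whitespace-only strings set to None.

    Hypothetical helper for illustration; it does not modify the input list.
    """
    cleaned = []
    for record in records:
        cleaned.append({
            key: (None if isinstance(value, str) and value.strip() == "" else value)
            for key, value in record.items()
        })
    return cleaned


# Sample row modeled on the dict_items example in the question (trimmed).
dict_items = [{
    "fname": "Henry",
    "sback": "",  # empty string: a likely candidate for the failing value
    "birthdt": datetime.date(1983, 3, 12),
    "country": "DE",
}]

cleaned_items = blank_to_none(dict_items)

# List the keys that were blank, to see which columns are affected.
blank_keys = sorted(k for k, v in cleaned_items[0].items() if v is None)
print(blank_keys)
```

With something like this in place, cleaned_items rather than dict_items would be passed to connection.execute(table.insert(), ...). Printing blank_keys first shows whether any of the blank columns is actually declared as date in the table.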