SQLALCHEMY / PANDAS - SQLAlchemy读取列为Pandas to_sql的CLOB

时间:2015-03-30 14:41:34

标签: python pandas sqlalchemy

我已经将数据集写入数据框。

inv.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43839 entries, 0 to 43838
Data columns (total 16 columns):
MST_CO                 43839 non-null object
LOAD_DATE              43839 non-null object
WHSE_CODE              43839 non-null object
ITEM_NO                43839 non-null object
ITEM_ID                43839 non-null int64
LOCATION               43839 non-null object
LOT_NO                 43839 non-null object
LOT_STATUS             43833 non-null object
LOT_CREATED_DATE       43839 non-null datetime64[ns]
LOT_EXPIRATION_DATE    43839 non-null object
DATE_RECEIVED          43839 non-null datetime64[ns]
ONHAND_QTY1            43839 non-null float64
UOM1                   43839 non-null object
ONHAND_QTY2            43418 non-null float64
UOM2                   43408 non-null object
SOURCE                 43839 non-null object
dtypes: datetime64[ns](2), float64(2), int64(1), object(11)

当我尝试将数据帧写入SQL时,我得到以下错误。

inv.to_sql('inventory', db2, 'MST', if_exists='append', index=False, chunksize=3000)
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine SELECT "SYSCAT"."TABLES"."TABNAME" 
FROM "SYSCAT"."TABLES" 
WHERE "SYSCAT"."TABLES"."TABSCHEMA" = ? AND "SYSCAT"."TABLES"."TABNAME" = ?
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine ('DWETL', 'INVENTORY')
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE inventory (
"MST_CO" CLOB, 
"LOAD_DATE" DATE, 
"WHSE_CODE" CLOB, 
"ITEM_NO" CLOB, 
"ITEM_ID" BIGINT, 
"LOCATION" CLOB, 
"LOT_NO" CLOB, 
"LOT_STATUS" CLOB, 
"LOT_CREATED_DATE" TIMESTAMP, 
"LOT_EXPIRATION_DATE" DATE, 
"DATE_RECEIVED" TIMESTAMP, 
"ONHAND_QTY1" FLOAT(53), 
"UOM1" CLOB, 
"ONHAND_QTY2" FLOAT(53), 
"UOM2" CLOB, 
"SOURCE" CLOB
)
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine ()
2015-03-30 09:33:10,755 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 977, in to_sql
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 538, in to_sql
chunksize=chunksize, dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 1176, in to_sql
table.create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 649, in create
self._execute_create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 634, in _execute_create
self.table.create()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 707, in create
checkfirst=checkfirst)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1728, in _run_visitor
conn._run_visitor(visitorcallable, element, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1357, in _run_visitor
**kwargs).traverse_single(element)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 120, in traverse_single
return meth(obj, **kw)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 732, in visit_table
self.connection.execute(CreateTable(table))
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 69, in _execute_on_connection
return connection._execute_ddl(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 895, in _execute_ddl
compiled
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1271, in _handle_dbapi_exception
exc_info
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_sa/ibm_db.py", line 106, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1335, in execute
self._execute_helper(parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1247, in _execute_helper
raise self.messages[len(self.messages) - 1]
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ibm_db_dbi::ProgrammingError: Statement Execute Failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL1666N  The table definition statement failed because some functionality was specified in the table definition that is not supported with the table type.  Unsupported functionality: "CLOB".  SQLSTATE=42613 SQLCODE=-1666 '\nCREATE TABLE inventory (\n\t"MST_CO" CLOB, \n\t"LOAD_DATE" DATE, \n\t"WHSE_CODE" CLOB, \n\t"ITEM_NO" CLOB, \n\t"ITEM_ID" BIGINT, \n\t"LOCATION" CLOB, \n\t"LOT_NO" CLOB, \n\t"LOT_STATUS" CLOB, \n\t"LOT_CREATED_DATE" TIMESTAMP, \n\t"LOT_EXPIRATION_DATE" DATE, \n\t"DATE_RECEIVED" TIMESTAMP, \n\t"ONHAND_QTY1" FLOAT(53), \n\t"UOM1" CLOB, \n\t"ONHAND_QTY2" FLOAT(53), \n\t"UOM2" CLOB, \n\t"SOURCE" CLOB\n)\n\n' ()

我怎样才能让SQLAlchemy和Pandas玩得很好?我只需将CLOB转换为STR即可。谢谢!

1 个答案:

答案 0 :(得分:2)

您可以使用dtype关键字参数指定要用于特定列的SQL类型(请参阅docs):

from sqlalchemy.types import String
inv.to_sql('inventory', db2, dtype={'col_name': String})

默认情况下,pandas使用TEXT type作为对象/字符串列,sqlalchemy将其映射到CLOBTEXT,但显然您的数据库不知道此类型。因此,通过上述操作,您可以手动指定数据库确实知道的类型。

您似乎有多列。因此,要为所有对象的dtyped列自动进行此映射。您可以使用dtype对象选择列:

cols = df.dtypes[df.dtypes=='object'].index
type_mapping = {col : String for col in cols }
inv.to_sql('inventory', db2, dtype=type_mapping)