我已经将数据集写入数据框。
inv.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43839 entries, 0 to 43838
Data columns (total 16 columns):
MST_CO 43839 non-null object
LOAD_DATE 43839 non-null object
WHSE_CODE 43839 non-null object
ITEM_NO 43839 non-null object
ITEM_ID 43839 non-null int64
LOCATION 43839 non-null object
LOT_NO 43839 non-null object
LOT_STATUS 43833 non-null object
LOT_CREATED_DATE 43839 non-null datetime64[ns]
LOT_EXPIRATION_DATE 43839 non-null object
DATE_RECEIVED 43839 non-null datetime64[ns]
ONHAND_QTY1 43839 non-null float64
UOM1 43839 non-null object
ONHAND_QTY2 43418 non-null float64
UOM2 43408 non-null object
SOURCE 43839 non-null object
dtypes: datetime64[ns](2), float64(2), int64(1), object(11)
当我尝试将数据帧写入SQL时,我得到以下错误。
inv.to_sql('inventory', db2, 'MST', if_exists='append', index=False, chunksize=3000)
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine SELECT "SYSCAT"."TABLES"."TABNAME"
FROM "SYSCAT"."TABLES"
WHERE "SYSCAT"."TABLES"."TABSCHEMA" = ? AND "SYSCAT"."TABLES"."TABNAME" = ?
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine ('DWETL', 'INVENTORY')
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine
CREATE TABLE inventory (
"MST_CO" CLOB,
"LOAD_DATE" DATE,
"WHSE_CODE" CLOB,
"ITEM_NO" CLOB,
"ITEM_ID" BIGINT,
"LOCATION" CLOB,
"LOT_NO" CLOB,
"LOT_STATUS" CLOB,
"LOT_CREATED_DATE" TIMESTAMP,
"LOT_EXPIRATION_DATE" DATE,
"DATE_RECEIVED" TIMESTAMP,
"ONHAND_QTY1" FLOAT(53),
"UOM1" CLOB,
"ONHAND_QTY2" FLOAT(53),
"UOM2" CLOB,
"SOURCE" CLOB
)
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine ()
2015-03-30 09:33:10,755 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 977, in to_sql
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 538, in to_sql
chunksize=chunksize, dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 1176, in to_sql
table.create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 649, in create
self._execute_create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 634, in _execute_create
self.table.create()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 707, in create
checkfirst=checkfirst)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1728, in _run_visitor
conn._run_visitor(visitorcallable, element, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1357, in _run_visitor
**kwargs).traverse_single(element)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 120, in traverse_single
return meth(obj, **kw)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 732, in visit_table
self.connection.execute(CreateTable(table))
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 69, in _execute_on_connection
return connection._execute_ddl(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 895, in _execute_ddl
compiled
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1271, in _handle_dbapi_exception
exc_info
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_sa/ibm_db.py", line 106, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1335, in execute
self._execute_helper(parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1247, in _execute_helper
raise self.messages[len(self.messages) - 1]
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ibm_db_dbi::ProgrammingError: Statement Execute Failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL1666N The table definition statement failed because some functionality was specified in the table definition that is not supported with the table type. Unsupported functionality: "CLOB". SQLSTATE=42613 SQLCODE=-1666 '\nCREATE TABLE inventory (\n\t"MST_CO" CLOB, \n\t"LOAD_DATE" DATE, \n\t"WHSE_CODE" CLOB, \n\t"ITEM_NO" CLOB, \n\t"ITEM_ID" BIGINT, \n\t"LOCATION" CLOB, \n\t"LOT_NO" CLOB, \n\t"LOT_STATUS" CLOB, \n\t"LOT_CREATED_DATE" TIMESTAMP, \n\t"LOT_EXPIRATION_DATE" DATE, \n\t"DATE_RECEIVED" TIMESTAMP, \n\t"ONHAND_QTY1" FLOAT(53), \n\t"UOM1" CLOB, \n\t"ONHAND_QTY2" FLOAT(53), \n\t"UOM2" CLOB, \n\t"SOURCE" CLOB\n)\n\n' ()
我怎样才能让SQLAlchemy和Pandas玩得很好?我只需将CLOB转换为STR即可。谢谢!
答案 0 :(得分:2)
您可以使用dtype
关键字参数指定要用于特定列的SQL类型(请参阅docs):
from sqlalchemy.types import String
inv.to_sql('inventory', db2, dtype={'col_name': String})
默认情况下,pandas使用TEXT
type作为对象/字符串列,sqlalchemy将其映射到CLOB
或TEXT
,但显然您的数据库不知道此类型。因此,通过上述操作,您可以手动指定数据库确实知道的类型。
您似乎有多列。因此,要为所有对象的dtyped列自动进行此映射。您可以使用dtype对象选择列:
cols = df.dtypes[df.dtypes=='object'].index
type_mapping = {col : String for col in cols }
inv.to_sql('inventory', db2, dtype=type_mapping)