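I am trying to upload a pandas DataFrame to a SQL table. It seems to me that the pandas to_sql function is the best solution for larger DataFrames, but I can't get it to work. I can read data without any trouble, but I get an error message when I try to write it to a new table: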
# connect to Exasol DB
exaString='DSN=exa'
conDB = pyodbc.connect(exaString)
# get some data from somewhere, works without error
sqlString = "SELECT * FROM SOMETABLE"
data = pd.read_sql(sqlString, conDB)
# now upload this data to a new table
data.to_sql('MYTABLENAME', conDB, flavor='mysql')
conDB.close()
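The error message I get is:
pyodbc.ProgrammingError: ('42000', "[42000] [EXASOL][EXASolution driver]syntax error, unexpected identifier_chain2, expecting assignment_operator or ':' [line 1, column 6] (-1) (SQLExecDirectW)")
Unfortunately, I have no idea what the query that caused this syntax error looks like, or what else might be wrong. Can someone point me in the right direction?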
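One way to get a feel for the SQL that pandas generates is to ask it for the CREATE TABLE statement without executing anything. The sketch below uses pandas.io.sql.get_schema with an in-memory SQLite connection as a stand-in, because an Exasol DSN is not assumed to be available here; the frame contents are made up for illustration. It shows the shape of the DDL only. The text sent to Exasol will differ, and the failing statement could just as well be another one that to_sql issues, for example a table-existence check.

```python
import sqlite3

import pandas as pd
from pandas.io import sql as pd_sql

# Hypothetical sample data; in the question this would come from pd.read_sql.
frame = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# SQLite stands in for the pyodbc connection (assumption for illustration).
con = sqlite3.connect(":memory:")

# Build the CREATE TABLE statement pandas would use, without running it.
ddl = pd_sql.get_schema(frame, "MYTABLENAME", con=con)
print(ddl)

con.close()
```

Comparing such a statement with what the target database accepts (quoting characters, type names, placeholders) usually narrows down where a dialect mismatch comes from.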
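(Second) edit:
Following Humayun's and Joris's suggestions, I am now using pandas version 0.14 and SQLAlchemy together with the Exasol dialect (?). Since I am connecting to a defined schema, I use the meta data option, but the program crashes with "Bus error (core dumped)".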
engine = create_engine('exa+pyodbc://uid:passwd@exa/mySchemaName', echo=True)
# get some data
sqlString = "SELECT * FROM SOMETABLE" # SOMETABLE is a view in mySchemaName
df = pd.read_sql(sqlString, con=engine) # works
print engine.has_table('MYTABLENAME') # MYTABLENAME is a view in mySchemaName
# prints "True"
# upload it to a new table
meta = sqlalchemy.MetaData(engine, schema='mySchemaName')
meta.reflect(engine, schema='mySchemaName')
pdsql = sql.PandasSQLAlchemy(engine, meta=meta)
pdsql.to_sql(df, 'MYTABLENAME')
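I am not sure about setting "mySchemaName" in create_engine(..), but the result is the same either way.
Answer 0: (score: 1)
Pandas does not support the EXASOL syntax out of the box, so it needs to be adjusted a little. Here is a working example of code without SQLAlchemy: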
import pyodbc
import pandas as pd
con = pyodbc.connect('DSN=EXA')
con.execute('OPEN SCHEMA TEST2')
# configure pandas to understand EXASOL as mysql flavor
pd.io.sql._SQL_TYPES['int']['mysql'] = 'INT'
pd.io.sql._SQL_SYMB['mysql']['br_l'] = ''
pd.io.sql._SQL_SYMB['mysql']['br_r'] = ''
pd.io.sql._SQL_SYMB['mysql']['wld'] = '?'
pd.io.sql.PandasSQLLegacy.has_table = \
lambda self, name: name.upper() in [t[0].upper() for t in con.execute('SELECT table_name FROM cat').fetchall()]
data = pd.read_sql('SELECT * FROM services', con)
data.to_sql('SERVICES2', con, flavor = 'mysql', index = False)
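The settings patched above make pandas drop the identifier quoting and use '?' as the parameter placeholder. For comparison, here is a minimal sketch of the same kind of upload done by hand through the DB-API, without touching pandas internals. It is not part of the original answer: upload_frame is a hypothetical helper, the table must already exist, and an in-memory SQLite connection stands in for the pyodbc connection (both use '?' placeholders).

```python
import sqlite3

import pandas as pd


def upload_frame(con, frame, table):
    """Insert every row of `frame` into an existing `table` via '?' placeholders.

    Hypothetical helper; `table` and the column names are inserted unquoted,
    so they must be trusted, plain identifiers.
    """
    columns = ", ".join(frame.columns)
    placeholders = ", ".join("?" for _ in frame.columns)
    insert_sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    # Cast to object so NumPy scalars become plain Python values for the driver.
    rows = list(frame.astype(object).itertuples(index=False, name=None))
    cur = con.cursor()
    cur.executemany(insert_sql, rows)
    con.commit()
    return insert_sql


# SQLite stands in for pyodbc.connect('DSN=EXA') (assumption for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE services2 (id INTEGER, name TEXT)")

data = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
statement = upload_frame(con, data, "services2")
print(statement)
```

With a real pyodbc connection the same function should apply unchanged in principle, although type handling (dates, NULLs) may need extra care depending on the driver.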
If you use the EXASolution Python package, the code looks like this:
import exasol
con = exasol.connect(dsn='EXA') # normal pyodbc connection with additional functions
con.execute('OPEN SCHEMA TEST2')
data = con.readData('SELECT * FROM services') # pandas data frame per default
con.writeData(data, table = 'services2')
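Answer 1: (score: 1)
The problem is that, even in pandas 0.14, the read_sql and to_sql functions cannot deal with schemas, but using Exasol without schemas makes no sense. This will be fixed in 0.15. If you want to use it right away, take a look at this pull request: https://github.com/pydata/pandas/pull/7952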