python pandas with to_sql(), SQLAlchemy and schema in exasol

Time: 2014-08-04 10:01:14

Tags: python sql pandas sqlalchemy

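I'm trying to upload a pandas DataFrame to an SQL table. The pandas to_sql function seems to me the best solution for larger DataFrames, but I can't get it to work. I can easily extract data, but I get an error message when trying to write it to a new table: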

import pyodbc
import pandas as pd

# connect to Exasol DB
exaString = 'DSN=exa'
conDB = pyodbc.connect(exaString)

# get some data from somewhere, works without error
sqlString = "SELECT * FROM SOMETABLE"
data = pd.read_sql(sqlString, conDB)

# now upload this data to a new table
data.to_sql('MYTABLENAME', conDB, flavor='mysql')

conDB.close()

The error message I get is:

pyodbc.ProgrammingError: ('42000', "[42000] [EXASOL][EXASolution driver]syntax error, unexpected identifier_chain2, expecting assignment_operator or ':' [line 1, column 6] (-1) (SQLExecDirectW)")

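Unfortunately, I have no idea what the query that caused this syntax error looks like, or what else is wrong. Can someone please point me in the right direction?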
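
One way to see what pandas actually sends is to record the statements at the driver level. The sketch below is only an illustration under stated assumptions: it uses an in-memory sqlite3 connection as a stand-in for the Exasol connection (sqlite3 offers set_trace_callback; pyodbc may not offer an equivalent), and the sample DataFrame is made up. The flavor argument from the question is omitted because current pandas releases no longer accept it.

```python
import sqlite3

import pandas as pd

# Collect every SQL statement the connection executes.
statements = []

# ASSUMPTION: sqlite3 stands in for the Exasol/pyodbc connection.
con = sqlite3.connect(":memory:")
con.set_trace_callback(statements.append)

# Made-up sample data for illustration only.
data = pd.DataFrame({"A": [1, 2], "B": ["x", "y"]})

# pandas checks for the table, creates it, then inserts the rows.
data.to_sql("MYTABLENAME", con, index=False)

for stmt in statements:
    print(stmt)
```

The recorded statements show the CREATE TABLE and INSERT text that pandas generates for sqlite; against another database the generated text may differ, so this only indicates where to look.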

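(Second) edit:

Following the suggestions from Humayun and Joris, I now use pandas version 0.14 and SQLAlchemy in combination with the Exasol dialect (?). Since I'm connecting to a defined schema, I use the meta data option, but the program crashes with "Bus error (core dumped)".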

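import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from pandas.io import sql

engine = create_engine('exa+pyodbc://uid:passwd@exa/mySchemaName', echo=True)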

# get some data
sqlString = "SELECT * FROM SOMETABLE"    # SOMETABLE is a view in mySchemaName 
df = pd.read_sql(sqlString, con=engine)  # works

print(engine.has_table('MYTABLENAME'))   # MYTABLENAME is a view in mySchemaName
# prints "True"

# upload it to a new table
meta = sqlalchemy.MetaData(engine, schema='mySchemaName')
meta.reflect(engine, schema='mySchemaName')
pdsql = sql.PandasSQLAlchemy(engine, meta=meta)
pdsql.to_sql(df, 'MYTABLENAME')

I'm not sure about setting "mySchemaName" in create_engine(..), but the outcome is the same either way.

2 Answers:

Answer 0: (score: 1)

pandas doesn't support the EXASOL syntax out of the box, so it needs to be adjusted a little. Here is a working example of your code without SQLAlchemy:

import pyodbc
import pandas as pd

con = pyodbc.connect('DSN=EXA')
con.execute('OPEN SCHEMA TEST2')

# configure pandas to understand EXASOL as mysql flavor
pd.io.sql._SQL_TYPES['int']['mysql'] = 'INT'
pd.io.sql._SQL_SYMB['mysql']['br_l'] = ''
pd.io.sql._SQL_SYMB['mysql']['br_r'] = ''
pd.io.sql._SQL_SYMB['mysql']['wld'] = '?'
pd.io.sql.PandasSQLLegacy.has_table = \
    lambda self, name: name.upper() in [t[0].upper() for t in con.execute('SELECT table_name FROM cat').fetchall()]

data = pd.read_sql('SELECT * FROM services', con)
data.to_sql('SERVICES2', con, flavor = 'mysql', index = False)

If you use the EXASolution Python package, the code looks like this:

import exasol
con = exasol.connect(dsn='EXA') # normal pyodbc connection with additional functions
con.execute('OPEN SCHEMA TEST2')

data = con.readData('SELECT * FROM services') # pandas data frame per default
con.writeData(data, table = 'services2')
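
A different route that avoids patching private pandas internals is to build the INSERT statement by hand and pass the rows to executemany. The following is a minimal sketch, not the method from this answer: insert_frame is a hypothetical helper, sqlite3 stands in for the pyodbc connection (both use ? placeholders), and the table definition and sample data are invented for illustration.

```python
import sqlite3

import pandas as pd


def insert_frame(con, table, frame):
    """Insert all rows of `frame` into an existing `table` (hypothetical helper).

    Table and column names are interpolated unquoted, so they must come
    from a trusted source; only the values go through ? placeholders.
    """
    columns = ", ".join(frame.columns)
    placeholders = ", ".join("?" for _ in frame.columns)
    sql = "INSERT INTO {} ({}) VALUES ({})".format(table, columns, placeholders)

    # itertuples yields plain tuples of Python scalars, one per row.
    rows = list(frame.itertuples(index=False, name=None))

    cur = con.cursor()
    cur.executemany(sql, rows)
    con.commit()
    return len(rows)


# ASSUMPTION: sqlite3 replaces pyodbc.connect('DSN=EXA') so the sketch runs anywhere.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SERVICES2 (ID INTEGER, NAME VARCHAR(50))")

data = pd.DataFrame({"ID": [1, 2, 3], "NAME": ["a", "b", "c"]})
inserted = insert_frame(con, "SERVICES2", data)
```

With this approach the target table has to exist already, so the CREATE TABLE statement can be written in whatever dialect the database expects.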

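Answer 1: (score: 1)

The problem is that in pandas 0.14 the read_sql and to_sql functions still cannot deal with schemas, but using exasol without schemas makes no sense. This will be fixed in 0.15. If you want to use it now, look at this pull request: https://github.com/pydata/pandas/pull/7952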
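
From pandas 0.15 on, to_sql and read_sql_table take a schema argument. A minimal sketch, assuming SQLAlchemy is installed: since no Exasol server is available here, an in-memory SQLite database attached under the hypothetical name my_schema plays the role of the schema, and the sample DataFrame is made up.

```python
import pandas as pd
from sqlalchemy import create_engine, event
from sqlalchemy.pool import StaticPool

# ASSUMPTION: SQLite stands in for Exasol; StaticPool keeps one shared
# connection so the in-memory data survives between calls.
engine = create_engine("sqlite://", poolclass=StaticPool)


@event.listens_for(engine, "connect")
def attach_schema(dbapi_connection, connection_record):
    # An attached database behaves like a schema named my_schema (hypothetical name).
    dbapi_connection.execute("ATTACH DATABASE ':memory:' AS my_schema")


# Made-up sample data for illustration only.
df = pd.DataFrame({"ID": [1, 2], "NAME": ["a", "b"]})

# Write into and read back from the named schema.
df.to_sql("MYTABLENAME", engine, schema="my_schema", index=False)
result = pd.read_sql_table("MYTABLENAME", engine, schema="my_schema")
```

Against Exasol the engine URL from the question would replace the SQLite one; whether the dialect then behaves the same way is not something this sketch can confirm.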