Creating a pandas DataFrame from a database query that uses bind variables

Time: 2013-02-14 21:48:54

Tags: python sql database oracle pandas

I'm working with an Oracle database. I can do this:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb
    conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)

    sqlStr = "SELECT * FROM customers"
    df = psql.frame_query(sqlStr, conn)

But I can't figure out how to handle bind variables, as in this query:

    sqlStr = """SELECT * FROM customers 
                WHERE id BETWEEN :v1 AND :v2
             """

I have tried these variations, none of which work:

    params  = (1234, 5678)
    params2 = {"v1": 1234, "v2": 5678}

    df = psql.frame_query((sqlStr, params), conn)
    df = psql.frame_query((sqlStr, params2), conn)
    df = psql.frame_query(sqlStr, params, conn)
    df = psql.frame_query(sqlStr, params2, conn)

The following works:

    curs = conn.cursor()
    curs.execute(sqlStr, params)
    df = pd.DataFrame(curs.fetchall())
    df.columns = [rec[0] for rec in curs.description]

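But this solution is just... not very elegant. If possible, I'd like to do this without creating a cursor object. Is there a way to do the whole thing with pandas?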

2 answers:

Answer 0 (score: 1)

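As far as I know, pandas expects the SQL string to be fully formed before it is passed in. With that in mind, I would (and always do) use string interpolation: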

    params = (1234, 5678)
    sqlStr = """
    SELECT * FROM customers
    WHERE id BETWEEN %d AND %d
    """ % params
    print(sqlStr)

which gives

    SELECT * FROM customers
    WHERE id BETWEEN 1234 AND 5678

So this should feed into psql.frame_query just fine (based on my experience with postgres, mysql and sql server).
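As a minimal sketch of the interpolation approach above, the query can be built in a small function. The name `build_between_query` is hypothetical and is used here only for illustration. Because `%d` accepts only numbers, passing a string raises a `TypeError` rather than splicing that text into the SQL; this holds for numeric placeholders only and would not apply to `%s`.

```python
def build_between_query(low, high):
    """Return the customers query with both bounds interpolated as integers.

    Hypothetical helper for illustration; not part of pandas or cx_Oracle.
    """
    # %d only formats numbers, so a string argument raises TypeError here
    # instead of ending up inside the SQL text.
    return """
SELECT * FROM customers
WHERE id BETWEEN %d AND %d
""" % (low, high)


sqlStr = build_between_query(1234, 5678)
print(sqlStr)
```

The resulting `sqlStr` is a plain, fully formed string, so it can be passed wherever the answer above passes its interpolated query.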

Answer 1 (score: 1)

Try pandas.io.sql.read_sql_query. I used it with pandas version 0.20.1 and it solved the problem:

    import pandas as pd
    import pandas.io.sql as psql
    import cx_Oracle as odb
    conn = odb.connect(_user +'/'+ _pass +'@'+ _dbenv)

    sqlStr = """SELECT * FROM customers
                WHERE id BETWEEN :v1 AND :v2
    """
    pars = {"v1": 1234, "v2": 5678}
    # read_sql_query forwards params to the driver, which binds :v1 and :v2
    df = psql.read_sql_query(sqlStr, conn, params=pars)
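The same call can be tried without an Oracle instance. The sketch below assumes an in-memory SQLite database as a stand-in for the cx_Oracle connection; the table contents are made up for illustration. SQLite's driver also accepts the `:name` placeholder style, so the query text and the `params` dictionary stay the same as above.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the Oracle connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [
        (1000, "below range"),
        (1234, "lower bound"),
        (3000, "inside range"),
        (9999, "above range"),
    ],
)
conn.commit()

sqlStr = """SELECT * FROM customers
            WHERE id BETWEEN :v1 AND :v2
            ORDER BY id
"""
pars = {"v1": 1234, "v2": 5678}

# params is handed through to the driver's cursor.execute, so the placeholder
# style (:name here) has to be one that the driver itself understands.
df = pd.read_sql_query(sqlStr, conn, params=pars)
print(df)
```

No cursor object appears in user code; pandas creates and reads the cursor internally and takes the column names from the query result.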