Why does pandas.read_sql return an empty DataFrame?

Time: 2017-10-24 15:13:33

Tags: python-3.x pandas pandas-datareader

I am trying to retrieve data from a database and store it in a pandas.DataFrame. Here is my Python script:

conn = pyodbc.connect(sql_server)
query = '''SELECT a1, a2, a3
FROM '''  + dbschema + '''.SomeResults
WHERE FactorName = \' ''' + FactorName + ''' \' AND parametername = 'Param1' ORDER BY Factor1 '''
df = pd.read_sql(query, conn)
print(df)

However, it returns:

Empty DataFrame
Columns: [a1, a2, a3]
Index: []

I am fairly sure this is not a SQL problem, because I can retrieve the data from the database using conn.cursor().

1 answer:

Answer 0 (score: 2)

The cause is the way the SQL string is built:

In [307]: dbschema = 'db'

In [308]: FactorName = 'Factor1'

In [309]: query = '''SELECT a1, a2, a3
     ...: FROM '''  + dbschema + '''.SomeResults
     ...: WHERE FactorName = \' ''' + FactorName + ''' \' AND parametername = 'Param1' ORDER BY Factor1 '''

In [310]: print(query)
SELECT a1, a2, a3
FROM db.SomeResults
WHERE FactorName = ' Factor1 ' AND parametername = 'Param1' ORDER BY Factor1

# NOTE: spaces      ^       ^
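
The effect of the stray spaces can be reproduced without the original database. Below is a minimal sketch that assumes an in-memory SQLite table as a stand-in for `db.SomeResults`; the table contents and the `build_query` helper are made up for illustration. SQLite compares the strings exactly, so the padded literal matches nothing.

```python
import sqlite3

def build_query(factor_name, padded):
    # Hypothetical helper, for demonstration only.
    # padded=True mimics the original concatenation, which leaves a space
    # on each side of the value inside the quotes.
    literal = f"' {factor_name} '" if padded else f"'{factor_name}'"
    return f"SELECT a1 FROM SomeResults WHERE FactorName = {literal}"

# Assumed stand-in for the real table: one row whose FactorName is 'Factor1'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeResults (a1 INTEGER, FactorName TEXT)")
conn.execute("INSERT INTO SomeResults VALUES (1, 'Factor1')")

rows_padded = conn.execute(build_query("Factor1", padded=True)).fetchall()
rows_exact = conn.execute(build_query("Factor1", padded=False)).fetchall()

print(repr(build_query("Factor1", padded=True)))  # repr makes the spaces visible
print(rows_padded)  # the padded literal does not equal the stored value
print(rows_exact)
```

Printing the query with `repr` before running it is a quick way to spot this kind of whitespace problem.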

You should not build SQL this way, because it can be dangerous (read about SQL injection).

This would be a proper way to do it:

query = """
SELECT a1, a2, a3
FROM {}.SomeResults
WHERE FactorName = ? AND parametername = 'Param1'
ORDER BY Factor1
"""

df = pd.read_sql(query.format(dbschema), conn, params=(FactorName,))
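
If no SQL Server instance is at hand, the same pattern can be tried against SQLite, whose `sqlite3` driver also accepts `?` placeholders. This is only a sketch: the in-memory table, its rows, and the missing schema prefix are assumptions for illustration, and a plain cursor call stands in for `pd.read_sql`.

```python
import sqlite3

# Assumed stand-in table with the columns used in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeResults "
    "(a1, a2, a3, FactorName TEXT, parametername TEXT, Factor1)"
)
conn.executemany(
    "INSERT INTO SomeResults VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 2, 3, "Factor1", "Param1", 10),
        (4, 5, 6, "Factor2", "Param1", 20),
    ],
)

query = """
SELECT a1, a2, a3
FROM SomeResults
WHERE FactorName = ? AND parametername = 'Param1'
ORDER BY Factor1
"""

factor_name = "Factor1"
# The driver binds the value itself, so no quotes or padding are added by hand.
rows = conn.execute(query, (factor_name,)).fetchall()
print(rows)
```

Because the value is passed separately from the SQL text, there is no place for extra spaces or quotes to creep in.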

Note: only literals can be parameterized, i.e. we cannot parameterize schema names, table names, column names, etc.
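
Since the schema name still has to be formatted into the query text, one possible precaution is to validate it first. The sketch below is an assumption of mine, not part of pandas or pyodbc: `checked_identifier` is a hypothetical helper, and its pattern accepts only plain identifiers (letters, digits, underscores), which may be stricter than your database requires.

```python
import re

# Hypothetical allowlist pattern: letters, digits and underscores,
# not starting with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def checked_identifier(name):
    """Return name unchanged if it looks like a plain SQL identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return name

query_template = "SELECT a1, a2, a3 FROM {}.SomeResults WHERE FactorName = ?"

# Only the validated schema name is formatted in; the value stays a parameter.
query = query_template.format(checked_identifier("db"))
print(query)
```

Comparing the name against a fixed list of known schemas would be stricter still, if the set of schemas is known in advance.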

Here is an amusing example of SQL injection:

[image]
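
To make the risk concrete, here is a small sketch of an injection against a concatenated query. It assumes an in-memory SQLite table and a made-up malicious input; neither comes from the original question.

```python
import sqlite3

# Assumed stand-in table with two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeResults (a1 INTEGER, FactorName TEXT)")
conn.executemany(
    "INSERT INTO SomeResults VALUES (?, ?)",
    [(1, "Factor1"), (2, "Factor2")],
)

# Hypothetical malicious input: the quote closes the string literal early.
user_input = "nothing' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the WHERE clause.
unsafe_query = "SELECT a1 FROM SomeResults WHERE FactorName = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safe: the input is bound as a value and compared literally.
safe_rows = conn.execute(
    "SELECT a1 FROM SomeResults WHERE FactorName = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # every row matches because of the injected OR
print(safe_rows)         # no row has that exact FactorName
```

With concatenation the injected `OR '1'='1'` makes the filter always true, while the parameterized version treats the whole input as an ordinary string.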