Pandas read_sql with multiple parameters and a list

Time: 2019-09-06 16:51:33

Tags: python sql pandas filter

I have the following script:

now = dt.datetime.now()
date_filter = now - timedelta(days=3)
list_ids = [1,2,3]
dq_connection = mysql.connector.connect(user='user', password='pass', host='localhost', database='db')
engine = create_engine('localhost/db')
cursor = connection.cursor(buffered=True)
query = ('''
SELECT *
FROM (SELECT * FROM myTable1 WHERE id in {%s}
WHERE date >= %s;
''')
df = pd.read_sql_query(query, connection,params=(list_ids,date_filter,))

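I want to apply two filters to the query: 1) match every ID in list_ids, and 2) keep only rows whose date is on or after date_filter.

I can get the second filter to work, but when I try to use the list I get: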

pandas.io.sql.DatabaseError: Execution failed on sql

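What am I doing wrong?

1 answer:

Answer 0: (score: 1)

Because the IN clause takes multiple values, the prepared statement needs one %s placeholder per value, and the list then has to be unpacked into the parameters with func(*list). Also, the subquery is unnecessary: both conditions can go in a single WHERE clause.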

query = '''SELECT * FROM myTable1 
           WHERE id in (%s, %s, %s) AND date >= %s;
        '''

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))
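The difference between binding the whole list to one placeholder and binding one placeholder per value can be checked without a MySQL server. The following is only a sketch under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for MySQL, so the placeholder is ? rather than %s, and the table contents and ISO-formatted date strings are invented for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for myTable1 in MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable1 (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO myTable1 VALUES (?, ?)",
    [(1, "2019-09-01"), (2, "2019-09-05"), (3, "2019-09-06"), (4, "2019-09-06")],
)

list_ids = [1, 2, 3]
date_filter = "2019-09-03"  # illustrative value, stored as ISO text

# A single placeholder for the whole list: the driver cannot bind a list
# to one value, so the call fails.
try:
    conn.execute(
        "SELECT * FROM myTable1 WHERE id IN (?) AND date >= ?",
        (list_ids, date_filter),
    )
    single_placeholder_ok = True
except sqlite3.Error:
    single_placeholder_ok = False

# One placeholder per value, with the list unpacked into the parameters.
rows = conn.execute(
    "SELECT id, date FROM myTable1 "
    "WHERE id IN (?, ?, ?) AND date >= ? ORDER BY id",
    (*list_ids, date_filter),
).fetchall()

print(single_placeholder_ok)
print(rows)
```

The exact exception class and message differ between drivers, which is why the sketch catches the generic sqlite3.Error.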

To generate the placeholders dynamically, one per list element, use str.join:

placeholders = ", ".join(["%s" for _ in list_ids])

query = '''SELECT * FROM myTable1 
           WHERE id in ({}) AND date >= %s;
        '''.format(placeholders)

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))
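If the same pattern is needed in several places, the placeholder logic could be wrapped in a small function. This is a minimal sketch, not part of pandas or mysql.connector: build_filter_query and its placeholder argument are hypothetical names introduced here. It also guards against an empty list, which would otherwise produce IN (), a syntax error in MySQL.

```python
import datetime as dt


def build_filter_query(ids, date_filter, placeholder="%s"):
    """Return (query, params) for an IN filter combined with a date filter.

    Hypothetical helper for illustration; `placeholder` should match the
    paramstyle of the database driver (%s for mysql.connector).
    """
    ids = list(ids)
    if not ids:
        # An empty list would render as "IN ()", which MySQL rejects.
        raise ValueError("ids must contain at least one value")
    # One placeholder per ID, e.g. "%s, %s, %s" for three IDs.
    placeholders = ", ".join([placeholder] * len(ids))
    query = (
        "SELECT * FROM myTable1 "
        "WHERE id IN ({}) AND date >= {}".format(placeholders, placeholder)
    )
    # The IDs come first, then the date, matching the placeholder order.
    params = (*ids, date_filter)
    return query, params


query, params = build_filter_query([1, 2, 3], dt.datetime(2019, 9, 3))
print(query)
print(params)
```

The returned pair can then be passed on as pd.read_sql_query(query, connection, params=params). Only the placeholders are formatted into the SQL string; the values themselves still travel as bound parameters.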