python - Optimizing queries with pandasql

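My business requirements call for retrieving data with pandasql. My code runs about 4 queries, and the base data size is 2,000,000.

The queries in my code all look like the one below. Note that the variable names shown here are dummies, but the syntax is the same.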

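import pandasql as pdsql

# student_data is a pandas DataFrame visible in globals()
str1 = """select distinct class,year,section,student_name from student_data where class=%d and year='%s'"""
# `class` is a reserved word in Python, so the dummy variable is named class_id here
str2 = str1 % (class_id, year)
# helper that runs a SQL string against the DataFrames in the global namespace
pysql = lambda q: pdsql.sqldf(q, globals())
df1 = pysql(str2)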

Currently the code takes 5 minutes 30 seconds to execute this logic. Is there a way to optimize it with pandasql in Python 3.x?
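For comparison, here is a minimal sketch of the same filter written directly in pandas, which could serve as a timing baseline against the pandasql version. It assumes student_data is a pandas DataFrame with the columns named in the query. The function name distinct_students, the class_id parameter, and the sample rows are made up for illustration.

```python
import pandas as pd

def distinct_students(student_data, class_id, year):
    """Pandas counterpart of:
    select distinct class,year,section,student_name
    from student_data where class=<class_id> and year='<year>'
    """
    # Boolean mask plays the role of the WHERE clause
    mask = (student_data["class"] == class_id) & (student_data["year"] == year)
    cols = ["class", "year", "section", "student_name"]
    # drop_duplicates plays the role of SELECT DISTINCT
    return student_data.loc[mask, cols].drop_duplicates().reset_index(drop=True)

# Tiny made-up frame, for illustration only
student_data = pd.DataFrame({
    "class": [5, 5, 5, 6],
    "year": ["2018", "2018", "2017", "2018"],
    "section": ["A", "A", "B", "A"],
    "student_name": ["Ann", "Ann", "Bob", "Cy"],
    "marks": [90, 90, 75, 60],
})

df1 = distinct_students(student_data, 5, "2018")
```

This sidesteps pandasql entirely, so it only applies if the SQL syntax itself is not a hard requirement; timing it on the real data would show how much of the 5 minutes 30 seconds comes from the query layer rather than from the filter.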

0 answers: