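My infrastructure runs in Microsoft Azure. I also have an Ubuntu virtual machine on my local network.
First, I run a query in SQL Server Management Studio, and it completes in under a second.
Then I run the same query from the Ubuntu machine using pymssql, and it takes about 50 seconds!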
import pymssql
import pandas as pd
qtext = '''
IF OBJECT_ID('tempdb..#ids') IS NULL
SELECT [RichTrackId]
,max(PointDate) as max_date
into #ids
FROM [MobileServiceStage].[dbo].[RichTrackPoints] (nolock)
where DeviceToken = 12345
group by [RichTrackId]
select
rtp.[RichTrackId],
max(Latitude) as Latitude,
max(Longitude) as Longitude
from [MobileServiceStage].[dbo].[RichTrackPoints] as rtp (nolock)
inner join #ids
on rtp.DeviceToken = %(user)s and
rtp.[RichTrackId] = #ids.[RichTrackId] and rtp.PointDate = #ids.max_date
group by rtp.[RichTrackId], rtp.PointDate
order by rtp.PointDate
'''
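conn = pymssql.connect(server=server, user=user, password=pas)
# %(user)s in the query is a named placeholder, so a value has to be passed via params
df = pd.read_sql(qtext, conn, params={'user': 12345})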
My profiler shows that the time is spent in pymssql's execute() method:
ncalls tottime percall cumtime percall filename:lineno(function)
1 51.896 51.896 51.896 51.896 {method 'execute' of 'pymssql.Cursor' objects}
How is this possible? I don't think the query text is the bottleneck: the query was optimized and tested in SQL Server Management Studio.
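For reference, a profile like the one above can be collected roughly as follows. This is only a minimal sketch using the standard-library cProfile module: `profile_call` and `slow_stub` are hypothetical names introduced for illustration, and the stub stands in for the real `pd.read_sql(qtext, conn)` call so the snippet runs without a database.

```python
import cProfile
import io
import pstats
import time


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    # Show only the five most expensive entries by own time (tottime).
    pstats.Stats(profiler, stream=buf).sort_stats("tottime").print_stats(5)
    return result, buf.getvalue()


def slow_stub():
    # Hypothetical stand-in for the real call, e.g. pd.read_sql(qtext, conn).
    time.sleep(0.05)
    return "done"


result, report = profile_call(slow_stub)
print(report)
```

Replacing `slow_stub` with a small function that wraps the real query should give a report in the same ncalls/tottime/cumtime layout, which makes it easy to see whether the time goes to `execute` or to fetching and building the DataFrame.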