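I have an SQL query that selects from two inner-joined tables. Executing the select statement takes about 50 seconds. However, fetchall() takes 788 seconds, and it only fetches 981 results. Here is the query and the fetchall code: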
time0 = time.time()
self.cursor.execute("SELECT spectrum_id, feature_table_id "+
"FROM spectrum AS s "+
"INNER JOIN feature AS f "+
"ON f.msrun_msrun_id = s.msrun_msrun_id "+
"INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax "+
"FROM convexhull GROUP BY feature_feature_table_id) AS t "+
"ON t.feature_feature_table_id = f.feature_table_id "+
"WHERE s.msrun_msrun_id = ? "+
"AND s.scan_start_time >= t.rtMin "+
"AND s.scan_start_time <= t.rtMax "+
"AND base_peak_mz >= t.mzMin "+
"AND base_peak_mz <= t.mzMax", spectrumFeature_InputValues)
print 'query took:',time.time()-time0,'seconds'
time0 = time.time()
spectrumAndFeature_ids = self.cursor.fetchall()
print time.time()-time0,'seconds since to fetchall'
Why does fetchall() take so long?
Doing this:
while 1:
    info = self.cursor.fetchone()
    if info:
        <do something>
    else:
        break
is just as slow as:
allInfo = self.cursor.fetchall()
for info in allInfo:
    <do something>
Answer 0: (score: 3)
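By default, fetchall() is as slow as looping over fetchone(), because the arraysize of the Cursor object is set to 1.
To speed things up you can loop over fetchmany(), but to see a performance gain you need to pass it a size parameter greater than 1; otherwise it fetches in batches of arraysize, i.e. 1.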
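The fetchmany() loop described above could look roughly like the sketch below. The helper name iter_rows, the batch size of 100, and the in-memory demo table are made up for illustration; they are not from the original question, and whether batching actually shortens the fetch time for the query above is something to measure rather than assume.

```python
import sqlite3


def iter_rows(cursor, batch_size=100):
    """Yield rows from an already-executed cursor, batch_size rows per fetchmany() call.

    iter_rows and batch_size are hypothetical names used for this sketch.
    """
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty list means the result set is exhausted.
            break
        for row in rows:
            yield row


# Hypothetical in-memory table standing in for the real spectrum/feature data.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, value TEXT)")
cur.executemany("INSERT INTO demo (value) VALUES (?)",
                [("row %d" % i,) for i in range(981)])

cur.execute("SELECT id, value FROM demo ORDER BY id")
rows = list(iter_rows(cur, batch_size=100))
print(len(rows))
```

Wrapping the loop in a generator keeps the calling code the same as the for loop over fetchall() in the question, so the two variants are easy to time against each other.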
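It is quite possible that you can get the same performance gain simply by raising the value of arraysize, but I have no experience with this, so you may want to try that first, for example by assigning a larger value to the cursor's arraysize attribute before fetching.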
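A minimal sketch of experimenting with arraysize, assuming a throwaway in-memory database; the table name demo and the value 50 are arbitrary choices for illustration, not recommendations. It only shows that fetchmany() without a size argument follows arraysize; it does not by itself demonstrate a speedup.

```python
import sqlite3

# Hypothetical in-memory database used only to show the attribute.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

default_size = cur.arraysize   # sqlite3 cursors start with arraysize == 1
cur.arraysize = 50             # arbitrary larger value for the experiment

cur.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY)")
cur.executemany("INSERT INTO demo (id) VALUES (?)",
                [(i,) for i in range(120)])

cur.execute("SELECT id FROM demo ORDER BY id")
batch = cur.fetchmany()        # no size given, so arraysize rows are returned
print(len(batch))
```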
More about the above here: http://docs.python.org/library/sqlite3.html#sqlite3.Cursor.fetchmany