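I have a sqlite3 database table with about 250,000 rows, and my code is written in Python. I need to filter it in a very specific way, and that takes a long time.
The table looks like this: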
self.cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
CLCode INT,
DetectionTime INT,
PlateNo VARCHAR)""")
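It is a table of filtered Automatic Number Plate Recognition results. I need to filter it to get the following (as a pseudo-SQL statement :)):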
Get rows from table DetectedVehicles where vehicles were observed at
CLCode="X" before they were observed at CLCode="Y".
(implicit: they were observed at both of them)
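So I need a list of the detected vehicles that passed the given CLCodes in the right order, i.e. X before Y.
I managed to put together something that works, but the query takes about 10 seconds. Is there a faster way?
Here is the code: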
self.cur.execute('select distinct PlateNo from DetectedVehicles where CLCode=? intersect select PlateNo from DetectedVehicles where CLCode=?',(CountLocationNo[0],CountLocationNo[1]))
PlatesTab=list(self.cur)
Results=[]
for Plate in PlatesTab:
    PlateQ1='select * from DetectedVehicles where PlateNo in (?) and ((select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?) ) < (select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?)))'
    R=list(self.cur.execute(PlateQ1,(Plate,CountLocationNo[0],Plate,CountLocationNo[1],Plate)))
    if R:
        TimesOD=self.curST2.execute('select DetectionTime from DetectedVehicles where PlateNo in (?) and (CLCode= ? or CLCode=?)',(Plate,CountLocationNo[0],CountLocationNo[1])).fetchall()
        if TimesOD:
            TravelTimes.append(TimesOD[1][0]-TimesOD[0][0])
            DetectionTimes.append(TimesOD[0][0])
        for i in R:
            Results.append(i[0])
Results=tuple(Results)
QueryCL=' intersect select * from DetectedVehicles where IDd in ' + str(Results)
Thanks in advance.
Answer 0 (score: 2)
You can do it all in a single query.
select
dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from
DetectedVehicles dvPoint1
inner join DetectedVehicles dvPoint2
on dvPoint1.PlateNo = dvPoint2.PlateNo
and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
and dvPoint1.DetectionTime < dvPoint2.DetectionTime
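A minimal sketch of how this single query could replace the per-plate loop from the question. It assumes an in-memory database, a handful of made-up sample rows, and CLCode values 1 and 2 standing in for X and Y; none of these come from the original post. The list names follow the question's code.

```python
import sqlite3

# In-memory database with the question's schema and a few made-up rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                             CLCode INT,
                                             DetectionTime INT,
                                             PlateNo VARCHAR)""")
sample_rows = [
    (1, 100, "AAA111"),  # seen at X (CLCode 1) at t=100 ...
    (2, 160, "AAA111"),  # ... then at Y (CLCode 2) at t=160: matches
    (2, 200, "BBB222"),  # seen at Y first ...
    (1, 250, "BBB222"),  # ... then at X: wrong order, excluded
    (1, 300, "CCC333"),  # seen only at X: excluded
]
cur.executemany(
    "insert into DetectedVehicles(CLCode, DetectionTime, PlateNo) values (?, ?, ?)",
    sample_rows,
)

# Self-join: one row per (X detection, later Y detection) pair of the same plate.
query = """
select dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from DetectedVehicles dvPoint1
inner join DetectedVehicles dvPoint2
    on dvPoint1.PlateNo = dvPoint2.PlateNo
    and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
    and dvPoint1.DetectionTime < dvPoint2.DetectionTime
"""
CountLocationNo = (1, 2)  # assumed (X, Y) pair

DetectionTimes = []
TravelTimes = []
for plate, time_x, time_y in cur.execute(query, CountLocationNo):
    DetectionTimes.append(time_x)            # when the vehicle was seen at X
    TravelTimes.append(time_y - time_x)      # elapsed time from X to Y

print(DetectionTimes, TravelTimes)
```

Note that a plate detected several times at X or Y produces one row for every qualifying pair, so real data may need extra handling (for example, keeping only the earliest Y after each X).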
You will need an index on (PlateNo, DetectionTime, CLCode) or on (CLCode, PlateNo). Try both and see which is faster. An index on PlateNo on its own may also do.
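One possible way to try the candidate indexes is to create each in turn and look at what `EXPLAIN QUERY PLAN` reports. This is only a sketch: the index names are hypothetical, the table is empty, and the CLCode values 1 and 2 are placeholders. On real data you would also time the query with each index in place, since the plan alone does not tell you which is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                             CLCode INT,
                                             DetectionTime INT,
                                             PlateNo VARCHAR)""")

query = """
select dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from DetectedVehicles dvPoint1
inner join DetectedVehicles dvPoint2
    on dvPoint1.PlateNo = dvPoint2.PlateNo
    and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
    and dvPoint1.DetectionTime < dvPoint2.DetectionTime
"""

# Hypothetical index names mapped to the column lists suggested in the answer.
candidate_indexes = {
    "idx_plate_time_cl": "(PlateNo, DetectionTime, CLCode)",
    "idx_cl_plate": "(CLCode, PlateNo)",
    "idx_plate": "(PlateNo)",
}

plans = {}
for name, columns in candidate_indexes.items():
    cur.execute("create index {} on DetectedVehicles{}".format(name, columns))
    plan_rows = cur.execute("explain query plan " + query, (1, 2)).fetchall()
    # The last column of each plan row is the human-readable description.
    plans[name] = [row[-1] for row in plan_rows]
    cur.execute("drop index {}".format(name))  # leave a clean slate for the next candidate

for name, details in plans.items():
    print(name)
    for detail in details:
        print("   ", detail)
```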
Answer 1 (score: 1)
Try:
select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
on x.PlateNo = y.PlateNo and
x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?
Or:
select x.*
from DetectedVehicles x
where exists
(select 1
from DetectedVehicles y
where x.PlateNo = y.PlateNo and
x.DetectionTime < y.DetectionTime and
x.CLCode=? and y.CLCode=?)
I would normally expect the latter query to run faster, but it is worth checking both.
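A rough way to check both variants, sketched on synthetic data. The row count, the random plates, the CLCode range, and the index are all assumptions made for illustration; timings on the real table may look quite different.

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                             CLCode INT,
                                             DetectionTime INT,
                                             PlateNo VARCHAR)""")

# Synthetic detections: 20,000 rows, 10 count locations, 5,000 distinct plates.
random.seed(0)
rows = [
    (random.randint(1, 10), random.randint(0, 86400),
     "P{:05d}".format(random.randint(0, 4999)))
    for _ in range(20000)
]
cur.executemany(
    "insert into DetectedVehicles(CLCode, DetectionTime, PlateNo) values (?, ?, ?)",
    rows,
)
cur.execute("create index idx_cl_plate on DetectedVehicles(CLCode, PlateNo)")

join_query = """
select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
    on x.PlateNo = y.PlateNo and
       x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?
"""

exists_query = """
select x.*
from DetectedVehicles x
where exists
    (select 1
     from DetectedVehicles y
     where x.PlateNo = y.PlateNo and
           x.DetectionTime < y.DetectionTime and
           x.CLCode=? and y.CLCode=?)
"""

def timed(sql, params):
    """Run the query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    result = cur.execute(sql, params).fetchall()
    return result, time.perf_counter() - start

join_rows, join_seconds = timed(join_query, (1, 2))
exists_rows, exists_seconds = timed(exists_query, (1, 2))

print("join:   {} rows in {:.4f} s".format(len(join_rows), join_seconds))
print("exists: {} rows in {:.4f} s".format(len(exists_rows), exists_seconds))
```

Both queries should return the same rows from `x`, so comparing the sorted results is a cheap sanity check before trusting the timings.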
Answer 2 (score: 0)
Thanks for the feedback, everyone. I am posting this as an answer to show the timing results:
1. Fastest overall (query 1.80 s, fetchall 0.20 s, total: 2 s)
select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
on x.PlateNo = y.PlateNo and
x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?
2. (query 1.83 s, fetchall 0.19 s, total: 2.02 s)
select
dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from
DetectedVehicles dvPoint1
inner join DetectedVehicles dvPoint2
on dvPoint1.PlateNo = dvPoint2.PlateNo
and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
and dvPoint1.DetectionTime < dvPoint2.DetectionTime
3. (query 1.82 s, fetchall 1.09 s, total: 2.91 s)
select x.*
from DetectedVehicles x
where exists
(select 1
from DetectedVehicles y
where x.PlateNo = y.PlateNo and
x.DetectionTime < y.DetectionTime and
x.CLCode=? and y.CLCode=?)
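Many thanks to @Mark Bannister for the answer; I will accept it.
However, one problem remained: cur.fetchall() took a very long time, and I do need to fetch the results. What should I do? (Each solution needed about 2 minutes per 100 rows.) Problem solved: download a new sqlite.dll into your python/dlls folder... don't ask me why: Join with Pythons sqlite module is slower than doing it manually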
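For anyone hitting the same symptom, a small sketch of two things that may be worth checking: which SQLite library version Python's `sqlite3` module is linked against (`sqlite3.sqlite_version`), and whether reading the results in batches with `fetchmany` behaves any differently from one big `fetchall`. The sample data, the batch size of 100, and the CLCode values are assumptions. This does not reproduce the slowdown described above; it only shows the calls.

```python
import sqlite3

# Version of the underlying SQLite library (not of the Python module itself).
print("SQLite library version:", sqlite3.sqlite_version)

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                             CLCode INT,
                                             DetectionTime INT,
                                             PlateNo VARCHAR)""")

# Made-up data: 250 plates, each seen at X (CLCode 1) and later at Y (CLCode 2).
sample_rows = []
for i in range(250):
    plate = "P{:04d}".format(i)
    sample_rows.append((1, i, plate))
    sample_rows.append((2, i + 1000, plate))
cur.executemany(
    "insert into DetectedVehicles(CLCode, DetectionTime, PlateNo) values (?, ?, ?)",
    sample_rows,
)

query = """
select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
    on x.PlateNo = y.PlateNo and
       x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?
"""

cur.execute(query, (1, 2))
batch_sizes = []
total = 0
while True:
    batch = cur.fetchmany(100)  # read at most 100 rows per call
    if not batch:
        break
    batch_sizes.append(len(batch))
    total += len(batch)

print("fetched", total, "rows in batches of", batch_sizes)
```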