Question

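I have an sqlite3 database table with roughly 250,000 rows, and my code is written in Python. I need to filter it in a very specific scenario, and that takes far too long.

The table looks like this: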

self.cur.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY, 
                                    CLCode INT, 
                                    DetectionTime INT,
                                    PlateNo VARCHAR)""")

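It is a table of filtered automatic number plate recognition results. I need to filter it to get the following (in a natural-language, SQL-like statement :)):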

Get rows from table DetectedVehicles where vehicles were observed at 
CLCode="X" before they were observed at CLCode="Y". 
(implicit: they were observed at both of them)

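So I need the list of detected vehicles that passed the given CLCodes in the correct order, i.e. X before Y.

I managed to put together something that works, but the query takes about 10 seconds. Is there a faster way?

The code is here: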

self.cur.execute('select distinct PlateNo from DetectedVehicles where CLCode=? intersect select PlateNo from DetectedVehicles where CLCode=?',(CountLocationNo[0],CountLocationNo[1]))
PlatesTab=list(self.cur)
Results=[]
for Plate in PlatesTab:
    PlateQ1='select * from DetectedVehicles where PlateNo in (?) and ((select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?) ) <  (select DetectionTime from DetectedVehicles where CLCode = ? and PlateNo in (?)))'
    R=list(self.cur.execute(PlateQ1,(Plate,CountLocationNo[0],Plate,CountLocationNo[1],Plate)))
    if R:
        TimesOD=self.curST2.execute('select DetectionTime from DetectedVehicles where PlateNo in (?) and (CLCode= ? or CLCode=?)',(Plate,CountLocationNo[0],CountLocationNo[1])).fetchall()
        if TimesOD:
            TravelTimes.append(TimesOD[1][0]-TimesOD[0][0])
            DetectionTimes.append(TimesOD[0][0])
        for i in R:
            Results.append(i[0])
Results=tuple(Results)
QueryCL=' intersect select * from DetectedVehicles where IDd in ' + str(Results)

Thanks in advance.

Answer 1

You can do all of this in a single query.

select 
    dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from 
    DetectedVehicles dvPoint1 
    inner join DetectedVehicles dvPoint2
        on dvPoint1.PlateNo = dvPoint2.PlateNo
        and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
        and dvPoint1.DetectionTime < dvPoint2.DetectionTime
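
As a rough illustration of how this single query could replace the per-plate loop in the question, here is a minimal Python sketch. The function name `ordered_passages`, the in-memory database, and the sample rows are made up for this example, and the `order by` clause is only added so the output order is predictable.

```python
import sqlite3


def ordered_passages(conn, first_code, second_code):
    """Return (PlateNo, time at first_code, time at second_code) for every
    pair of detections where the plate was seen at first_code before second_code."""
    sql = """
        select
            dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
        from
            DetectedVehicles dvPoint1
            inner join DetectedVehicles dvPoint2
                on dvPoint1.PlateNo = dvPoint2.PlateNo
                and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
                and dvPoint1.DetectionTime < dvPoint2.DetectionTime
        order by dvPoint1.PlateNo
    """
    return conn.execute(sql, (first_code, second_code)).fetchall()


# Hypothetical sample data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                    CLCode INT,
                                    DetectionTime INT,
                                    PlateNo VARCHAR)""")
conn.executemany(
    "insert into DetectedVehicles(CLCode, DetectionTime, PlateNo) values (?, ?, ?)",
    [
        (1, 100, "AAA"), (2, 160, "AAA"),  # seen at 1, then at 2
        (2, 100, "BBB"), (1, 150, "BBB"),  # seen in the wrong order
        (1, 120, "CCC"),                   # never seen at 2
    ],
)

rows = ordered_passages(conn, 1, 2)
# Travel time per matched pair, as computed by hand in the question's loop.
travel_times = [second - first for _, first, second in rows]
print(rows, travel_times)
```

Because both detection times come back in the same row, the travel time can be computed without a second query per plate.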

You need an index on (PlateNo, DetectionTime, CLCode) or on (CLCode, PlateNo). Try both and see which is faster. An index on PlateNo on its own may also be enough.
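
One possible way to try the two indexes from Python and see what the query planner does with them is sketched below. The index names `idx_plate_time_code` and `idx_code_plate` are invented for this example, and the table here is empty, so the plan it prints may differ from what you get on the real 250,000-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                    CLCode INT,
                                    DetectionTime INT,
                                    PlateNo VARCHAR)""")

# Index names are hypothetical; only the column lists come from the answer.
conn.execute("create index idx_plate_time_code "
             "on DetectedVehicles(PlateNo, DetectionTime, CLCode)")
conn.execute("create index idx_code_plate "
             "on DetectedVehicles(CLCode, PlateNo)")

query = """
    select dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
    from DetectedVehicles dvPoint1
    inner join DetectedVehicles dvPoint2
        on dvPoint1.PlateNo = dvPoint2.PlateNo
        and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
        and dvPoint1.DetectionTime < dvPoint2.DetectionTime
"""

# EXPLAIN QUERY PLAN reports which index (if any) SQLite intends to use.
plan = conn.execute("explain query plan " + query, (1, 2)).fetchall()
for step in plan:
    print(step[-1])  # the last column holds the human-readable description

# List the indexes that now exist on the table (column 1 is the index name).
index_names = [row[1] for row in conn.execute("pragma index_list('DetectedVehicles')")]
print(index_names)
```

To compare the candidates fairly, drop one index, time the query on the real data, then repeat with the other.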

Answer 2

Try:

select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
  on x.PlateNo = y.PlateNo and 
     x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?

Or:

select x.*
from DetectedVehicles x
where exists
(select 1
 from DetectedVehicles y
 where x.PlateNo = y.PlateNo and 
       x.DetectionTime < y.DetectionTime and
       x.CLCode=? and y.CLCode=?)

I would normally expect the latter query to run faster, but it is worth checking both.
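
A minimal sketch of how the two variants could be checked against each other from Python follows. The helper `timed`, the random test data, and the table size of 2,000 rows are assumptions made for this example, so the timings it prints say nothing about the real table.

```python
import random
import sqlite3
import time

JOIN_SQL = """
select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
  on x.PlateNo = y.PlateNo and
     x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?
"""

EXISTS_SQL = """
select x.*
from DetectedVehicles x
where exists
(select 1
 from DetectedVehicles y
 where x.PlateNo = y.PlateNo and
       x.DetectionTime < y.DetectionTime and
       x.CLCode=? and y.CLCode=?)
"""


def timed(conn, sql, params):
    """Run a query and return (rows, seconds in execute, seconds in fetchall)."""
    start = time.perf_counter()
    cur = conn.execute(sql, params)
    after_execute = time.perf_counter()
    rows = cur.fetchall()
    after_fetch = time.perf_counter()
    return rows, after_execute - start, after_fetch - after_execute


# Hypothetical random data: 2,000 detections of 300 plates at 5 locations.
random.seed(0)
conn = sqlite3.connect(":memory:")
conn.execute("""create table DetectedVehicles(IdD INTEGER PRIMARY KEY,
                                    CLCode INT,
                                    DetectionTime INT,
                                    PlateNo VARCHAR)""")
conn.executemany(
    "insert into DetectedVehicles(CLCode, DetectionTime, PlateNo) values (?, ?, ?)",
    [(random.randint(1, 5), random.randint(0, 10000), "P%03d" % random.randint(0, 299))
     for _ in range(2000)],
)

join_rows, join_query_s, join_fetch_s = timed(conn, JOIN_SQL, (1, 2))
exists_rows, exists_query_s, exists_fetch_s = timed(conn, EXISTS_SQL, (1, 2))

print("join:   query %.4fs, fetchall %.4fs" % (join_query_s, join_fetch_s))
print("exists: query %.4fs, fetchall %.4fs" % (exists_query_s, exists_fetch_s))
```

Both queries should return the same set of rows, so comparing the sorted results is a cheap sanity check before trusting either timing.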

Answer 3

Thank you for the feedback. I am posting this as an answer, together with the timing results:

1. Fastest overall (query 1.80s, fetchall 0.20s, total: 2s)

select distinct x.*
from DetectedVehicles x
join DetectedVehicles y
  on x.PlateNo = y.PlateNo and 
     x.DetectionTime < y.DetectionTime
where x.CLCode=? and y.CLCode=?

2. (query 1.83s, fetchall 0.19s, total: 2.02s)

select 
    dvPoint1.PlateNo, dvPoint1.DetectionTime, dvPoint2.DetectionTime
from 
    DetectedVehicles dvPoint1 
    inner join DetectedVehicles dvPoint2
        on dvPoint1.PlateNo = dvPoint2.PlateNo
        and dvPoint1.CLCode = ? and dvPoint2.CLCode = ?
        and dvPoint1.DetectionTime < dvPoint2.DetectionTime

3. (query 1.82s, fetchall 1.09s, total: 2.91s)

select x.*
from DetectedVehicles x
where exists
(select 1
 from DetectedVehicles y
 where x.PlateNo = y.PlateNo and 
       x.DetectionTime < y.DetectionTime and
       x.CLCode=? and y.CLCode=?)

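Many thanks to @Mark Bannister for the answer; I will accept it.

However, one problem remains: cur.fetchall() takes a very long time, and I do need the results. What should I do? (It takes roughly 2 minutes per 100 rows with each solution.) Solved: download a new sqlite.dll into your python/dlls folder... don't ask me why: Join with Pythons sqlite module is slower than doing it manually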
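
Since the fix here was a newer SQLite library rather than a change to the query, it may help to check which SQLite version Python's sqlite3 module is actually using before and after swapping the DLL. A minimal sketch, assuming only the standard sqlite3 module:

```python
import sqlite3

# Version of the SQLite library that the sqlite3 module is linked against.
print("SQLite library version:", sqlite3.sqlite_version)
print("As a tuple:", sqlite3.sqlite_version_info)

# The same value as reported by the database engine itself.
conn = sqlite3.connect(":memory:")
version_from_sql = conn.execute("select sqlite_version()").fetchone()[0]
print("select sqlite_version():", version_from_sql)
```

If the reported version does not change after replacing the file, Python is most likely still loading the old library.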

SQLite3 query takes a lot of time; how would you do it?