query.next()慢吗?

时间:2015-11-09 10:20:35

标签: python performance loops iterator pyqt

我使用PyQt作为GUI软件。我还使用sqlite数据库为数据提供软件。

在我的代码的某处,我有这个方法:

def loadNotifications(self):

    """Method to find the number of unread articles,
    for each search. Load a list of id, for the unread articles,
    in each table. And a list of id, for the concerned articles, for
    each table"""

    count_query = QtSql.QSqlQuery(self.bdd)
    count_query.setForwardOnly(True)

    # Don't treat the articles if it's the main tab, it's
    # useless because the article will be concerned for sure
    for table in self.list_tables_in_tabs[1:]:

        # Empty these lists, because when loadNotifications is called
        # several times during the use, the nbr of unread articles is
        # added to the nbr of notifications
        table.list_new_ids = []
        table.list_id_articles = []

        # Try to speed things up
        append_new = table.list_new_ids.append
        append_articles = table.list_id_articles.append

        req_str = self.refineBaseQuery(table.base_query, table.topic_entries, table.author_entries)
        print(req_str)
        count_query.exec_(req_str)

        start_time = datetime.datetime.now()
        i = 0

        while count_query.next():
            i += 1
            record = count_query.record()

            append_articles(record.value('id'))

            if record.value('new') == 1:
                append_new(record.value('id'))

        print(datetime.datetime.now() - start_time)
        print("Nbr of entries processed: {}".format(i))

让我们假设这个循环有大约400个要处理的条目。它需要大约一秒钟,我认为它太长了。我试图尽可能地优化这个过程,但它仍然需要花费太多时间。

以下是以前的方法通常打印的内容:

SELECT * FROM papers WHERE id IN(320, 1320, 5648, 17589, 20092, 20990, 49439, 58378, 65251, 68772, 73509, 86859, 90594)
0:00:00.001403
Nbr of entries processed: 13
SELECT * FROM papers WHERE topic_simple LIKE '% 3D print%'
0:00:00.591745
Nbr of entries processed: 81
SELECT * FROM papers WHERE id IN (5648, 11903, 14258, 30587, 40339, 55691, 57383, 58378, 62951, 65251, 68772, 87295)
0:00:00.000478
Nbr of entries processed: 12
SELECT * FROM papers WHERE topic_simple LIKE '% Python %'
0:00:00.596490
Nbr of entries processed: 9
SELECT * FROM papers WHERE topic_simple LIKE '% Raspberry pi %' OR topic_simple LIKE '% arduino %'
0:00:00.988276
Nbr of entries processed: 5
SELECT * FROM papers WHERE topic_simple LIKE '% sensor array%' OR topic_simple LIKE '% biosensor %'
0:00:00.996164
Nbr of entries processed: 433
SELECT * FROM papers WHERE id IN (320, 540, 1320, 1434, 1860, 4527, 5989, 6022, 6725, 6978, 7268, 8625, 9410, 9814, 9850, 10608, 13219, 15572, 15794, 19345, 19674, 19899, 20990, 22530, 26443, 26535, 28721, 29089, 30923, 31145, 31458, 31598, 32069, 34129, 35820, 36142, 36435, 37546, 39188, 39952, 40949, 41764, 43529, 43610, 44184, 45206, 49210, 49807, 50279, 50943, 51536, 51549, 52921, 52967, 54610, 56036, 58087, 60490, 62133, 63051, 63480, 63535, 64861, 66906, 68107, 68328, 69021, 71797, 73058, 74974, 75331, 77697, 78138, 80152, 80539, 82172, 82370, 82840, 86859, 87467, 91528, 92167)
0:00:00.002891
Nbr of entries processed: 82
SELECT * FROM papers WHERE id IN (7043, 41643, 44688, 50447, 64723, 72601, 81006, 82380, 84285)
0:00:00.000348
Nbr of entries processed: 9

这是更好的方法吗?我可以获得更好的结果吗?

注意:显示的时间是运行循环所需的时间,而不是运行查询所需的时间。

我试过count_query.setForwardOnly(True),如文档中所述,但它对性能没有影响。

编辑: 这是包含约600个条目的测试数据库: database

1 个答案:

答案 0 :(得分:1)

显然我无法对此进行测试,因此我不知道它是否会产生重大影响,但您可以尝试使用基于索引的查找:

id_index = count_query.record().indexOf('id')
new_index = count_query.record().indexOf('new')
while count_query.next():
    record = count_query.record()
    id_value = record.value(id_index)
    append_articles(id_value)
    if record.value(new_index) == 1:
        append_new(id_value)

<强>更新

使用您的示例数据库,我无法重现您所看到的问题,而且我发现上面的方法速度大约是原始方法的两倍。这是一些示例输出:

IDs: 660, Articles: 666
IDs: 660, Articles: 666
IDs: 660, Articles: 666
test(index=False): 0.19050272400090762
IDs: 660, Articles: 666
IDs: 660, Articles: 666
IDs: 660, Articles: 666
test(index=True): 0.09384496400161879

测试用例:

import sys, os, timeit
from PyQt4 import QtCore, QtGui
from PyQt4.QtSql import QSqlDatabase, QSqlQuery

def test(index=False):
    count_query = QSqlQuery('select * from results')
    list_new_ids = []
    list_id_articles = []
    append_new = list_new_ids.append
    append_articles = list_id_articles.append
    if index:
        id_index = count_query.record().indexOf('id')
        new_index = count_query.record().indexOf('new')
        while count_query.next():
            record = count_query.record()
            id_value = record.value(id_index)
            append_articles(id_value)
            if record.value(new_index) == 1:
                append_new(id_value)
    else:
        while count_query.next():
            record = count_query.record()
            append_articles(record.value('id'))
            if record.value('new') == 1:
                append_new(record.value('id'))
    print('IDs: %d, Articles: %d' % (
        len(list_new_ids), len(list_id_articles)))

class Window(QtGui.QWidget):
    def __init__(self):
        super(Window, self).__init__()
        self.button = QtGui.QPushButton('Test', self)
        self.button.clicked.connect(self.handleButton)
        layout = QtGui.QVBoxLayout(self)
        layout.addWidget(self.button)
        self.database = QSqlDatabase.addDatabase("QSQLITE")
        path = os.path.join(os.path.dirname(__file__), 'tmp/perf-test.db')
        self.database.setDatabaseName(path)
        self.database.open()

    def handleButton(self):
        for stmt in 'test(index=False)', 'test(index=True)':
            print('%s: %s' % (stmt, timeit.timeit(
                stmt, 'from __main__ import test', number=3)))

if __name__ == '__main__':

    import sys
    app = QtGui.QApplication(sys.argv)
    window = Window()
    window.setGeometry(600, 300, 200, 100)
    window.show()
    sys.exit(app.exec_())