PostgreSQL query through Python takes too long

Time: 2017-09-28 12:10:11

Tags: python sql postgresql rabbitmq pika

#!/usr/bin/env python
import pika

def doQuery( conn, i ) :
    cur = conn.cursor()
    cur.execute("SELECT * FROM table OFFSET %s LIMIT 100000", (i,))
    return cur.fetchall()

print "Using psycopg2"
import psycopg2
myConnection = psycopg2.connect( host=hostname, user=username, password=password, dbname=database )
connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.queue_declare(queue='task_queue2')

endloop = False
i = 1
while True:
    results = doQuery( myConnection, i )
    j = 0
    while j < 10000:
        try:
            results[j][-1]
        except:
            endloop = True
            break
        message = str(results[j][-1]).encode("hex")
        channel.basic_publish(exchange='',
                              routing_key='task_queue2',
                              body=message
                              #properties=pika.BasicProperties(
                              #delivery_mode = 2, # make message persistent
                              )#)
        j = j + 1
    # if i % 10000 == 0:
    #     print i
    if endloop == False:
        break
    i = i + 10000

Once the offset reaches 100,000,000, the SQL query takes too long to execute, but I need to put roughly two billion entries into the queue. Does anyone know of a more efficient SQL query I could run so that I can get all 2 billion rows into the queue faster?

1 answer:

Answer 0: (score: 0)

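psycopg2 supports server-side cursors, that is, cursors managed on the database server rather than on the client. The full result set is not transferred to the client all at once; instead it is delivered as needed through the cursor interface.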

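This lets you run the query without pagination (such as the LIMIT / OFFSET approach) and simplifies your code. It also avoids the cost that makes large offsets slow: PostgreSQL still has to compute and discard every row skipped by OFFSET, so each successive page takes longer. To use a server-side cursor, pass the name parameter when creating the cursor.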

import pika
import psycopg2

with psycopg2.connect(host=hostname, user=username, password=password, dbname=database) as conn:
    with conn.cursor(name='my_cursor') as cur:    # create a named server-side cursor
        cur.execute('select * from table')

        connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
        channel = connection.channel()
        channel.queue_declare(queue='task_queue2')

        for row in cur:
            message = str(row[-1]).encode('hex')
            channel.basic_publish(exchange='', routing_key='task_queue2', body=message)
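
Note that str.encode('hex') exists only in Python 2, which the question's code uses. If you run this under Python 3, something along the following lines should give the same hex payload. This is a minimal sketch: to_hex is a hypothetical helper name, and encoding the string as UTF-8 before hex-encoding it is an assumption about what the last column contains.

```python
import binascii


def to_hex(value):
    """Return the hex encoding of str(value) as ASCII bytes (Python 3)."""
    # str() mirrors the original code; UTF-8 is an assumed text encoding.
    return binascii.hexlify(str(value).encode("utf-8"))
```

Inside the loop you would then write message = to_hex(row[-1]); pika accepts a bytes body.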

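If needed, you can tune cur.itersize to improve performance. It controls how many rows are fetched from the server per network round trip when you iterate over a named cursor (the default is 2000).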
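
As a rough illustration of the itersize setting, here is a small generator that wraps a named cursor. It is only a sketch: stream_rows, batch_size and the cursor name stream_cursor are hypothetical names chosen for this example, and the value 10000 is an arbitrary starting point rather than a recommendation. It assumes conn is an open psycopg2 connection.

```python
def stream_rows(conn, query, batch_size=10000, cursor_name="stream_cursor"):
    """Yield rows one at a time from a server-side (named) cursor.

    conn is assumed to be an open psycopg2 connection. Passing name=
    is what makes the cursor server-side.
    """
    with conn.cursor(name=cursor_name) as cur:
        # Rows fetched per round trip while iterating; must be set
        # before iteration starts.
        cur.itersize = batch_size
        cur.execute(query)
        for row in cur:
            yield row
```

You could then replace the inner loop with for row in stream_rows(conn, 'select * from table'): and publish each row as before. A larger batch size means fewer round trips but more rows held in client memory at once, so it is worth measuring a few values against your own data.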