After looking through posts on Google, Stack Overflow, and other sites, I am still confused about how to apply a queue and threading to my code:
import psycopg2
import sys
import re
# for threading and queue
import multiprocessing
from multiprocessing import Queue
import time
from datetime import datetime
class Database_connection():
    def db_call(self, query, dbHost, dbName, dbUser, dbPass):
        try:
            con = None
            con = psycopg2.connect(host=dbHost, database=dbName,
                                   user=dbUser, password=dbPass)
            cur = con.cursor()
            cur.execute(query)
            data = cur.fetchall()
            resultList = []
            for data_out in data:
                resultList.append(data_out)
            return resultList
        except psycopg2.DatabaseError, e:
            print 'Error %s' % e
            sys.exit(1)
        finally:
            if con:
                con.close()
w = Database_connection()
sql = "select stars from galaxy"
startTime = datetime.now()
for result in w.db_call(sql, "x", "x", "x", "x"):
    print result[0]
print "Runtime: " + str(datetime.now() - startTime)
Suppose the result is 100+ values. How can I put those 100+ results in a queue and process them (for example, print them) with the queue and multiprocessing modules, five at a time?
Answer 0 (score: 0)
What do you want this code to do?

This code produces no output, because get() returns the next item from the queue (doc). You are putting the letters from the SQL response into the queue one letter at a time. The i in for i... iterates over the list returned by w.db_call. Those items are (I assume) strings, which you then iterate over, adding them to queue one character at a time. The next thing you do is remove the element you just added, which leaves the queue unchanged on every pass through the loop. If you put a print statement in the loop, it prints the letter it just took off the queue.
Queue is used to pass information between processes. I think you are trying to set up a producer/consumer pattern, where one process adds things to the queue and several other processes consume from it. See this working example of multiprocessing.Queue and the links it contains (example, main documentation).
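For reference, a minimal sketch of that producer/consumer pattern in Python 3 syntax. The item strings and the counting are illustrative stand-ins for the query rows and the real per-item work; one None sentinel per worker tells the consumers to stop:

```python
from multiprocessing import Process, Queue

def consumer(task_q, done_q):
    # pull items until the sentinel arrives, then report how many were handled
    handled = 0
    while True:
        item = task_q.get()
        if item is None:      # sentinel: no more work
            break
        handled += 1          # the real work (e.g. printing the row) goes here
    done_q.put(handled)

def run(n_items=100, n_workers=5):
    task_q, done_q = Queue(), Queue()
    workers = [Process(target=consumer, args=(task_q, done_q))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for i in range(n_items):  # stand-in for the rows returned by db_call
        task_q.put("star%d" % i)
    for _ in workers:
        task_q.put(None)      # one sentinel per worker
    for w in workers:
        w.join()
    # total number of items processed across all workers
    return sum(done_q.get() for _ in workers)
```

Because the sentinels are put after all the items and the queue is FIFO, no worker can exit while real work remains.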
The easiest way, as long as you don't need to run from an interactive shell, is to use Pool (lifted almost verbatim from the multiprocessing documentation):
from multiprocessing import Pool
p = Pool(5)  # sets the number of worker processes you want
def f(res):
    # put whatever you want to do with each query result in here
    return res
result_lst = w.db_call(sql, "x", "x", "x", "x")
proced_results = p.map(f, result_lst)
This applies whatever you want done to each query result (written into the function f) and returns the outcomes of that operation as a list. The number of subprocesses to use is set by the argument to Pool.
Answer 1 (score: 0)
Here is my suggestion...
import Queue
from threading import Thread

class Database_connection:
    def db_call(self, query, dbHost, dbName, dbUser, dbPass):
        # your code here
        return

# in this example each thread will execute this function
def processFtpAddrMt(queue):
    # loop will continue until the queue of FTP addresses is empty
    while True:
        # get an ftp address; Queue.Empty is raised when the
        # queue is empty, and the loop breaks
        try:
            ftp_addr = queue.get_nowait()
        except Queue.Empty:
            break
        # put code to process the ftp address here
        # let the queue know this task is done
        queue.task_done()

w = Database_connection()
sql = "select stars from galaxy"
ftp_addresses = w.db_call(sql, "x", "x", "x", "x")

# put each result of the SQL call in a Queue instance
ftp_addr_queue = Queue.Queue()
for addr in ftp_addresses:
    ftp_addr_queue.put(addr)

# create five threads, each running processFtpAddrMt,
# and pass the queue to each of them
for x in range(0, 5):
    t = Thread(target=processFtpAddrMt, args=(ftp_addr_queue,))
    t.setDaemon(True)
    t.start()

# block further execution of the script until all queue items have been processed
ftp_addr_queue.join()
This uses the Queue class to store the SQL results and the Thread class to work through the queue. Five threads are created, each running the processFtpAddrMt function, which takes ftp addresses from the queue until the queue is empty. All you have to do is add the code that processes the ftp address. Hope this helps.
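In current Python 3 the same idea looks like this (the Queue module is now queue, and get_nowait() makes the loop actually exit once the queue drains; the upper-casing is just a stand-in for the real per-address work):

```python
import queue
import threading

def drain(q, results, lock):
    # keep taking addresses until the queue is empty
    while True:
        try:
            addr = q.get_nowait()  # raises queue.Empty once nothing is left
        except queue.Empty:
            break
        with lock:
            results.append(addr.upper())  # stand-in for processing the address
        q.task_done()

def process_all(addrs, n_threads=5):
    q = queue.Queue()
    for a in addrs:       # fill the queue before any worker starts
        q.put(a)
    results = []
    lock = threading.Lock()
    for _ in range(n_threads):
        t = threading.Thread(target=drain, args=(q, results, lock), daemon=True)
        t.start()
    q.join()  # blocks until task_done() has been called for every item
    return results
```

Since task_done() is called only after the result is appended, q.join() guarantees every item has been fully processed before the function returns.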
Answer 2 (score: -1)
I was able to solve the problem this way:
def worker():
    w = Database_connection()
    sql = "select stars from galaxy"
    for result in w.db_call(sql, "x", "x", "x", "x"):
        if result:
            print result[0]

jobs = []
startTime = datetime.now()
for i in range(1):
    p = multiprocessing.Process(target=worker)
    jobs.append(p)
    p.start()
print "Runtime: " + str(datetime.now() - startTime)
I believe this is not the best way to do it, but it solves my problem for now :)
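A tighter variant (a sketch, in Python 3 syntax) would run the query once in the parent and hand the rows to a pool of five workers, instead of re-running the query inside each process. handle_row is a hypothetical stand-in for whatever should happen to each row; the question simply prints the first column:

```python
from multiprocessing import Pool

def handle_row(row):
    # hypothetical per-row work; the question just wants the first column
    return row[0]

def process_rows(rows, n_workers=5):
    # distribute the already-fetched rows across five worker processes
    with Pool(n_workers) as pool:
        return pool.map(handle_row, rows)
```

This keeps a single database connection in the parent process, which matters because connection objects cannot be shared across process boundaries.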