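I'm running Jython 2.5.3 on Ubuntu 12.04 with the OpenJDK 64-bit 1.7.0_55 JVM.
I'm trying to create a simple threaded application to optimize data processing and loading. I have populator threads that read records from a database and mangle them a bit before putting them onto a queue. Consumer threads read the queue and store the data in a different database. Here is an outline of my code: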
import sys
import time
import threading
import Queue

class PopulatorThread(threading.Thread):
    def __init__(self, mod, mods, queue):
        super(PopulatorThread, self).__init__()
        self.mod = mod
        self.mods = mods
        self.queue = queue

    def run(self):
        # Create db connection
        # ...
        try:
            # Select one segment of records using 'id % mods = mod'
            # Process these records & slap them onto the queue
            # ...
        except:
            con.rollback()
            raise
        finally:
            print "Made it to 'finally' in populator %d" % self.mod
            con.close()

class ConsumerThread(threading.Thread):
    def __init__(self, mod, queue):
        super(ConsumerThread, self).__init__()
        self.mod = mod
        self.queue = queue

    def run(self):
        # Create db connection
        # ...
        try:
            while True:
                item = self.queue.get()
                if not item: break
                # Put records from the queue into
                # a different database
                # ...
                self.queue.task_done()
        except:
            con.rollback()
            raise
        finally:
            print "Made it to 'finally' in consumer %d" % self.mod
            con.close()

def main(argv):
    tread1Count = 3
    tread2Count = 4
    # This is the notefactsselector data queue
    nfsQueue = Queue.Queue()

    # Start consumer/writer threads
    j = 0
    treads2 = []
    while j < tread2Count:
        treads2.append(ConsumerThread(j, nfsQueue))
        treads2[-1].start()
        j += 1

    # Start reader/populator threads
    i = 0
    treads1 = []
    while i < tread1Count:
        treads1.append(PopulatorThread(i, tread1Count, nfsQueue))
        treads1[-1].start()
        i += 1

    # Wait for reader/populator threads
    print "Waiting to join %d populator threads" % len(treads1)
    i = 0
    for tread in treads1:
        print "Waiting to join a populator thread %d" % i
        tread.join()
        i += 1

    # Add one sentinel value to queue for each write thread
    print "Adding sentinel values to end of queue"
    for tread in treads2:
        nfsQueue.put(None)

    # Wait for consumer/writer threads
    print "Waiting to join consumer/writer threads"
    for tread in treads2:
        print "Waiting on a consumer/writer"
        tread.join()

    # Wait for Queue
    print "Waiting to join queue with %d items" % nfsQueue.qsize()
    nfsQueue.join()
    print "Queue has been joined"

if __name__ == '__main__':
    main(sys.argv)
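I've stripped out the database implementation to save space.
I suspect I have the order of my thread initialization or thread joins wrong, but I have little experience with concurrent programming, so my intuitions about how to do things are not well developed. I have found plenty of Python/Jython examples of queues populated by a while loop and read by threads, but none so far of a queue populated by one set of threads and read by a different set.
The populator and consumer threads appear to finish.
The program appears to block at the very end, waiting for the Queue object to be joined.
Thanks to anyone who has suggestions and lessons for me!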
Answer 0 (score: 1)
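Are you calling task_done() on every item you take off the queue once you have finished processing it? If you don't explicitly tell the queue that each task is done, join() will never return.
PS: You don't see "Waiting to join a populator thread %d" because you forgot the print in front of it :)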
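To illustrate the task_done() point, here is a minimal sketch rather than the asker's actual code. The names Populator, Consumer, run_pipeline and SENTINEL are hypothetical, and the database work is replaced by doubling integers. It assumes that every get(), including the one that returns a sentinel, needs a matching task_done(); a consumer that breaks out of its loop on the sentinel before calling task_done() leaves one unfinished task per consumer, which would be enough to keep join() on the queue waiting. The sketch also tests the sentinel with `is` rather than truthiness, so a legitimate falsy record such as 0 does not stop a consumer early.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2 / Jython 2.5

SENTINEL = None  # hypothetical marker that tells a consumer to stop


class Populator(threading.Thread):
    """Puts its share of the records (record % mods == mod) on the queue."""

    def __init__(self, mod, mods, records, work_queue):
        threading.Thread.__init__(self)
        self.mod = mod
        self.mods = mods
        self.records = records
        self.work_queue = work_queue

    def run(self):
        for record in self.records:
            if record % self.mods == self.mod:
                self.work_queue.put(record)


class Consumer(threading.Thread):
    """Takes items off the queue until it receives the sentinel."""

    def __init__(self, work_queue, results):
        threading.Thread.__init__(self)
        self.work_queue = work_queue
        self.results = results

    def run(self):
        while True:
            item = self.work_queue.get()
            try:
                if item is SENTINEL:
                    break
                # Stand-in for writing the record to the second database.
                self.results.put(item * 2)
            finally:
                # Runs for every get(), including the sentinel.
                self.work_queue.task_done()


def run_pipeline(records, populator_count=3, consumer_count=4):
    work_queue = queue.Queue()
    results = queue.Queue()

    consumers = [Consumer(work_queue, results) for _ in range(consumer_count)]
    for consumer in consumers:
        consumer.start()

    populators = [Populator(i, populator_count, records, work_queue)
                  for i in range(populator_count)]
    for populator in populators:
        populator.start()
    for populator in populators:
        populator.join()

    # One sentinel per consumer, added only after all populators are done.
    for _ in consumers:
        work_queue.put(SENTINEL)
    for consumer in consumers:
        consumer.join()

    # Every get() was matched by task_done(), so this does not block.
    work_queue.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)


if __name__ == "__main__":
    print(run_pipeline(list(range(10))))
```

Since the consumer threads are already joined before the queue is, the final work_queue.join() is arguably redundant here; dropping it would be another way to avoid the hang, at the cost of losing the check that every item was accounted for.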