python - Multithreading in a for loop

Date: 2015-02-16 02:14:46

Tags: python multithreading

This code runs fine for a while, and then it gives me this error:

thread.error: can't start new thread

What am I doing wrong? The names file has about 10,000 names, and the emails file has about 5 emails.

for x in open(names):
    name = x.strip()


    def check(q):
        while True:
            email = q.get()
            lock.acquire()
            print email, name, threading.active_count()
            lock.release()

            #Do things in 
            #the internet

            q.task_done()
        return

    for i in range(threads):
            t = threading.Thread(target=check, args=(q,))
            t.setDaemon(True)
            t.start()

    for word in open(emails):
            q.put(word.strip())

    q.join()

I specified only 2 threads, but the code ends up spawning hundreds of them and crashes when active_count is around 890. How can I fix this?

2 Answers:

Answer 0: (score: 1)

Here is a slightly modified version that uses a semaphore object:

import threading
import Queue

NUM_THREADS = 2 # you can change this if you want

semaphore = threading.Semaphore(NUM_THREADS)

threads = NUM_THREADS

running_threads = []

lock = threading.Lock()

q = Queue.Queue()

# moved the check function out of the loop
def check(name, q, s):
    # acquire the semaphore
    with s:
        not_empty = True

        while not_empty:

            try:
                email = q.get(False) # we are passing false so it won't block.
            except Queue.Empty:  # the queue is drained, so this worker can exit
                not_empty = False
                break

            lock.acquire()

            print email, name, threading.active_count()

            lock.release()

            # additional work ...

            q.task_done()

for x in open(names):
    name = x.strip()

    for word in open(emails):
        q.put(word.strip())

    for i in range(threads):
        t = threading.Thread(target=check, args=(name, q, semaphore))
        # t.setDaemon(True) # we are not setting the daemon flag
        t.start()

        running_threads.append(t)

    # joining threads (we need this if the daemon flag is false)
    for t in running_threads:
        t.join()

    # joining queue (Probably won't need this if the daemon flag is false)
    q.join()
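
The thread count climbs in the question because two new daemon workers are started for every name, and none of them ever leaves its `while True` loop. An alternative is to start the workers once and put (name, email) pairs on the queue. The following is only a minimal sketch of that idea, assuming Python 3 (where the module is `queue` rather than `Queue`) and in-memory lists in place of the two files; `run_checks` and `handle` are hypothetical names, not part of the original code.

```python
import queue
import threading
from itertools import product


def run_checks(names, emails, handle, num_threads=2):
    """Call handle(name, email) for every pair using a fixed set of threads."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:          # sentinel: no more work for this worker
                q.task_done()
                return
            name, email = item
            try:
                outcome = (name, email, handle(name, email), None)
            except Exception as exc:  # keep the worker alive on failures
                outcome = (name, email, None, str(exc))
            with lock:
                results.append(outcome)
            q.task_done()

    # The workers are created once, outside any loop over names.
    workers = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in workers:
        t.start()

    for pair in product(names, emails):
        q.put(pair)
    for _ in workers:
        q.put(None)                   # one sentinel per worker

    q.join()
    for t in workers:
        t.join()
    return results
```

With this layout `threading.active_count()` stays at the number of workers plus the main thread, no matter how many names there are, and every worker exits cleanly once it receives its sentinel.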

Answer 1: (score: 0)

You can simplify the code by using a thread pool:

from contextlib import closing
from itertools import product
from multiprocessing.dummy import Pool # thread pool

def foo(arg):
    name, email = map(str.strip, arg)
    try:
        # "do things in the internet"
        result = None  # placeholder: replace with the real network call
    except Exception as e:
        return (name, email), None, str(e)
    else:
        return (name, email), result, None

with open(names_filename) as names_file, \
     open(emails_filename) as emails_file, \
     closing(Pool(max_threads_count)) as pool:
    args = product(names_file, emails_file)
    it = pool.imap_unordered(foo, args, chunksize=100)
    for (name, email), result, error in it:
        if error is not None:
            print("Failed to foo {} {}: {}".format(name, email, error))
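
To make the pool approach easier to try out, here is a self-contained sketch of the same pattern. It assumes Python 3, where `Pool` can be used directly in a `with` statement, and uses in-memory lists instead of files. `check_pair` and `run_pool` are hypothetical names, and the "@" check merely stands in for the real network call.

```python
from itertools import product
from multiprocessing.dummy import Pool  # thread-based pool with the Pool API


def check_pair(pair):
    """Hypothetical stand-in for the network call.

    Returns ((name, email), result, error), with exactly one of
    result and error set to None.
    """
    name, email = (s.strip() for s in pair)
    try:
        if "@" not in email:
            raise ValueError("not an email address")
        result = "{} <{}>".format(name, email)
    except Exception as e:
        return (name, email), None, str(e)
    else:
        return (name, email), result, None


def run_pool(names, emails, max_threads=2):
    """Check every (name, email) pair and return (successes, failures)."""
    successes, failures = [], []
    with Pool(max_threads) as pool:
        # Results arrive in completion order, not submission order.
        for pair, result, error in pool.imap_unordered(
                check_pair, product(names, emails)):
            if error is None:
                successes.append((pair, result))
            else:
                failures.append((pair, error))
    return successes, failures
```

The pool never holds more than `max_threads` worker threads, so the thread count stays bounded regardless of how many pairs `product` yields.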