Can I assume my threads are done when threading.active_count() returns 1?

Asked: 2016-11-23 14:18:14

Tags: python multithreading python-multithreading

Given the following class:

from abc import ABCMeta, abstractmethod
from time import sleep
import threading
from threading import active_count, Thread

class ScraperPool(metaclass=ABCMeta):
    Queue = []
    ResultList = []

    def __init__(self, Queue, MaxNumWorkers=0, ItemsPerWorker=50):
        # Initialize attributes
        self.MaxNumWorkers = MaxNumWorkers
        self.ItemsPerWorker = ItemsPerWorker
        self.Queue = Queue # For testing purposes.

    def initWorkerPool(self, PrintIDs=True):
        for w in range(self.NumWorkers()):
            Thread(target=self.worker, args=(w + 1, PrintIDs,)).start()
            sleep(1) # Explicitly wait one second for this worker to start.

    def run(self):
        self.initWorkerPool()

        # Wait until all workers (i.e. threads) are done.
        while active_count() > 1:
            print("Active threads: " + str(active_count()))
            sleep(5)

        self.HandleResults()

    def worker(self, id, printID):
        if printID:
            print("Starting worker " + str(id) + ".")

        while (len(self.Queue) > 0):
            self.scraperMethod()

        if printID:
            print("Worker " + str(id) + " is quitting.")

        # TODO: Kill this thread.

        return

    def NumWorkers(self):
        return 1 # Simplified for testing purposes.

    @abstractmethod
    def scraperMethod(self): 
        pass

class TestScraper(ScraperPool):
    def scraperMethod(self):
        # print("I am scraping.")
        # print("Scraping. Threads#: " + str(active_count()))
        temp_item = self.Queue[-1]
        self.Queue.pop()

        self.ResultList.append(temp_item)

    def HandleResults(self):
        print(self.ResultList)

ScraperPool.register(TestScraper)

scraper = TestScraper(Queue=["Jaap", "Piet"])
scraper.run()
print(threading.active_count())
# print(scraper.ResultList)

After all threads have finished, there is still one active thread: the threading.active_count() call on the last line gives me that number.

The active thread is <_MainThread(MainThread, started 12960)>, as printed by threading.enumerate().

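Can I assume that all my threads are done when active_count() == 1? Or can, for example, imported modules start additional threads, so that my threads are actually done while active_count() > 1, which is also the condition of the loop I use in the run method?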

2 answers:

Answer 0 (score: 2)

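According to the docs, active_count() includes the main thread, so if you are at 1 you are most likely done. But if something else in your program starts new threads, your workers may be done before active_count() ever hits 1.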
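To illustrate the point, here is a minimal sketch in which an unrelated thread keeps the count above 1 after the workers have finished. The `background_task` function is a hypothetical stand-in for a thread started by some imported module; it is not part of the question's code.

```python
import threading

stop_background = threading.Event()


def background_task():
    # Hypothetical stand-in for a thread that an imported module started.
    stop_background.wait()


def worker(results, item):
    # list.append is atomic in CPython, so no lock is needed for this demo.
    results.append(item)


background = threading.Thread(target=background_task, daemon=True)
background.start()

results = []
workers = [
    threading.Thread(target=worker, args=(results, name))
    for name in ("Jaap", "Piet")
]
for t in workers:
    t.start()
for t in workers:
    t.join()

# All workers are done, yet the count still includes the background thread.
workers_alive = any(t.is_alive() for t in workers)
count_after_workers = threading.active_count()
print("workers alive:", workers_alive)
print("active_count:", count_after_workers)

# Clean up the stand-in thread.
stop_background.set()
background.join()
```

A loop such as `while active_count() > 1` would keep waiting here for as long as the background thread lives, even though every worker has already returned.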

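I would recommend implementing an explicit join method on your ScraperPool, keeping track of your workers, and explicitly joining them to the main thread when needed, instead of checking whether you are done with active_count() calls.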
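One way this suggestion could look is sketched below. The `WorkerPool` class and its `start`/`join` methods are hypothetical names chosen for illustration, not the asker's ScraperPool; the lock is an added assumption so that several workers can share one list safely.

```python
import threading


class WorkerPool:
    """Hypothetical minimal pool that keeps track of its own threads."""

    def __init__(self, items, num_workers=2):
        self._items = list(items)
        self._num_workers = num_workers
        self._lock = threading.Lock()
        self._threads = []
        self.results = []

    def _worker(self):
        while True:
            # Check and pop under one lock so two workers never race
            # for the last item.
            with self._lock:
                if not self._items:
                    return
                item = self._items.pop()
            # Real work on `item` would happen here, outside the lock.
            with self._lock:
                self.results.append(item)

    def start(self):
        for _ in range(self._num_workers):
            thread = threading.Thread(target=self._worker)
            self._threads.append(thread)
            thread.start()

    def join(self):
        # Wait only for the threads this pool started; threads created
        # elsewhere do not affect the result.
        for thread in self._threads:
            thread.join()
        self._threads.clear()


pool = WorkerPool(["Jaap", "Piet"])
pool.start()
pool.join()
print(sorted(pool.results))
```

Because `join` waits on the pool's own Thread objects, it returns as soon as those workers finish, regardless of what active_count() reports.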

Also keep the GIL in mind...

Answer 1 (score: 2)

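You can assume that your threads are done when active_count() reaches 1. The problem is that if any other module creates a thread, you will never reach 1. You should manage your threads explicitly.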

Example: you can put the threads in a list and join them one at a time. The relevant changes to your code are:

def __init__(self, Queue, MaxNumWorkers=0, ItemsPerWorker=50):
    # Initialize attributes
    self.MaxNumWorkers = MaxNumWorkers
    self.ItemsPerWorker = ItemsPerWorker
    self.Queue = Queue # For testing purposes.
    self.WorkerThreads = []

def initWorkerPool(self, PrintIDs=True):
    for w in range(self.NumWorkers()):
        thread = Thread(target=self.worker, args=(w + 1, PrintIDs,))
        self.WorkerThreads.append(thread)
        thread.start()
        sleep(1) # Explicitly wait one second for this worker to start.

def run(self):
    self.initWorkerPool()

    # Wait until all workers (i.e. threads) are done. Waiting in order
    # so some threads further in the list may finish first, but we
    # will get to all of them eventually
    while self.WorkerThreads:
        # pop(0) removes the thread from the list, so the loop terminates.
        self.WorkerThreads.pop(0).join()

    self.HandleResults()