Question

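I am upgrading my code to use ThreadPoolExecutor and want to be able to time out any thread that takes more than a few seconds to process. Is it possible to force a timeout on a thread that is part of a thread pool? The code I am using is below.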

    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
        future_tasks = {executor.submit(self.crawl_task, url): url for url in self.results.keys()}

        for future in concurrent.futures.as_completed(future_tasks):
            url = future_tasks[future]
            try:
                result = future.result()
                self.results[result[0]] = result[1]
            except Exception as e:
                print('%r generated an exception: %s' % (url, e))

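The only way I have been able to time out a thread is by changing

    for future in concurrent.futures.as_completed(future_tasks):

to

    for future in concurrent.futures.as_completed(future_tasks, timeout=1):

However, this breaks out of the whole loop, and I cannot tell which thread timed out or which data caused the timeout.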

Traceback (most recent call last):
  File "test.py", line 75, in <module>
    request = Requests(data)
  File "test.py", line 22, in __init__
    for future in concurrent.futures.as_completed(future_tasks, timeout=1):
  File "/source/homebrew/Cellar/python3/3.4.0_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/concurrent/futures/_base.py", line 213, in as_completed
    len(pending), len(fs)))
concurrent.futures._base.TimeoutError: 17 (of 17) futures unfinished

Answer 1

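Wrapping the whole for loop over the futures in a try/except still allows the results of the other threads to be processed. By using two separate dictionaries, you can see which threads were stopped because of the timeout.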

    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
        future_tasks = {executor.submit(self.crawl_task, url): url for url in self.requests.keys()}

        try:
            # as_completed raises TimeoutError once 10 seconds have passed
            for future in concurrent.futures.as_completed(future_tasks, timeout=10):
                result = future.result()
                self.responses[result[0]] = result[1]
        except Exception as e:
            print(e)

    # Any requested URL without a recorded response did not finish in time
    timeout = [url for url in self.requests.keys() if url not in self.responses.keys()]

    print('URL Threads timed out: ', timeout)

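I must point out that this goes against conventional wisdom. Normally, if you wrap an entire for loop in a try/except, nothing in the loop should run after the exception is raised, yet the magic of futures seems to let everything in the loop (apart from the threads that timed out) be processed. The likely explanation is that as_completed yields each future as soon as it finishes, so every result that arrived before the timeout has already been handled by the time TimeoutError is raised.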
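The approach in this answer can be tried out in a small standalone sketch. It rests on a few assumptions that are not in the original code: `crawl_task(url, delay)` is a hypothetical module-level stand-in that sleeps instead of crawling, `crawl_all` is an illustrative helper name, and the example URLs are made up. It catches `concurrent.futures.TimeoutError` specifically rather than `Exception`.

```python
import concurrent.futures
import time


def crawl_task(url, delay):
    """Hypothetical stand-in for self.crawl_task: sleep, then return (url, payload)."""
    time.sleep(delay)
    return url, "response for " + url


def crawl_all(delays, timeout):
    """Run one task per URL; return (responses, timed_out).

    `delays` maps each URL to a simulated processing time in seconds.
    """
    responses = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as executor:
        future_tasks = {executor.submit(crawl_task, url, delay): url
                        for url, delay in delays.items()}
        try:
            # Futures are yielded one by one as they finish, so each result
            # that arrives before the deadline is stored before any timeout.
            for future in concurrent.futures.as_completed(future_tasks, timeout=timeout):
                url, payload = future.result()
                responses[url] = payload
        except concurrent.futures.TimeoutError:
            # Raised by the as_completed iterator, not by an individual future.
            pass
        # URLs whose result was never collected are the ones that timed out.
        timed_out = [url for url in future_tasks.values() if url not in responses]
    return responses, timed_out


if __name__ == "__main__":
    done, late = crawl_all({"http://example.com/fast": 0.0,
                            "http://example.com/slow": 1.0}, timeout=0.4)
    print("Finished:", sorted(done))
    print("URL threads timed out:", late)
```

Note that the timeout only stops the waiting. The slow worker thread keeps running, and leaving the `with` block still blocks until it finishes, because the executor's exit calls `shutdown(wait=True)`.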

Answer 2

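One way to do this is to record the url in a file when execution starts in self.crawl_task. Just before the thread's task completes, it can append the string "DONE", perhaps with a timestamp.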

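In addition, you need to handle the TimeoutError exception so that execution is not interrupted. When the timeout occurs, you can check the log file for urls that have no "DONE" string.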

Forcing a thread to time out with ThreadPoolExecutor
