Question

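I am processing Jira changelog history data. Because of the large volume of data and the fact that most of the processing time is I/O-bound, I figured an asynchronous approach might work well.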

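I have a list of all the issue_ids. I feed each one to a function that makes a request through the jira-python API, extracts the information into a dict, and then writes it out through a DictWriter that is passed in. To make it thread-safe, I imported Lock from the threading module and pass that in as well. In testing, it seems to deadlock at some point and just hangs. I noticed that the documentation says tasks can hang if they depend on each other, and I suppose mine do because of the locking I'm implementing. How can I prevent this from happening?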

Here is my code for reference:

(At this point in the code there is a list named keys that contains all the issue_ids.)

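import logging
from concurrent.futures import ThreadPoolExecutor
from csv import DictWriter
from threading import Lock

from jira import JIRA


def write_issue_history(
        jira_instance: JIRA,
        issue_id: str,
        writer: DictWriter,
        lock: Lock):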
    logging.debug('Now processing data for issue {}'.format(issue_id))
    changelog = jira_instance.issue(issue_id, expand='changelog').changelog

    for history in changelog.histories:
        created = history.created
        for item in history.items:
            to_write = dict(issue_id=issue_id)
            to_write['date'] = created
            to_write['field'] = item.field
            to_write['changed_from'] = item.fromString
            to_write['changed_to'] = item.toString
            clean_data(to_write)
            add_etl_fields(to_write)
            print(to_write)
            with lock:
                print('Lock obtained')
                writer.writerow(to_write)

if __name__ == '__main__':
    with open('outfile.txt', 'w') as outf:
        writer = DictWriter(
            f=outf,
            fieldnames=fieldnames,
            delimiter='|',
            extrasaction='ignore'
        )
        writer_lock = Lock()
        with ThreadPoolExecutor(max_workers=5) as exec:
            for key in keys[:5]:
                exec.submit(
                    write_issue_history,
                    j,
                    key,
                    writer,
                    writer_lock
                )

Edit: It is also quite likely that I am being throttled by the Jira API.

Answer 1

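You need to store the futures returned by the submit() calls in a list, conventionally named futs, then loop over that list and call result() on each one to get its result, handling any errors that may have occurred. An exception raised inside a submitted task is stored on its future and is not reported until result() is called, so a failing worker can otherwise go unnoticed.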

(I would also rename exec to executor, since that is more conventional and it avoids shadowing the built-in.)

from traceback import print_exc

...

with ThreadPoolExecutor(max_workers=5) as executor:
    futs = []
    for key in keys[:5]:
        futs.append(executor.submit(
            write_issue_history,
            j,
            key,
            writer,
            writer_lock,
        ))

# result() re-raises any exception that occurred inside the worker.
for fut in futs:
    try:
        fut.result()
    except Exception:
        print_exc()
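
To see the same pattern in isolation, here is a minimal, self-contained sketch that does not touch Jira at all. It is only an illustration under stated assumptions: write_row is a hypothetical stand-in for write_issue_history, the key "BAD-1" is invented to simulate a failed request, and the output goes to an in-memory buffer rather than a file. It uses as_completed() instead of a plain loop so that results are handled as soon as each task finishes.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor, as_completed
from threading import Lock


def write_row(issue_id, writer, lock):
    """Hypothetical stand-in for write_issue_history (makes no Jira call)."""
    if issue_id == "BAD-1":
        # Simulate a failure such as a rejected or throttled API request.
        raise RuntimeError("request failed for {}".format(issue_id))
    row = {"issue_id": issue_id, "field": "status"}
    with lock:  # the lock is released on exit, even if writerow raises
        writer.writerow(row)
    return issue_id


def run(keys):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["issue_id", "field"], delimiter="|")
    lock = Lock()
    done, errors = [], {}
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Map each future back to its key so failures can be attributed.
        futs = {executor.submit(write_row, k, writer, lock): k for k in keys}
        for fut in as_completed(futs):
            try:
                done.append(fut.result())  # re-raises the worker's exception
            except Exception as exc:
                errors[futs[fut]] = exc
    return sorted(done), errors, out.getvalue()


if __name__ == "__main__":
    done, errors, text = run(["A-1", "BAD-1", "A-2"])
    print(done)
    print(errors)
```

In this sketch the failing task does not block the others: the lock is only ever held for the duration of a single writerow() call, and the error shows up in errors only because result() is called on every future.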
