Printing to a file from multiple pool workers

Time: 2015-10-30 13:18:50

Tags: python multithreading concurrency threadpool

I am trying to create a pool whose workers print data to a file.

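# Assumed import (not shown in the question); the tags suggest a thread pool.
from multiprocessing.pool import ThreadPool as Pool

def get_and_print_something(url):
    # Each worker appends one line to the shared file.
    with open('file.txt','a') as f:
        f.write(get_line(url))

pool = Pool(50)

for url in urls:
    # Submit the function defined above (the original called get_something, which is not defined).
    pool.apply_async(get_and_print_something, args=(url,))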

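The problem is that the wrong data is sometimes written. This happens because two workers operate on the same file at the same time. Is it possible to make a worker wait until another worker has finished modifying the file?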

Example of the resulting file.txt:

This is a correct line.
This is a correct line. 
orrect line.
This is a correct line.
...

1 answer:

Answer 0 (score: 0)

You can read about this on, for example, these sites:

http://effbot.org/zone/thread-synchronization.htm#locks

https://pymotw.com/2/threading/

It basically boils down to:

import threading

lock = threading.Lock()

def get_and_print_something(url):
    # Not yet in critical section because we want this to happen concurrently:
    line = get_line(url)

    lock.acquire() # Will wait if necessary until any other thread has finished its file access.

    # In critical section now. Only one thread may run this at any one time.
    try:
        with open('file.txt','a') as f:
            f.write(line)
    finally:
        lock.release() # Release lock, so that other threads can access the file again.
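
The same idea can also be written with the lock as a context manager, which acquires and releases it just like the try/finally above. The following is only a minimal sketch under a few assumptions: the pool is a thread pool (multiprocessing.pool.ThreadPool), get_line is a hypothetical stand-in for the question's function, and the path parameter and the run helper are added here for illustration. Note that threading.Lock only synchronizes threads inside one process, so a process-based pool would need a different kind of lock.

```python
import threading
from multiprocessing.pool import ThreadPool

lock = threading.Lock()

def get_line(url):
    # Hypothetical stand-in for the question's get_line(); returns one full line.
    return "This is a correct line for {}.\n".format(url)

def get_and_print_something(url, path):
    # Fetch outside the critical section so the slow part still runs concurrently.
    line = get_line(url)
    # "with lock" acquires the lock and always releases it, even on an exception.
    with lock:
        with open(path, 'a') as f:
            f.write(line)

def run(urls, path, workers=8):
    # Hypothetical helper: submit one task per URL and wait for all of them.
    pool = ThreadPool(workers)
    results = [pool.apply_async(get_and_print_something, args=(url, path))
               for url in urls]
    pool.close()
    pool.join()
    for result in results:
        result.get()  # Re-raises any exception that occurred in a worker.
```

Calling result.get() on every task is worth keeping: apply_async otherwise swallows worker exceptions silently, which can hide exactly the kind of bug the question describes.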