I am trying to create a pool of workers that print data to a file.
    def get_and_print_something(url):
        with open('file.txt', 'a') as f:
            f.write(get_line(url))

    pool = Pool(50)
    for url in urls:
        pool.apply_async(get_and_print_something, args=(url,))
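The problem is that the wrong data is sometimes written, because two workers operate on the same file at the same time. Is it possible to make a worker wait while the file is being modified?
Example of file.txt: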
This is a correct line.
This is a correct line.
orrect line.
This is a correct line.
...
Answer 0 (score: 0)
You can find examples on sites such as:
http://effbot.org/zone/thread-synchronization.htm#locks or
https://pymotw.com/2/threading/
It basically boils down to:
    import threading

    lock = threading.Lock()

    def get_and_print_something(url):
        # Not yet in the critical section, because we want this to happen concurrently:
        line = get_line(url)
        lock.acquire()  # Waits, if necessary, until any other thread has finished its file access.
        # In the critical section now. Only one thread may run this at any one time.
        try:
            with open('file.txt', 'a') as f:
                f.write(line)
        finally:
            lock.release()  # Release the lock so that other threads can access the file again.
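
For reference, here is a minimal runnable sketch of the same idea. It rests on several assumptions that are not in the question: the pool is taken to be a thread pool (multiprocessing.pool.ThreadPool), get_line is replaced by a hypothetical stub that makes no network request, and the run helper and the path parameter exist only for illustration. Note that a threading.Lock only coordinates threads inside one process; if Pool is actually a process pool such as multiprocessing.Pool, each worker process would hold its own copy of the lock and this would not protect the file.

```python
import os
import tempfile
import threading
from multiprocessing.pool import ThreadPool

# One lock shared by every worker thread in this process.
lock = threading.Lock()


def get_line(url):
    # Hypothetical stand-in for the question's get_line(); no network access.
    return "This is a correct line for %s.\n" % url


def get_and_print_something(url, path):
    # Fetching runs concurrently, outside the critical section.
    line = get_line(url)
    # "with lock" is shorthand for acquire() followed by try/finally release().
    with lock:
        with open(path, "a") as f:
            f.write(line)


def run(urls, path, workers=8):
    # Assumption: a thread pool, so that all workers share the lock above.
    pool = ThreadPool(workers)
    results = [
        pool.apply_async(get_and_print_something, args=(url, path))
        for url in urls
    ]
    pool.close()
    pool.join()
    for result in results:
        result.get()  # Re-raises any exception that occurred in a worker.


if __name__ == "__main__":
    urls = ["http://example.com/%d" % i for i in range(100)]
    path = os.path.join(tempfile.mkdtemp(), "file.txt")
    run(urls, path)
    with open(path) as f:
        print(len(f.readlines()), "lines written")
```

Calling result.get() on each AsyncResult is worth keeping even when the return value is not needed, because apply_async otherwise hides exceptions raised inside the workers.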