Using Linux and Python 2.7.6, I have a script that uploads lots of files at one time. I am using multi-threading with the Queue and Threading modules.
I have an object that keeps track of the files that have been successfully uploaded and decrements the count after each successful upload. I need to make this operation atomic/thread safe. Since the Queue module is high level and has its own mutex at a lower level, can I impose my own lock/acquire on top of that? I tried doing this and had no errors (at the bottom of the last code block, where file_quantity.deduct()
is). But I am not sure if it is truly working as it should. Here is a shortened version for readability:
class FileQuantity(object):
    """Keeps track of files that have been uploaded and how many are left"""

    def __init__(self, file_quantity):
        self.quantity = file_quantity
        self.total = file_quantity

    def deduct(self):
        self.quantity -= 1
kill_received = False
lock = threading.Lock()
class CustomQueue(Queue.Queue):
    # Cannot use the stock .join() because it would block handling of
    # SIGINT until the threads are done. To counter this, wait() is
    # given a timeout and kill_received is checked on every pass.
    def join(self):
        self.all_tasks_done.acquire()
        try:
            while not kill_received and self.unfinished_tasks:
                self.all_tasks_done.wait(10.0)
        finally:
            self.all_tasks_done.release()
def do_the_uploads(file_list, file_quantity,
                   retry_list, authenticate):
    """The uploading engine"""
    value = raw_input(
        "\nPlease enter how many concurrent "
        "uploads you want at one time (example: 200)> ")
    value = int(value)
    logger.info('{} concurrent uploads will be used.'.format(value))
    confirm = raw_input(
        "\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if confirm == "Y":
        kill_received = False
        sys.stdout.write("\x1b[2J\x1b[H")
        q = CustomQueue()

        def worker():
            global kill_received
            while not kill_received:
                item = q.get()
                upload_file(item, file_quantity, retry_list, authenticate, q)
                q.task_done()

        for i in range(value):
            t = Thread(target=worker)
            t.setDaemon(True)
            t.start()

        for item in file_list:
            q.put(item)
        q.join()

        print "Finished. Cleaning up processes...",
        # Allow the threads to clean up
        time.sleep(4)
        print "done."
def upload_file(file_obj, file_quantity, retry_list, authenticate, q):
    """Uploads a file. One file per its own thread. No batch style. This way
    if one upload fails no others are affected."""
    absolute_path_filename, filename, dir_name, token, url = file_obj
    url = url + dir_name + '/' + filename
    try:
        with open(absolute_path_filename) as f:
            r = requests.put(url, data=f, headers=header_collection, timeout=20)
    except requests.exceptions.ConnectionError as e:
        pass

    if src_md5 == r.headers['etag']:
        lock.acquire()
        file_quantity.deduct()
        lock.release()
Answer (score: 1)
Well, the code you posted doesn't define lock anywhere, so it's hard to say for sure. It would be more common to protect just the code that actually needs protecting:
def deduct(self):
    with lock:
        self.quantity -= 1
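For reference, the with lock: form is shorthand for acquiring the lock and releasing it in a finally clause, so the lock is released even if the decrement raises. A minimal sketch of the equivalent expanded form, using the same module-level lock:

def deduct(self):
    lock.acquire()
    try:
        self.quantity -= 1
    finally:
        lock.release()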
The sanest approach would be to allocate a lock in the structure that needs it, like so:
class FileQuantity(object):
    """Keeps track of files that have been uploaded and how many are left"""

    def __init__(self, file_quantity):
        self.quantity = file_quantity
        self.total = file_quantity
        self.lock = threading.Lock()

    def deduct(self):
        with self.lock:
            self.quantity -= 1
and use self.lock similarly for any other mutations of FileQuantity data members that may be invoked by multiple threads simultaneously.
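With the lock owned by the instance, callers such as upload_file would no longer need the module-level lock.acquire()/lock.release() pair around file_quantity.deduct(), since the method protects itself. As a purely illustrative sketch (the remaining() name is hypothetical, not from the original code), any other read or write of the shared counter would take the same lock, e.g. as an extra method on FileQuantity:

    def remaining(self):
        """Hypothetical helper: read the shared counter under the same lock."""
        with self.lock:
            return self.quantity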