Python Queue and threading modules - imposing an additional custom lock?

Date: 2013-12-01 05:10:01

Tags: python multithreading

Using Linux and Python 2.7.6, I have a script that uploads a lot of files at one time. I am multithreading with the Queue and threading modules.

I have an object that keeps track of the files that have been uploaded successfully and decrements a counter after each successful upload. I need to make this operation atomic/thread safe. Since the Queue module is high level and has its own mutex at a lower level, can I impose my own lock/acquire on top of that? I tried doing this and had no errors (at the bottom of the last code block where file_quantity.deduct() is). But I am not sure if it is truly working as it should. Here is the shortened version for readability:

class FileQuantity(object):
    """Keeps track of files that have been uploaded and how many are left"""

    def __init__(self, file_quantity):
        self.quantity = file_quantity
        self.total = file_quantity

    def deduct(self):
        self.quantity -= 1

kill_received = False
lock = threading.Lock()

class CustomQueue(Queue.Queue):
    # Can not use .join() because it would block any processing
    # for SIGINT until threads are done. To counter this,
    # wait() is given a timeout along with while not kill_received
    # to be checked.

    def join(self):
        self.all_tasks_done.acquire()
        try:
            while not kill_received and self.unfinished_tasks:
                self.all_tasks_done.wait(10.0)
        finally:
            self.all_tasks_done.release()


def do_the_uploads(file_list, file_quantity,
    retry_list, authenticate):
    """The uploading engine"""
    value = raw_input(
        "\nPlease enter how many concurrent "
        "uploads you want at one time (example: 200)> ")
    value = int(value)
    logger.info('{} concurrent uploads will be used.'.format(value))

    confirm = raw_input(
        "\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if confirm == "Y":
        kill_received = False
        sys.stdout.write("\x1b[2J\x1b[H")
        q = CustomQueue()

        def worker():
            global kill_received
            while not kill_received:
                item = q.get()
                upload_file(item, file_quantity, retry_list, authenticate, q)
                q.task_done()

        for i in range(value):
            t = Thread(target=worker)
            t.setDaemon(True)
            t.start()

        for item in file_list:
            q.put(item)

        q.join()

        print "Finished. Cleaning up processes...",
        #Allowing the threads to cleanup
        time.sleep(4)
        print "done."


def upload_file(file_obj, file_quantity, retry_list, authenticate, q):
    """Uploads a file. One file in its own thread. No batch style. This way if one upload
       fails no others are affected."""
    absolute_path_filename, filename, dir_name, token, url = file_obj
    url = url + dir_name + '/' + filename
    try:
        with open(absolute_path_filename) as f:
            r = requests.put(url, data=f, headers=header_collection, timeout=20)
    except requests.exceptions.ConnectionError:
        return  # bail out here: r would be undefined below

    if src_md5 == r.headers['etag']:
        lock.acquire()
        file_quantity.deduct()
        lock.release()

1 Answer:

Answer 0 (score: 1)

Well, the code you posted doesn't define lock anywhere, so it's hard to say for sure. It would be more common to protect the code that actually needs protecting:

def deduct(self):
    with lock:
        self.quantity -= 1
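As an aside, the `with lock:` form above is, in the standard threading module, equivalent to manually pairing acquire/release inside try/finally; the variable names below are illustrative only, not part of the original script:

```python
import threading

lock = threading.Lock()
quantity = 10

def deduct_with():
    # The context-manager form: the lock is released
    # even if the protected code raises.
    global quantity
    with lock:
        quantity -= 1

def deduct_manual():
    # The equivalent manual form.
    global quantity
    lock.acquire()
    try:
        quantity -= 1
    finally:
        lock.release()

deduct_with()
deduct_manual()
```

Both leave the lock released afterwards; the `with` form is simply harder to get wrong.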

Sanest would be to allocate a lock inside the structure that needs it, like so:

class FileQuantity(object):
    """Keeps track of files that have been uploaded and how many are left"""

    def __init__(self, file_quantity):
        self.quantity = file_quantity
        self.total = file_quantity
        self.lock = threading.Lock()

    def deduct(self):
        with self.lock:
            self.quantity -= 1

and similarly use self.lock for any other mutations of FileQuantity data members that may be invoked by multiple threads simultaneously.
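To see the pattern hold up, here is a minimal sketch (a hypothetical test harness, not part of the original script) that hammers one FileQuantity instance from many threads. Without the lock, the read-modify-write of self.quantity could interleave between threads and the final count could come up short; with the per-instance lock, the count is exact:

```python
import threading

class FileQuantity(object):
    """Keeps track of files that have been uploaded and how many are left."""

    def __init__(self, file_quantity):
        self.quantity = file_quantity
        self.total = file_quantity
        self.lock = threading.Lock()

    def deduct(self):
        # The lock makes the decrement atomic with respect to
        # other threads calling deduct() on this instance.
        with self.lock:
            self.quantity -= 1

fq = FileQuantity(1000)

def worker():
    # Each worker simulates 100 successful uploads.
    for _ in range(100):
        fq.deduct()

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

remaining = fq.quantity  # 1000 - 10 * 100
```

Note that in CPython the GIL often masks the race for a simple `-= 1`, since it is a load, a subtract, and a store as separate bytecodes; the lock makes the guarantee explicit rather than accidental.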