Python daemon thread cleanup logic on sys.exit()

Date: 2013-12-02 00:04:53

Tags: python linux multithreading

Using Linux and Python 2.7.6, I have a script that uploads a large number of files at once. I'm using multithreading with the Queue and threading modules.

I implemented a handler for SIGINT to stop the script if the user presses Ctrl-C. I prefer using daemon threads so that I don't have to clear the queue, which would require a lot of code rewriting to make the Queue object accessible to the SIGINT handler, since handlers don't take arguments.

To make sure the daemon threads finish and clean up before sys.exit(), I'm using threading.Event() and threading.clear() to make the threads wait. This code seems to work, since print threading.enumerate() shows only the main thread before the script terminates when I debug. To be sure, I'd like to know if there is anything about this cleanup implementation that I might be missing, even though it seems to work for me:

def signal_handler(signal, frame):
    global kill_received
    kill_received = True
    msg = (
         "\n\nYou pressed Ctrl+C!"
         "\nYour logs and their locations are:"
         "\n{}\n{}\n{}\n\n".format(debug, error, info))
    logger.info(msg)
    threads = threading.Event()
    threads.clear()

    while True:
        time.sleep(3)
        threads_remaining = len(threading.enumerate())
        print threads_remaining
        if threads_remaining == 1:
            sys.exit()

def do_the_uploads(file_list, file_quantity,
        retry_list, authenticate):
    """The uploading engine"""
    value = raw_input(
        "\nPlease enter how many concurrent "
        "uploads you want at one time (example: 200)> ")
    value = int(value)
    logger.info('{} concurrent uploads will be used.'.format(value))

    confirm = raw_input(
        "\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if confirm == "Y":
        kill_received = False
        sys.stdout.write("\x1b[2J\x1b[H")
        q = CustomQueue()

        def worker():
            global kill_received
            while not kill_received:
                item = q.get()
                upload_file(item, file_quantity, retry_list, authenticate, q)
                q.task_done()

        for i in range(value):
            t = Thread(target=worker)
            t.setDaemon(True)
            t.start()

        for item in file_list:
            q.put(item)

        q.join()

        print "Finished. Cleaning up processes...",
        #Allowing the threads to cleanup
        time.sleep(4)



def upload_file(file_obj, file_quantity, retry_list, authenticate, q):
    """Uploads a file. One file per it's own thread. No batch style. This way if one upload
       fails no others are effected."""
    absolute_path_filename, filename, dir_name, token, url = file_obj
    url = url + dir_name + '/' + filename
    try:
        with open(absolute_path_filename) as f:  
           r = requests.put(url, data=f, headers=header_collection, timeout=20)
    except requests.exceptions.ConnectionError as e:
        pass
    if src_md5 == r.headers['etag']:
       file_quantity.deduct()

1 Answer:

Answer 0 (score: 4)

If you want to handle Ctrl+C, it is enough to handle the KeyboardInterrupt exception in the main thread. Don't use X = some_value inside a function unless you do global X in it. Using time.sleep(4) to let the threads clean up is a code smell. You don't need it.
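That main-thread pattern can be sketched as follows (a minimal, self-contained example; the worker body and timings are placeholders, not the asker's upload code):

```python
import threading
import time

stopped = threading.Event()  # set to ask workers to stop early

def worker():
    # Simulated work: a few short steps, checking the flag between them
    for _ in range(3):
        if stopped.is_set():
            break
        time.sleep(0.05)

t = threading.Thread(target=worker)
t.daemon = True
t.start()
try:
    # join() with a timeout lets KeyboardInterrupt reach the main thread
    while t.is_alive():
        t.join(0.3)
except KeyboardInterrupt:
    stopped.set()  # signal the worker to finish gracefully
    t.join()
```

Here KeyboardInterrupt is caught in the main thread, so no signal handler needs to be installed at all.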


"I'm using threading.Event() and threading.clear() to make the threads wait."

This code has no effect on your threads:

# create local variable
threads = threading.Event()
# clear internal flag in it (that is returned by .is_set/.wait methods)
threads.clear()
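By contrast, an Event only affects threads that actually wait() on the shared instance, for example (hypothetical names):

```python
import threading

go = threading.Event()
results = []

def waiter():
    go.wait()                 # blocks until another thread calls go.set()
    results.append("woke up")

t = threading.Thread(target=waiter)
t.start()
go.set()                      # wake the waiting thread
t.join()
print(results)                # ['woke up']
```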

Don't call logger.info() from a signal handler in a multithreaded program. It may deadlock your program. Only a limited set of functions may be called from a signal handler. The safe option is to set a global flag in it and exit:

def signal_handler(signal, frame):
    global kill_received
    kill_received = True
    # return (no more code)
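For completeness, such a handler is installed with signal.signal(); everything beyond setting the flag stays out of the handler (a minimal sketch, with the SIGINT delivery simulated via os.kill):

```python
import os
import signal

kill_received = False

def signal_handler(signum, frame):
    # Async-signal-safe: only set a flag -- no logging, no locks
    global kill_received
    kill_received = True

signal.signal(signal.SIGINT, signal_handler)

os.kill(os.getpid(), signal.SIGINT)  # simulate pressing Ctrl+C
print(kill_received)  # True
```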

The signal may be delayed until q.join() returns. Even if the signal were delivered immediately, q.get() blocks your child threads. They would hang until the main thread exits. To fix both issues you can use a sentinel to signal child threads that there is no more work, and drop the signal handler completely in this case:

def worker(stopped, queue, *args):
    for item in iter(queue.get, None): # iterate until queue.get() returns None
        if not stopped.is_set(): # a simple global flag would also work here
           upload_file(item, *args)
        else:
           break # exit prematurely
    # do child specific clean up here

# start threads
q = Queue.Queue()
stopped = threading.Event() # set when threads should exit prematurely
threads = set()
for _ in range(number_of_threads):
    t = Thread(target=worker, args=(stopped, q)+other_args)
    threads.add(t)
    t.daemon = True
    t.start()

# provide work
for item in file_list:
    q.put(item)
for _ in threads:
    q.put(None) # put sentinel to signal the end

while threads: # until there are alive child threads
    try:
        for t in threads: 
            t.join(.3) # use a timeout to get KeyboardInterrupt sooner
            if not t.is_alive():
               threads.remove(t) # remove dead
               break
    except (KeyboardInterrupt, SystemExit):
        print("got Ctrl+C (SIGINT) or exit() is called")
        stopped.set() # signal threads to exit gracefully

I've renamed value to number_of_threads. I've used an explicit set of threads.

If an individual upload_file() blocks, the program won't exit on Ctrl-C.

Your case looks simple enough for the multiprocessing.Pool interface:

from multiprocessing.pool import ThreadPool
from functools import partial

def do_uploads(number_of_threads, file_list, **kwargs_for_upload_file):
    process_file = partial(upload_file, **kwargs_for_upload_file)
    pool = ThreadPool(number_of_threads) # number of concurrent uploads
    try:
        for _ in pool.imap_unordered(process_file, file_list):
            pass # you could report progress here
    finally:
        pool.close() # no more additional work
        pool.join() # wait until current work is done

It should exit gracefully on Ctrl-C, i.e., uploads in progress are allowed to finish but new uploads are not started.
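A quick smoke test of that pool pattern, with a stub in place of the real upload_file (the stub body and file names are made up for illustration):

```python
from functools import partial
from multiprocessing.pool import ThreadPool

def upload_file(file_obj, results=None):
    # Stub: record the "uploaded" file instead of doing a real PUT
    results.append(file_obj)

def do_uploads(number_of_threads, file_list, **kwargs_for_upload_file):
    process_file = partial(upload_file, **kwargs_for_upload_file)
    pool = ThreadPool(number_of_threads)  # number of concurrent uploads
    try:
        for _ in pool.imap_unordered(process_file, file_list):
            pass  # progress could be reported here
    finally:
        pool.close()  # no more additional work
        pool.join()   # wait until current work is done

results = []
do_uploads(4, ["a.txt", "b.txt", "c.txt"], results=results)
print(sorted(results))  # ['a.txt', 'b.txt', 'c.txt']
```

list.append is safe to call from multiple pool threads here, so the stub needs no extra locking.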