Why do threads occasionally exit in my Python program?

Time: 2015-07-14 00:05:35

Tags: python multithreading python-3.x

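The program uses the function thread_monitor to create a monitor thread. The monitor thread then creates hundreds of threads that run the function valid_proxy, plus one thread that counts the number of active threads every second.

After a few hours, however, I found that the number of active threads was shrinking, for example from 500 to 472. The longer the program runs, the further the count drops.

I don't know what is wrong in valid_proxy that makes the threads exit abnormally. Could you help me point out the potential bugs?

All of the code is at https://github.com/iaston/proxy_checker. Here are some code snippets.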

import threading
import time
import requests
import requests.exceptions
import requests.adapters
import datetime
import queue
import re

def update_gui():
    while True:
        time.sleep(1)
        num_of_threads = 0
        for each_thread in threading.enumerate():
            if each_thread.name.find("Verify_Proxy_") == 0:
                num_of_threads += 1                 
        print("\nNumber of running threads is %d.\n" % num_of_threads)


def get_a_proxy():
    global g_b_stop, g_all_statu, g_proxy_queue, lock_get_proxy
    if g_b_stop:  
        return ""
    proxy_now = ""
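    with lock_get_proxy:  # the lock is released even if an exception escapes
        try:
            while proxy_now == "":
                # Strip everything after the leading "ip:port" part of the entry.
                proxy_now = re.sub(r"[^\d:\.].+", "", str(g_proxy_queue.get(block=False)))
        except Exception:
            # Typically queue.Empty: no proxies are left to check.
            proxy_now = ""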
    return proxy_now


def valid_proxy(check_site_info, success_try):
    global lock_valided_list, g_tree_proxies, g_all_statu, proxies_valided_list
    i_error_limit = success_try[1] - success_try[0]
    if i_error_limit < 0:
        i_error_limit = 0
    i_error_now = 0
    proxy_now = get_a_proxy()
    while proxy_now != "":
        test_num = 0
        for each_check_site in check_site_info:  

            i_error_now = 0
            proxy_speed_recorder = {each_check_site: []}
            for iCounter in range(0, success_try[1]):  
                test_num += 1
                if (datetime.datetime.now() - g_gui_last_update_time).seconds > g_gui_update_interval:
                    redraw_gui_event_finished.clear()
                redraw_gui_event_finished.wait()  
                try:
                    read_timeout = int(check_site_info[each_check_site]['timeout'])
                    connect_timeout = read_timeout / 2
                    if connect_timeout < 2:
                        connect_timeout = 2
                    start_test_time = time.time()
                    req_headers = {
                        "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
                        "Referer": check_site_info[each_check_site]['url']}
                    req_result = requests.get(check_site_info[each_check_site]['url'],
                                              timeout=(connect_timeout, read_timeout),
                                              proxies={'http': proxy_now, 'https': proxy_now}, headers=req_headers)
                    used_time_seconds = (time.time() - start_test_time) * 1000

                    html_result = req_result.text
                except Exception as e:
                    used_time_seconds = -1
                    html_result = ""
                    print("\nError in proxy %s:\n%r" % (proxy_now, e))
                if html_result.find(check_site_info[each_check_site]['keyword']) < 0:
                    i_error_now += 1
                    if i_error_now > i_error_limit:
                        print(('\nInvalided proxy: ' + proxy_now))
                        break  
                else:
                    proxy_speed_recorder[each_check_site].append((iCounter, used_time_seconds))

                    if iCounter + 1 - i_error_now >= success_try[0]:
                        break
            if i_error_now > i_error_limit:
                break  
        print("Proxy: " + proxy_now + " test number:" + str(test_num))
        if i_error_now <= i_error_limit:

            all_used_time = 0
            all_test_time = 0
            for each_check_site in proxy_speed_recorder:
                for each_test in proxy_speed_recorder[each_check_site]:
                    all_used_time += each_test[1]
                    all_test_time += 1
            avarge_time = round(all_used_time / all_test_time) if all_test_time != 0 else 0
            with lock_valided_list:  # released even if an exception is raised below
                proxies_valided_list.append((proxy_now, avarge_time))
                try:
                    g_all_statu["text_proxy_valid_append"] += proxy_now + "&" + str(avarge_time) + "\n"
                    g_tree_proxies.add_data_treeview([(proxy_now, avarge_time)], skip_datebase=True)
                except Exception as e2:
                    print("""g_all_statu["text_proxy_valid_append"] is wrong:\n""" + repr(e2))
        time.sleep(1)
        proxy_now = get_a_proxy()
    print("Finish, thread exit")


def thread_monitor(check_site_info, success_try, th_num):
    for each_proxy in proxies_unvalided_list:
        g_proxy_queue.put(each_proxy)

    th_gui = threading.Thread(target=update_gui)
    th_gui.daemon = True
    th_gui.start()

    ths_verify = []
    for iCounter in range(0, th_num):
        t = threading.Thread(target=valid_proxy, args=(check_site_info, success_try))
        t.daemon = True
        t.name = "Verify_Proxy_" + str(iCounter)
        t.start()
        ths_verify.append(t)

    for iCounter in range(0, th_num):
        ths_verify[iCounter].join()


thread_monitor(check_site_info, success_try, 500)

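1 Answer:

Answer 0 (score: 0)

I can't troubleshoot your function valid_proxy, because it mostly refers to code you haven't provided, and I'm not clairvoyant. What I can tell you is that an uncaught exception raised anywhere in the function will cause the thread to exit. If you do this: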

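def forever_valid_proxy(*x, **y):
    # Restart valid_proxy whenever it returns or raises, so the thread never exits.
    while True:
        try:
            valid_proxy(*x, **y)
        except Exception:
            pass
        time.sleep(1)  # avoid a busy loop if valid_proxy returns immediately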

and replace the line:

t = threading.Thread(target=valid_proxy, args=(check_site_info, success_try))

with:

t = threading.Thread(target=forever_valid_proxy, args=(check_site_info, success_try))

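the thread count will never decrease. Whether that is what you actually want or need from the program, I don't know, but you will never see the number of threads go down.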
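
To find out why threads are dying rather than just hiding the problem, it may help to record the exception before the thread ends. In the question's valid_proxy, several statements sit outside the try block (for example the check_site_info[each_check_site]['keyword'] lookup), so an exception raised there would end the thread. The following is only a minimal sketch of the idea: run_logged, flaky_worker and failures are hypothetical names used for illustration, not part of the original program.

```python
import threading
import traceback

# Collected (thread name, traceback text) pairs; hypothetical helper state.
failures = []
failures_lock = threading.Lock()


def run_logged(target, *args, **kwargs):
    """Run target and record any uncaught exception instead of losing it."""
    try:
        target(*args, **kwargs)
    except Exception:
        # Remember which thread failed and the full traceback.
        with failures_lock:
            failures.append((threading.current_thread().name,
                             traceback.format_exc()))


def flaky_worker(n):
    """Hypothetical stand-in for valid_proxy: fails for odd n."""
    if n % 2:
        raise KeyError("keyword")


threads = [
    threading.Thread(target=run_logged, args=(flaky_worker, i),
                     name="Verify_Proxy_%d" % i)
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Print what went wrong in each thread that failed.
for name, tb in failures:
    print("%s died with:\n%s" % (name, tb))
```

In the question's program, this would presumably mean passing target=run_logged with args=(valid_proxy, check_site_info, success_try). The recorded tracebacks then show which line raised, which is more useful for debugging than silently swallowing the exception.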