Python - Threaded program appears to have a memory leak?

Time: 2010-08-19 19:08:27

Tags: python multithreading debugging memory-management memory-leaks

I am writing a Python program that appears to be leaking memory.

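The program takes a list of URLs and makes sure each one returns status code 200. If a status code is not 200, the script alerts me by email. The script is threaded so that the URLs can be checked in parallel.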

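I have set the program up as a scheduled task that runs every 5 minutes on our server. Since then, the server's physical memory has been completely consumed. The server runs Windows Server 2008 and Python 2.6.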

Where is the memory leak?

The following code calls the thread class in UrlChecker.py (also included below):

    from ConfigParser import ConfigParser
    import re

    from UrlChecker import UrlCheckerThread
    from Logger import Logger
    from classes.EmailAlert import EmailAlert

    ... {More Code is here} ...

    urls = cfg.items('urls')

    defaulttimeout = int(cfg.get('timeout', 'default', 0))

    threadList = []

    for name, url in urls:
        # The name is expected to contain a parenthesised number, e.g. "(12)".
        m = re.search(r"\([0-9]*\)", name)
        s = m.start() + 1
        e = m.end() - 1
        name = name[s:e]

        checker = UrlCheckerThread(url, name)
        threadList.append(checker)
        checker.start()

    for threads in threadList:
        threads.join()

    for x in threadList:
        status = x.status
        url = x.url
        name = x.name
        runtime = x.runtime

        # If there is an error, put the information in a dict for further
        # processing.
        if (status is not None and status != 200) or runtime >= defaulttimeout:
            self.logDict[name]= (name, url, status, runtime)

UrlChecker.py

    import socket
    from threading import Thread, Lock
    from urllib2 import Request, urlopen
    from ConfigParser import ConfigParser
    from TimeoutController import TimeoutController
    from classes.StopWatch import StopWatch

    class UrlCheckerThread(Thread):
        lock = Lock()
        threadId = 0

        def __init__(self, url, name):
            Thread.__init__(self)
            self.url = url
            self.name = name
            self.cfg = ConfigParser()
            self.cfg.read(r'c:\Websites\ServerManager\V100\webroot\Admin\SiteMonitor\config.cfg')
            self.thisId = UrlCheckerThread.threadId
            self.extendedTimeout = int(self.cfg.get('timeout', 'extended', 0))
            self.tc = TimeoutController()
            self.tc.setTimeout(self.extendedTimeout)
            UrlCheckerThread.threadId += 1

        def run(self):
            """
            getHeader uses urlopen to check whether a website is online or not
            """
            self.sw = StopWatch()
            self.sw.start()
            self.checker = UrlChecker()
            UrlCheckerThread.lock.acquire()
            self.status = self.checker.getStatus(self.url)
            self.sw.stop()
            self.runtime = self.sw.time()
            """
            if(isinstance(self.status, socket.timeout)):
                self.tc.setTimeout(self.extendedTimeout)
                self.status = self.checker.getStatus(self.url)
                if(self.status == 200):
                    self.status = 'short time out'
                self.tc.setTimeout(self.defaultTimeout)
            """
            UrlCheckerThread.lock.release()

    class UrlChecker:

        def getStatus(self, url):
            """
            getHeader uses urlopen to check whether a website is online or not
            """
            request = Request(url, None)
            try:
                urlReq = urlopen(request)

                """
                getcode() returns the HTTP status header, which should be 200
                in most cases.
                """
                return urlReq.getcode()
            except IOError, e:
                if hasattr(e, 'reason'):
                    """
                    e.reason returns an IOError object, which cannot be just
                    inserted in the database. The IOError object is basically
                    a 2-tuple with an error number and an error string.
                    Since an error number is less readable than a string,
                    we use e.reason.strerror to just return IOError's string
                    """
                    return e.reason.strerror
                elif hasattr(e, 'code'):
                    """
                    e.code is an int object, which is perfectly fine to insert in
                    the database. So no further modification needed.
                    """
                    return e.code

Thanks!

1 Answer:

Answer 0 (score: 0)

You are opening the config file in every thread, which takes some memory. How many URLs are you checking?
What is the ConfigParser implementation?
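
If the per-thread config parsing turns out to matter, one option is to parse the file once in the calling script and hand each thread only the value it needs. The following is a minimal sketch, not the asker's actual code: the helper `load_extended_timeout` and the extra `extended_timeout` constructor argument are hypothetical names, and the import fallback is only there so the snippet runs on both Python 2 and Python 3. Whether this changes the memory behaviour has not been established.

```python
try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import ConfigParser  # Python 2, as in the question

from threading import Thread


def load_extended_timeout(path):
    """Parse the config file once and return the [timeout] 'extended' value.

    Hypothetical helper; the section and option names are taken from the
    question's code.
    """
    cfg = ConfigParser()
    cfg.read(path)
    return int(cfg.get('timeout', 'extended'))


class UrlCheckerThread(Thread):
    """Worker that receives its settings instead of re-reading the file."""

    def __init__(self, url, name, extended_timeout):
        Thread.__init__(self)
        self.url = url
        self.name = name
        # Assumption: the timeout is passed in by the caller, so no
        # ConfigParser object is created or kept per thread.
        self.extended_timeout = extended_timeout
```

The caller would then call `load_extended_timeout(...)` a single time before the loop and pass the result to every `UrlCheckerThread`.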
Are you sure every thread gets joined? Does the batch program finish before the next scheduled run?
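
One way to check the last two questions is to join with a deadline and report any thread that is still running afterwards. This is only a sketch under assumptions: `join_all` and `deadline_seconds` are hypothetical names, and it uses nothing beyond `Thread.join(timeout)` and `Thread.is_alive()` from the standard library.

```python
import time


def join_all(threads, deadline_seconds):
    """Join every thread, but stop waiting once a shared deadline passes.

    Returns the list of threads that are still alive at that point.
    """
    deadline = time.time() + deadline_seconds
    for t in threads:
        # Never pass a negative timeout; 0.0 makes join() return at once.
        remaining = max(0.0, deadline - time.time())
        t.join(remaining)
    return [t for t in threads if t.is_alive()]
```

If the returned list is non-empty, that run did not finish cleanly and a later scheduled run may overlap with it. In the posted `run()` method the class-level lock is released without a `try`/`finally`, so an exception other than `IOError` raised inside `getStatus` would leave the remaining threads blocked on `acquire()`; that is one possible way for a run to hang, though the question does not confirm it happens.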