How to detect deadlocks in a Django application (and get rid of them)

Time: 2012-01-10 14:54:40

Tags: python django

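I am maintaining a Django project that becomes unresponsive at regular intervals. So far I have handled this by constantly monitoring the application and restarting Apache when necessary.

What does unresponsive mean here? It means Apache no longer replies to any request.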

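Environment:

  • OS: Debian Squeeze 64bit
  • Web server: Apache 2.2.16 with mod_wsgi (mod_python was in production for about a year)
  • Django: 1.3.1 (and every major release since 1.0)
  • Python: 2.6.6 + virtualenv (using distribute, no site-packages; several different setups were in production before)
  • Database backend: psycopg2 2.3.2
  • Database: PostgreSQL 9.0 (8.3 was used in the past)
  • Connection pooling: pgbouncer (the problem persists without the bouncer)
  • Reverse proxy: nginx 1.0.11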

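What can I do to get closer to the source of the error? (I cannot provide the full source code, but snippets here and there are possible.) I have been hunting this problem for so long that I cannot list everything I have tried. I tried to get rid of any "magic" I could think of, and several parts of the application have been rewritten since the problem first occurred.

I apologize for the lack of detail, but I am happy to provide (almost) any information requested, and I promise to do my best to make this post helpful to others facing a similar problem.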

2 answers:

Answer 0: (score: 2)

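Ultimately you will want the new features added in mod_wsgi 4.0. They give daemon mode better control over automatic restarts when requests block. When restarting because of a blocked condition, mod_wsgi will try to dump a Python stack trace of what each Python request thread was executing at the time, so you can see why the threads are blocked.

I suggest you take the issue to the mod_wsgi mailing list, where the new features can be explained in more detail if needed. They have been posted about before: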

http://groups.google.com/group/modwsgi/msg/2a968d820e18e97d

The mod_wsgi 4.0 code is currently available only from the source code repository. The current trunk head is considered stable.
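The per-thread stack dump described in this answer can be roughly approximated from inside the application. The following is only a minimal sketch, not what mod_wsgi itself does: the helper name `dump_thread_stacks` is hypothetical, and how you trigger it (a debug view, a background monitor thread, a log hook) is left open. It relies only on `sys._current_frames()`, `threading.enumerate()` and `traceback.format_stack()` from the standard library.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current Python stack of every thread.

    Hypothetical helper for illustration; not part of mod_wsgi or Django.
    """
    # Map thread ids to names so the report is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(line.rstrip() for line in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Demo: park a worker thread on an event and show where it is waiting.
    gate = threading.Event()
    worker = threading.Thread(target=gate.wait, name="blocked-worker")
    worker.start()
    print(dump_thread_stacks())
    gate.set()
    worker.join()
```

If several request threads show up stuck inside `import` statements or waiting on the same lock, that points toward a deadlock of the kind discussed in the next answer.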

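Answer 1: (score: 1)

You may be hitting the following Django bug [1] (not yet fixed in the 1.4 branch).

Workaround: manually apply the fix to your Django source, or use a thread-safe wrapper around the wsgi module, as shown below (we use this on production systems).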

from __future__ import with_statement
from django.core.handlers.wsgi import WSGIHandler as DjangoWSGIHandler

from threading import Lock

__copyright__ = "Jibe"

class WSGIHandler(DjangoWSGIHandler):
    """
    This provides a threadsafe drop-in replacement of django's WSGIHandler.

    Initialisation of django via a multithreaded wsgi handler is not safe.
    It is vulnerable to an A-B B-A deadlock.

    When two threads bootstrap django via different urls you have a chance
    to hit the following deadlock.

      thread 1                              thread 2
        view A                                view B
        import file foo                       import file bar
          (holds import lock foo)               (holds import lock bar)
        bootstrap django
          (holds AppCache.write_lock)
        import file bar
          (needs import lock bar) <-- blocks
                                              bootstrap django
                                                (needs AppCache.write_lock) <-- deadlock

    Workaround for an AB BA deadlock: wrap it in a lock C.

        lock C                      lock C
            lock A                      lock B
            lock B                      lock A
            release B                   release A
            release A                   release B
        release C                   release C

    That's exactly what this class does, but only for the first few calls.
    After that we remove the lock C, as the AppCache.write_lock is only held
    while django is booting.

    If we did not remove the lock C after the first few calls, the whole app
    would be single threaded again.

    Usage:
        in your wsgi file replace the following lines
                import django.core.handlers.wsgi
                application = django.core.handlers.wsgi.WSGIHandler()
        by
                import threadsafe_wsgi
                application = threadsafe_wsgi.WSGIHandler()

    FAQ:
        Q: why would you want threading in the first place?
        A: to reduce memory. Big apps can consume hundreds of megabytes each.
           Adding processes is then much more expensive than adding threads.
           That memory is better spent on caching, when threads are almost free.

        Q: this deadlock looks far-fetched, is it real?
        A: yes, we had this problem on production machines.
    """
    __initLock = Lock()  # lock C 
    __initialized = 0 

    def __call__(self, environ, start_response): 
        # the first calls (4) we squeeze everybody through lock C 
        # this basically serializes all threads 
        MIN_INIT_CALLS = 4 
        if self.__initialized < MIN_INIT_CALLS: 
            with self.__initLock: 
                ret = DjangoWSGIHandler.__call__(self, environ, start_response) 
                self.__initialized += 1 
                return ret 
        else: 
            # we are safely bootstrapped, skip lock C
            # now we are running multi-threaded again
            return DjangoWSGIHandler.__call__(self, environ, start_response)

Then use the following code in wsgi.py:

from threadsafe_wsgi.handlers import WSGIHandler
django_handler = WSGIHandler()
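To see why the outer "lock C" from the docstring removes the A-B B-A deadlock, here is a small standalone sketch that does not involve Django at all. The names `lock_a`, `lock_b`, `lock_c`, `worker` and `run_demo` are made up for illustration. Two threads take locks A and B in opposite order, but because each must hold C first, the two acquisition sequences can never interleave.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
lock_c = threading.Lock()  # outer lock, "lock C" in the docstring above

finished = []


def worker(name, first, second):
    # Taking lock C first means only one thread at a time can be inside
    # the A/B section, so the opposite acquisition orders cannot interleave.
    with lock_c:
        with first:
            with second:
                finished.append(name)


def run_demo():
    del finished[:]
    threads = [
        threading.Thread(target=worker, args=("A-then-B", lock_a, lock_b)),
        threading.Thread(target=worker, args=("B-then-A", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(5)  # timeout so that a deadlock would not hang the demo
    return sorted(finished)


if __name__ == "__main__":
    print(run_demo())  # ['A-then-B', 'B-then-A']
```

Without `lock_c` the same two workers could each grab their first lock and then wait forever for the other one. The handler above applies the same idea, but drops the outer lock after the first few requests so that normal traffic is not serialized.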

[1] https://code.djangoproject.com/ticket/18251