At the moment I have a very annoying problem: when httplib2.request gets hold of a page that is too big, I would like to be able to stop it cleanly.
For example:
from httplib2 import Http
url = 'http://media.blubrry.com/podacademy/p/content.blubrry.com/podacademy/Neuroscience_and_Society_1.mp3'
h = Http(timeout=5)
h.request(url, 'GET')
In this example, the URL is a podcast and it keeps downloading forever. In that case my main process hangs indefinitely.
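Note that timeout=5 above is only a per-socket-operation timeout: it fires when the server sends nothing for 5 seconds, not when the total transfer takes too long, so a steady stream like this MP3 never trips it. For comparison (not part of my original code), plain httplib lets you read a bounded amount and hang up yourself; a minimal sketch, where bounded_get and the 1 MB cap are illustrative, and redirects and HTTPS are ignored:

import httplib
from urlparse import urlparse

def bounded_get(url, max_bytes=10**6):
    # Read at most max_bytes of the body, then drop the connection.
    parts = urlparse(url)
    conn = httplib.HTTPConnection(parts.netloc, timeout=5)
    conn.request('GET', parts.path or '/')
    resp = conn.getresponse()
    data = resp.read(max_bytes)  # bounded read: returns once max_bytes arrive
    conn.close()                 # abandon the rest of the stream
    return resp.status, data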
I tried using the code below to run it in a separate thread and then delete my object directly.
import Queue
from threading import Thread
from httplib2 import Http

def http_worker(url, q):
    h = Http()
    print 'Http worker getting %s' % url
    q.put(h.request(url, 'GET'))

def process(url):
    q = Queue.Queue()
    t = Thread(target=http_worker, args=(url, q))
    t.start()
    tid = t.ident
    t.join(3)
    if t.isAlive():
        try:
            del t
            print 'deleting t'
        except:
            print 'error deleting t'
    else:
        print q.get()
    check_thread(tid)

process(url)
Unfortunately, the thread is still alive and keeps consuming CPU and memory.
def check_thread(tid):
    import sys
    print 'Thread id %s is still active ? %s' % (tid, tid in sys._current_frames().keys())
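The reason del t changes nothing is that it only removes the local reference; CPython gives no supported way to kill a running thread from outside. A partial workaround (my own sketch, not a real kill) is to mark the worker as a daemon so it at least cannot keep the interpreter alive at shutdown:

from threading import Thread

t = Thread(target=http_worker, args=(url, q))
t.daemon = True  # daemon threads do not block interpreter exit
t.start()
t.join(3)        # stop waiting after 3 seconds; the thread itself lives on

The thread still burns CPU and memory for as long as the process runs, so this only helps at exit.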
Thanks.
Answer 0 (score: 1)
OK, I found a hack to be able to deal with this.
So far the best solution is to set a maximum amount of data to read and to stop reading from the socket. The data is read in the _safe_read method of the httplib module. In order to override this method, I used this lib: http://blog.rabidgeek.com/?tag=wraptools
And voilà:
import httplib
from httplib import IncompleteRead
from wraptools import wraps

@wraps(httplib.HTTPResponse._safe_read)
def _safe_read(original_method, self, amt):
    """Read the number of bytes requested, compensating for partial reads.

    Normally, we have a blocking socket, but a read() can be interrupted
    by a signal (resulting in a partial read).

    Note that we cannot distinguish between EOF and an interrupt when zero
    bytes have been read. IncompleteRead() will be raised in this
    situation.

    This function should be used when <amt> bytes "should" be present for
    reading. If the bytes are truly not available (due to EOF), then the
    IncompleteRead exception can be used to detect the problem.

    Modified: give up once MAX_FILE_SIZE bytes have been read in total.
    """
    # NOTE(gps): As of svn r74426 socket._fileobject.read(x) will never
    # return less than x bytes unless EOF is encountered. It now handles
    # signal interruptions (socket.error EINTR) internally. This code
    # never caught that exception anyways. It seems largely pointless.
    # self.fp.read(amt) will work fine.
    s = []
    total = 0
    MAX_FILE_SIZE = 3*10**6
    while amt > 0 and total < MAX_FILE_SIZE:
        chunk = self.fp.read(min(amt, httplib.MAXAMOUNT))
        if not chunk:
            raise IncompleteRead(''.join(s), amt)
        total = total + len(chunk)
        s.append(chunk)
        amt -= len(chunk)
    return ''.join(s)
In this case, MAX_FILE_SIZE is set to 3 MB.
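If I understand the wraptools decorator correctly, importing the module that contains this override is enough to patch httplib.HTTPResponse._safe_read globally; after that, the original blocking call from the question should return once roughly MAX_FILE_SIZE bytes have been read. A usage sketch, not verified beyond what is described above:

from httplib2 import Http

# The override above must have been imported first, so that httplib
# (which httplib2 uses underneath) reads through the patched _safe_read.
h = Http(timeout=5)
response, content = h.request(url, 'GET')
print 'got %d bytes' % len(content)  # capped at about MAX_FILE_SIZE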
Hope this helps.