Asynchronous pycurl request handling for a Python beginner

Time: 2014-11-30 22:40:36

Tags: python curl asynchronous pycurl

I am trying to merge the asynchronous functionality of program A with program B, to enable some super-simple string-based logic:
#pseudocode 
    label beginning
    sleep(10)
    if substring in someString:
        print "It's not happening!!!"
        goto beginning 

Excerpt 2:

 #unique verification variable automatically gets generated every request 
 c.setopt(pycurl.HTTPHEADER, ['verification: ' + verification ])

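Basically, if the HTML response to the first request does not contain a specific string, a request with the same verification code must be sent again after 10 seconds. All of this has to happen asynchronously and without touching the hard disk (memory only), so that it can run at more than 1k requests per second.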
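One way to honor the 10-second delay without putting the whole program to sleep is to keep pending retries in an in-memory priority queue ordered by due time. The sketch below is only an illustration of that idea: `RetryQueue`, `schedule`, `pop_due` and `needs_retry` are hypothetical names, not part of pycurl, and the marker string is whatever the response is expected to contain.

```python
import heapq
import itertools


class RetryQueue(object):
    """In-memory queue of requests that become due at a later time."""

    def __init__(self, delay=10.0):
        self.delay = delay
        self._heap = []
        # Tie-breaker so entries with equal due times come out in FIFO order.
        self._counter = itertools.count()

    def schedule(self, verification, now):
        """Queue `verification` to be retried `delay` seconds after `now`."""
        entry = (now + self.delay, next(self._counter), verification)
        heapq.heappush(self._heap, entry)

    def pop_due(self, now):
        """Return every verification value whose due time has passed."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due

    def __len__(self):
        return len(self._heap)


def needs_retry(body, marker):
    """True when the response body lacks the expected marker string."""
    return marker not in body
```

In a multi-handle loop, each finished response could be checked with `needs_retry`, failed ones passed to `schedule`, and `pop_due(time.time())` called on every pass to create fresh handles that carry the same `verification` header.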

Python's lack of goto, in the name of some kind of purity fetish, has made my head hurt trying to solve this.
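For what it is worth, the `label` / `goto` pseudocode above maps onto an ordinary `while` loop. A minimal blocking sketch, assuming a hypothetical `fetch()` callable that returns the latest response text (`wait_until_absent` and its parameters are illustrative names, not an existing API):

```python
import time


def wait_until_absent(fetch, substring, delay=10.0, sleep=time.sleep):
    """Poll fetch() every `delay` seconds until `substring` is gone.

    Returns the first body that no longer contains `substring`, together
    with the number of attempts that were needed.
    """
    attempts = 0
    while True:                      # plays the role of "label beginning"
        sleep(delay)
        body = fetch()
        attempts += 1
        if substring not in body:
            return body, attempts
        print("It's not happening!!!")
        # falling through to the top of the loop replaces "goto beginning"
```

This version blocks the calling thread while it sleeps, so it only shows the control flow; it does not by itself meet the asynchronous requirement.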

The crux seems to revolve around these functions:

    c.setopt(pycurl.WRITEDATA,) vs c.setopt(pycurl.WRITEFUNCTION,)
    m = pycurl.CurlMulti()
    m.handles.append(c)

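Any advice on how best to solve this puzzle is welcome. What I am mainly looking for is pseudocode or the general logic, plus some pointers to the functions I should study; once I have a general blueprint, I should be able to piece it together.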

1 answer:

Answer 0 (score: 0)

from StringIO import StringIO

import pycurl

class CurlStream(object):
    """"""
    curl_count = 0
    curl_storage = []

    def __init__(self):
        self.curl_multi = pycurl.CurlMulti()

    def add_request(self, request, post_fields=None):
        self.curl_count += 1
        curl = self._create_curl(request, post_fields)
        self.curl_multi.add_handle(curl)

    def perform(self):
        while self.curl_count:
            while True:
                response, self.curl_count = self.curl_multi.perform()
                if response != pycurl.E_CALL_MULTI_PERFORM:
                    break
            self.curl_multi.select(1.0)

    def read_all(self):
        for response in self.curl_storage:
            print response.getvalue() # this does nothing --prints blank lines

    def close(self):
        self.curl_multi.close()

    def _create_curl(self, request, post_fields):
        curl = pycurl.Curl()
        curl.setopt(curl.URL, request)
        curl.setopt(curl.WRITEFUNCTION, self.write_out) # now passing own method
        curl.setopt(curl.TIMEOUT, 20)
        # Below is the important bit, I am now adding each curl object to a list
        self.curl_storage.append(curl)
        return curl

    def write_out(self, data):
        print 'Data len', len(data)
        print data
        return len(data)


def main():
    curl_stream = CurlStream()
    curl_stream.add_request('http://www.google.com')
    curl_stream.add_request('http://www.tomdickin.com')
    curl_stream.perform()
    curl_stream.read_all()
    curl_stream.close()

if __name__ == '__main__':
    main()

The code above comes from the answer to "How can I get the response body from pycurl multi curl requests". It seems fine and does what it is supposed to do, but when I run it, it ends with this error:

Traceback (most recent call last):
  File "Untitled 2.py", line 55, in <module>
    main()
  File "Untitled 2.py", line 53, in main
    curl_stream.read_all()
  File "Untitled 2.py", line 28, in read_all
    print response.getvalue() # this does nothing --prints blank lines
AttributeError: getvalue
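The traceback is consistent with what the code stores: `curl_storage` holds the `pycurl.Curl` handles themselves, and those have no `getvalue()` method (that method belongs to `StringIO` / `BytesIO` buffers). A possible way around it, sketched here under the assumption that one in-memory buffer per request is acceptable, is to collect the body in a buffer from the write callback. `ResponseStore`, `writer` and `body` are hypothetical names introduced for illustration.

```python
from io import BytesIO


class ResponseStore(object):
    """Keeps one in-memory buffer per request; nothing is written to disk."""

    def __init__(self):
        self.buffers = {}

    def writer(self, key):
        """Return a callback that could be passed to
        curl.setopt(curl.WRITEFUNCTION, ...) for the request named `key`."""
        buf = self.buffers.setdefault(key, BytesIO())

        def write_out(data):
            buf.write(data)
            # Report how many bytes were consumed, as the original
            # write_out method does.
            return len(data)

        return write_out

    def body(self, key):
        """Return everything received so far for `key` as bytes."""
        return self.buffers[key].getvalue()
```

With something like this, `_create_curl` could register `store.writer(request)` as the write callback instead of `self.write_out`, and `read_all` could loop over `store.buffers` and print `store.body(key)` rather than calling `getvalue()` on the curl handles.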