Asynchronous HTTP calls in Python

Date: 2011-02-10 21:17:34

Tags: python asynchronous asyncore

I need callback-style functionality in Python: I will be sending requests to a web service multiple times, changing the parameters each time. I want these requests to happen concurrently rather than sequentially, so I want the function to be called asynchronously.

It looks like asyncore is what I might want to use, but the examples I've seen of how it works all look like overkill, so I'm wondering whether there's another path I should be going down. Any suggestions on modules or approaches? Ideally I'd like to use them procedurally instead of creating classes, but I may not be able to get around that.

4 answers:

Answer 0 (score: 16)

Starting with Python 3.2, you can use concurrent.futures to launch parallel tasks.

Check out this ThreadPoolExecutor example:

http://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor-example

It spawns threads to retrieve HTML and acts on the responses as they are received.

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the url and contents
def load_url(url, timeout):
    conn = urllib.request.urlopen(url, timeout=timeout)
    return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

The above example uses threads. There is also a similar ProcessPoolExecutor that uses a pool of processes rather than threads:

http://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor-example

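The ProcessPoolExecutor documentation linked above demonstrates the process pool with a CPU-bound prime-checking task rather than HTTP fetches. A minimal sketch along those lines (the numbers and the is_prime helper here are illustrative, not copied from the docs) might look like:

```python
import concurrent.futures
import math

# Candidate numbers to test; chosen here for illustration
NUMBERS = [1299709, 1299710, 15485863, 15485866]

def is_prime(n):
    """Trial division; good enough to demonstrate CPU-bound work."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for i in range(3, int(math.isqrt(n)) + 1, 2):
        if n % i == 0:
            return False
    return True

if __name__ == '__main__':
    # Worker processes re-import the main module, hence the guard
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(NUMBERS, executor.map(is_prime, NUMBERS)):
            print('%d is prime: %s' % (number, prime))
```

Unlike the thread version, the process version requires the `if __name__ == '__main__'` guard, because each worker process imports the main module.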

Answer 1 (score: 16)

Do you know about eventlet? It lets you write what appears to be synchronous code, but have it operate asynchronously over the network.

Here's an example of a super-minimal crawler:

import eventlet
from eventlet.green import urllib2

urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
        "https://wiki.secondlife.com/w/images/secondlife.jpg",
        "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

def fetch(url):
    return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()

for body in pool.imap(fetch, urls):
    print "got body", len(body)

Answer 2 (score: 9)

The Twisted framework is just the ticket for this. But if you don't want to take that on, you could also use pycurl, a wrapper for libcurl, which has its own asynchronous event loop and supports callbacks.

Answer 3 (score: 0)

(Although this thread is about server-side Python, and the question was asked a while ago, others might stumble on it looking for a similar answer on the client side.)

For a client-side solution, you might want to take a look at the Async.js library, especially the "Control-Flow" section.

https://github.com/caolan/async#control-flow

By combining "Parallel" with "Waterfall" you can achieve your desired result.

Waterfall(Parallel(TaskA, TaskB, TaskC) -> PostParallelTask)
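Since this thread is about Python, the same parallel-then-continue shape can be sketched with concurrent.futures (the task names below are hypothetical placeholders, not part of the Async.js API):

```python
import concurrent.futures

# Hypothetical tasks standing in for TaskA / TaskB / TaskC
def task_a():
    return 'A'

def task_b():
    return 'B'

def task_c():
    return 'C'

def post_parallel_task(results):
    # The "waterfall" step: runs only after every parallel task finished
    return '+'.join(results)

with concurrent.futures.ThreadPoolExecutor() as executor:
    # "Parallel" stage: submit all three tasks at once
    futures = [executor.submit(t) for t in (task_a, task_b, task_c)]
    results = [f.result() for f in futures]  # preserves submission order

print(post_parallel_task(results))  # -> A+B+C
```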

If you examine the example under Control-Flow - "Auto", they give you an example of the above: https://github.com/caolan/async#autotasks-callback where "write-file" depends on "get_data" and "make_folder", and "email_link" depends on "write-file".

Please note that all of this happens on the client side (unless you're doing Node.js on the server side).

For server-side Python, look at PyCURL @ https://github.com/pycurl/pycurl/blob/master/examples/basicfirst.py

By combining the example below with pyCurl, you can achieve the non-blocking multi-threaded functionality you need.

Hope this helps. Good luck.

Venkatt @ http://MyThinkpond.com